  1. Apr 2025
    1. eLife Assessment

      The study investigates an emerging research field: the interaction between sleep and development. The authors use Drosophila larval sleep as a study model and provide valuable insight into how neuropeptide circuitry controls larval sleep. Using a broad range of behavioural and imaging methods and analyses, the authors conclude that a Hugin-PK2-Dilps neural pathway in the Drosophila neurosecretory centre IPC regulates sleep. However, the evidence that supports this pathway is incomplete; in particular, the methodology of sleep measurement and the specificity at each step of the Hugin-PK2-Dilps pathway require further clarifying experiments or explanation.

    2. Reviewer #1 (Public review):

      Summary:

      The study investigates how neuropeptidergic signaling affects sleep regulation in Drosophila larvae. The authors first conduct a screen of CRISPR knock-out lines of genes encoding enzymes or receptors for neuropeptides and monoamines. As a result of this screen, the authors follow up on one hit, the hugin receptor, PK2-R1. They use genetic approaches, including mutants and targeted manipulations of PK2-R1 activity in insulin-producing cells (IPCs) to increase total sleep amounts in 2nd instar larvae. Similarly, dilp3 and dilp5 null mutants and genetic silencing of IPCs show increases in sleep. The authors also show that hugin mutants and thermogenetic/optogenetic activation of hugin-expressing neurons caused reductions in sleep. Furthermore, they show through imaging-based approaches that hugin-expressing neurons activate IPCs. A key finding is that wash-on of hugin peptides, Hug-γ and PK-2, in ex vivo brain preparations activates larval IPCs, as assayed by CRTC::GFP imaging. The authors then examine how the PK2-R1, hugin, and IPC manipulations affect adult sleep. Finally, the authors examine how Ca2+ responses through CRTC::GFP imaging in adult IPCs are influenced by the wash-on of hugin peptides. The conclusions of this paper are somewhat well supported by data, but some aspects of the experimental approach and sleep analysis need to be clarified and extended.

      Strengths:

      (1) This paper builds on previously published studies that examine Drosophila larval sleep regulation. Through the power of Drosophila genetics, this study yields additional insights into what role neuropeptides play in the regulation of Drosophila larval sleep.

      (2) This study utilizes several diverse approaches to examine larval and adult sleep regulation, neural activity, and circuit connections. The impressive array of distinct analyses provides new insight into how Drosophila sleep-wake circuitry is regulated across the lifespan.

      (3) The imaging approaches used to examine IPC activation upon hugin manipulation (either thermogenetic activation or wash-on of peptides) demonstrate a powerful approach for examining how changes in neuropeptidergic signaling affect downstream neurons. These experiments involve precise manipulations as the authors use both in vivo and ex vivo conditions to observe an effect on IPC activity.

      Weaknesses:

      Although the paper does have some strengths in principle, these strengths are not fully supported by the experimental approaches used by the authors. In particular:

      (1) The authors show total sleep amount over an 18-hour period for all the measures of 2nd instar larval sleep throughout the paper. However, published studies have shown that sleep changes over the course of 2nd instar development, so more precise time windows are necessary for the analyses in this study.

      (2) Previously published reports of sleep metrics in both Drosophila larvae and adults include the average number of sleep episodes (bout number) and the average length of sleep episodes (bout length). Neither of these metrics is included in the paper for either the larval sleep or adult sleep data. Not including these metrics makes it difficult for readers to compare the findings in this study to previously published papers in the established Drosophila sleep literature.

      (3) Because Drosophila adult and larval sleep is based on locomotion, the authors need to show the activity values for the experiments supporting their key conclusions. They do show travel distances in Figure 2 - Figure Supplement 1; however, it is not clear how these distances were calculated or how the distances relate to the overall activity of individual larvae during sleep experiments. It is also concerning that inactivation of the PK2-R1-expressing neurons causes a reduction in locomotion speed. This could partially explain the increase in sleep that they observe.

      (4) The authors rely on homozygous mutant larvae and adult flies to support many of their conclusions. They also rely on Gal4 lines with fairly broad expression in the Drosophila brain to support their conclusions. Adding more precise tissue-specific manipulations, including thermogenetic activation and inhibition of smaller populations of neurons in the study would be needed to increase confidence in the presented results. Similarly, demonstrating that larval development and feeding are not affected by the broad manipulations would strengthen the conclusions.

      (5) Many of the experiments presented in this study would benefit from genetic and temperature controls. These controls would increase confidence in the presented results.

      (6) The authors claim that their findings in larvae uncover the circuit basis for larval sleep regulation. However, there is very little comparison to published studies demonstrating that neuropeptides like Dh44 regulate larval sleep. Because hugin-expressing neurons have been shown to be downstream of Dh44 neurons, the authors need to include this as part of their discussion. The authors also do not explain why other neuropeptides in the initial screen are not pursued in the study. Given the effect that these manipulations have on larval sleep in their initial screen, it seems likely that other neuropeptidergic circuits regulate larval sleep.
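
      The bout metrics requested in weakness (2) can be derived from the same per-bin sleep calls that yield total sleep. The following is only a minimal sketch, assuming each time bin has already been scored as asleep (1) or awake (0); the function name `sleep_bout_metrics`, the 6-second bin size, and the example trace are hypothetical and are not taken from the study.

```python
from itertools import groupby


def sleep_bout_metrics(asleep, bin_seconds):
    """Return (bout_number, mean_bout_length_s, total_sleep_s).

    `asleep` is a sequence of 0/1 flags, one per time bin (assumed to be
    pre-scored by whatever sleep criterion the experiment uses).
    """
    # A bout is a maximal run of consecutive "asleep" bins.
    bout_lengths = [
        sum(1 for _ in run) * bin_seconds
        for is_asleep, run in groupby(asleep)
        if is_asleep
    ]
    bout_number = len(bout_lengths)
    total_sleep = sum(bout_lengths)
    mean_bout_length = total_sleep / bout_number if bout_number else 0.0
    return bout_number, mean_bout_length, total_sleep


# Hypothetical trace: three bouts lasting 3, 2, and 1 bins.
example_trace = [0, 1, 1, 1, 0, 0, 1, 1, 0, 1]
metrics = sleep_bout_metrics(example_trace, bin_seconds=6)
```

      Reporting bout number and mean bout length alongside total sleep separates longer sleep episodes from more frequent ones, which a single total cannot do.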

    3. Reviewer #2 (Public review):

      Summary:

      This study examines larval sleep patterns and compares them to sleep regulation in adult flies. The authors demonstrate hallmark sleep characteristics in larvae, including sleep rebound and increased arousal thresholds. Through genetic and behavioral analyses, they identify PK2-R1 as a key receptor involved in sleep modulation, likely via the HuginPC-IPC signaling pathway. Loss of PK2-R1 results in increased sleep, which aligns with previous findings in hugin knockout mutants. While the study presents significant contributions to the field, further investigation is needed to address discrepancies with earlier research and strengthen mechanistic claims.

      Strengths:

      (1) The study explores a relatively understudied aspect of sleep regulation, focusing on larval development.

      (2) The use of an automated behavioral measurement system ensures precise quantification of sleep patterns.

      (3) The findings provide strong genetic and behavioral evidence supporting the role of the HuginPC-IPC pathway in sleep regulation.

      (4) The study has broader implications for understanding the evolution and functional divergence of sleep circuits.

      Weaknesses:

      (1) The manuscript does not sufficiently discuss previous studies, particularly concerning hugin mutants and their metabolic effects.

      (2) The specificity of IPC secretion mechanisms is unclear, particularly regarding potential indirect effects on Dilp2.

      (3) Alternative circuits, such as the HuginPC-DH44 pathway, require further consideration.

      (4) Functional connectivity between HuginPC neurons and IPCs is not directly validated.

      (5) Developmental differences in sleep regulatory mechanisms are not thoroughly examined.

    4. Reviewer #3 (Public review):

      Summary:

      Sleep affects cognition and metabolism, evolving throughout development. In mammals, infants have fast sleep-wake cycles that stabilize in adults via circadian regulation. In this study, the authors performed a genetic screen for neurotransmitters/peptides regulating sleep and identified the neuropeptide Hugin and its receptor PK2-R1 as essential components for sleep in Drosophila larvae. They showed that IPCs express PK2-R1 and silencing IPCs resulted in a significant increase in the sleep amount, which was consistent with the effect they observed in PK2-R1 knock-out mutants. They also showed that Hugin peptides, secreted by a subset of Hugin neurons (Hug-PC), activate IPCs through the PK2-R1 receptor. This activation prompts IPCs to release insulin-like peptides (Dilps), which are implicated in the modulation of sleep. They showed that Hugin peptides induce a PK2-R1 dependent calcium (Ca²⁺) increase in IPCs, which they linked to the release of Dilp3, showing a connection between Hugin signaling to IPCs, Dilp3 release, and sleep regulation. Additionally, the activation of Hug-PC neurons reduced sleep amounts, while silencing them had the opposite effect. In contrast to the larval stage, the Hugin/PK2-R1 axis was not critical for sleep regulation in Drosophila adults, suggesting that this neuropeptidergic circuitry has divergent roles in sleep regulation across different stages of development.

      Strengths:

      This study used an updated system for sleep quantification in Drosophila larvae, and this method allowed precise measurement of larval sleep patterns which is essential for the understanding of sleep regulation.

      The authors performed unbiased genetics screening and successfully identified novel regulators for larval sleep, Hugin and its receptor PK2-R1, making a substantial contribution to the understanding of neuropeptidergic control of sleep regulation.

      They clearly demonstrated a mechanism by which Hugin-expressing neurons modulate sleep: Hugin activates IPCs via PK2-R1, as shown by Ca2+ responses.

      Based on the demonstrated activation of PK2-R1 by the human Hugin orthologue Neuromedin U, research on human sleep disorders may benefit from the discoveries from Drosophila since sleep-regulating mechanisms are conserved across species.

      Weaknesses:

      The study primarily focused on sleep regulation in Drosophila larvae, showing that the Hugin/PK2-R1 axis is critical for larval sleep but not necessary for adult sleep. The effects of the Hugin axis in the adult are, however, incompletely explained and somewhat inconsistent. PK2-R1 knockout adults also display increased sleep, as does HugPC silencing, at least for daytime sleep. The difference lies in Dilp3/5 mutant animals showing decreased sleep and IPCs seemingly responding with reduced Dilp3 release to PK-2 treatment (Figure 6). It seems difficult to reconcile the authors' conclusions regarding this point without additional data. It could be argued that PK2-R1 still regulates adult sleep, but not via Hugin and IPCs/Dilps.

      Another issue might be that the authors show relative sleep levels for adults using Trikinetics monitoring. From the methods, it is not clear if the authors backcrossed their lines to an isogenic wild-type background to normalize for line-specific effects on sleep. Thus, it is likely that each line has differences in total sleep time due to background effects, e.g., their Kir2.1 control line showed reduced sleep relative to the compared genotypes. This might limit the conclusions on the role of Hugin/PK2-R1 in adult sleep.

    1. eLife Assessment

      This valuable study presents an alternative platform for nanobody discovery using phage-displayed synthetic libraries. The evidence supporting the platform, which is used to isolate and validate nanobodies targeting Drosophila secreted proteins, is compelling. By making this library openly accessible, the authors provide an excellent resource to the wider scientific community. The detailed protocol used in this manuscript, associated with various methods for nanobody screening, provides an alternative and reliable platform for nanobody discovery.

    2. Reviewer #1 (Public review):

      Summary:

      Using highly specific antibody reagents for biological research is of prime importance. In the past few years, novel approaches have been proposed to gain easier access to such reagents. This manuscript describes an important step forward toward the rapid and widespread isolation of antibody reagents. Via the refinement and improvement of previous approaches, the Perrimon lab describes a novel phage-displayed synthetic library for nanobody isolation. They used the library to isolate nanobodies targeting Drosophila secreted proteins. They used these nanobodies in immunostainings and immunoblottings, as well as in tissue immunostainings and live cell assays (by tethering the antigens on the cell surface).

      Since the library is made freely available, it will contribute to gaining access to better research reagents for non-profit use, an important step towards the democratisation of science.

      Strengths:

      (1) New design for a phage-displayed library of high content.

      (2) Isolation of valuable novel tools.

      (3) Detailed description of the methods such that they can be used by many other labs.

      Weaknesses:

      My comments largely concentrate on the representation of the data in the different Figures.

    3. Reviewer #2 (Public review):

      Summary:

      In this study, the authors propose an alternative platform for nanobody discovery using a phage-displayed synthetic library. The authors relied on DNA templates originally created by McMahon et al. (2018) to build the yeast-displayed synthetic library. To validate their platform, the authors screened for nanobodies against 8 Drosophila secreted proteins. Nanobody screening has been performed with phage-displayed nanobody libraries followed by an enzyme-linked immunosorbent assay (ELISA) to validate positive hits. Nanobodies with higher affinity have been tested for immunostaining and immunoblotting applications using Drosophila adult guts and hemolymph, respectively.

      Strengths:

      The authors presented a detailed protocol with various and complementary approaches to select nanobodies and test their application for immunostaining and immunoblotting experiments. Data are convincing and the manuscript is well-written, clear, and easy to read.

      Weaknesses:

      Of the eight Drosophila secreted proteins selected to screen for nanobodies, the authors failed to identify nanobodies for three. While the authors mentioned potential improvements of the protocol in the discussion, none of them have been tested in this manuscript.

      The same comment applies to the experiments using membrane-tethered forms of the antigens to test the affinity of nanobodies identified by ELISA. Many nanobodies fail to recognize the antigens. While the authors suggested a low affinity of these nanobodies for their antigens, this hypothesis has not been tested in the manuscript.

      Improving the protocol at each step for nanobody selection would greatly increase the success rate for the discovery of nanobodies with high affinity.

    1. eLife Assessment

      This valuable study determines the functional requirements for localization and activity of S. cerevisiae septin-associated kinases using in vivo imaging, in vitro and in vivo protein-protein interaction assays, and an instructive in vivo "tethering" approach. In addition to confirming previous results, the study offers evidence that the septin-associated kinases may directly interact with the contractile ring machinery. Although the experiments appear to have been conducted correctly, the quantitative analysis of some experiments is incomplete and should be improved to strengthen the conclusions.

    2. Reviewer #1 (Public review):

      Summary:

      The authors wanted to better understand how the various septin-associated kinases contribute to septin organization and function in budding yeast. This question has recently been addressed by similar kinds of studies, but there are still some open questions, particularly regarding the extent to which the kinases may interact with and/or modify components of the contractile ring that drives cytokinesis.

      Strengths:

      This study uses sensitive imaging with good temporal and spatial resolution to monitor the localization of various proteins in living cells. Particularly informative is the use of a GFP/GFP-binding-protein "tethering" approach to ask if the requirement for one protein can be bypassed by physically tethering another protein to a third protein. Results from a yeast two-hybrid assay for measuring protein-protein interactions in vivo are buttressed by direct in vitro binding assays using purified proteins, which is important given the likelihood of "bridging" interactions between yeast proteins in the two-hybrid approach. The authors' conclusions are quite well supported by the data.

      Weaknesses:

      A control for non-specific binding is missing from the in vitro binding assay. The figures suffer sometimes from the very small text in the labels, which obscures understanding. Ultimately, while the study provides some interesting and novel insights, we still don't understand which phosphorylation events on which proteins are important for the events occurring at the molecular level, so the advance in knowledge is somewhat incremental.

    3. Reviewer #2 (Public review):

      Summary:

      In this paper, Bhojappa et al. provide insights into the function of septin-related kinases Elm1, Gin4, Hsl1, and Kcc4 in septin organization and actomyosin ring (AMR) structure and constriction. Their findings are both corroborative of and complementary to previous related studies.

      First, the authors provide a comparative analysis of the dynamic localization of these kinases at the bud neck, as well as a comparative analysis of defects in septin localization, splitting dynamics, AMR constriction rates, and cell morphology in kinase-deficient cells. They find that septin localization and splitting kinetics, as well as AMR constriction rates, are significantly perturbed in elm1∆ and gin4∆ mutants but remain largely unaffected in hsl1∆ and kcc4∆. A similar trend is observed in terms of cell morphology and viability.

      Next, the authors focus on elm1∆ and gin4∆ cells, demonstrating that the residence time of the F-BAR protein Hof1 is significantly increased and defective in these mutants. Using yeast two-hybrid (Y2H) and in vitro binding assays, they show that the KA1 domain of Gin4 interacts with the F-BAR domain of Hof1, which may explain the cytokinesis-related functions of Elm1 and Gin4. Supporting this, they find that Gin4's role in septin localization, AMR constriction kinetics, and Hof1 bud neck localization is kinase-independent.

      The authors then conduct a series of artificial tethering experiments, given that the bud neck localization of these kinases is mostly interdependent. They first demonstrate that artificially tethering Gin4 to the bud neck rescues the morphology defects of elm1∆ cells, with the strongest rescue observed when Gin4 was forced to interact with Hsl1, an effect that was also kinase-independent. Additionally, artificial tethering of Hsl1 to the bud neck restores the morphology of elm1∆ cells in a KA1 domain-dependent manner, suggesting that Hsl1 functions downstream of Elm1 to maintain normal cell morphology. Consistently, artificial tethering of Elm1 to the bud neck in gin4∆ cells rescues morphology defects, as well as defects in Myo1 localization and AMR constriction, but only in the presence of full-length Hsl1. The rescue fails in the absence of Hsl1 or when using a version of Hsl1 lacking the KA1 domain, which supports the role of Hsl1 downstream of Elm1 in cytokinesis.

      Strengths:

      Altogether, this study offers valuable insights into the mode of cytokinesis regulation mediated by the septin-related kinases, mainly Elm1, Gin4, and Hsl1, and would be an important contribution to the field of septins and cytokinesis after addressing current weaknesses.

      Weaknesses:

      (1) When assessing rescue of the elm1∆ phenotype, it needs to become clearer whether only morphology or also cytokinesis and septin organization are rescued.

      (2) The quantification of the microscopy data does not always match up with the example images, and it's not always clear how the authors quantitatively analyzed their data.

      (3) The forced tethering data are key to the paper, but the lack of a summarizing table makes it difficult to grasp the full picture.

      (4) Novel results and those confirming earlier results could be better distinguished.

    4. Reviewer #3 (Public review):

      Summary:

      The study by Bhojappa et al. brings new and interesting elements about the stability of the septin ring and the crosstalk between septin and actomyosin ring assemblies. The study focuses on the four kinases associated with the septin ring, Elm1p, Gin4p, Hsl1p, and Kcc4p. Elm1p and Gin4p show strong knock-out phenotypes, whereas Hsl1p and Kcc4p show weak knock-out phenotypes. The Elm1p/Kcc4p and Gin4p/Hsl1p pairs show similar timing at the bud neck. While these kinases share redundant functions, Gin4 appears to have a unique interaction with the BAR domain protein Hof1, revealing a novel direct interaction between the septin and actomyosin rings. Interestingly, the kinase activity of Gin4 is not required for its role in septin organisation and AMR constriction. The last part of the manuscript shows an original protein tethering protocol used to show that Hsl1 and its membrane binding ability are required for phenotype rescue of gin4-null cells.

      Strengths:

      The combination of genetics, cell imaging, and biochemical characterization of protein-protein interactions is attractive.

      Weaknesses:

      (1) Imaging and data analysis is the main weakness of this manuscript. The authors must avoid manual counting and selection when easy analysis software can be used to limit bias. Instead of presenting unclear statistics of "percentage phenotypes", they need to define clear metrics to offer meaningful phenotype analysis.

      (2) This manuscript examines a very complex mechanism with four kinases of overlapping function using new data and existing literature. A clearer picture/model at the end of the manuscript that synthesizes the current knowledge would be beneficial.

    1. eLife Assessment

      This important study presents single-unit activity collected during model-based (MB) and model-free (MF) reinforcement learning in non-human primates. The dataset was carefully collected, and the statistical analyses, including the modeling, are rigorous. The evidence convincingly supports different roles for particular cortical and subcortical areas in representing key variables during reinforcement learning.

    2. Reviewer #1 (Public review):

      Summary:

      Using single-unit recording in 4 regions of non-human primate brains, the authors tested whether these regions encode computational variables related to model-based and model-free reinforcement learning strategies. While some of the variables seem to be encoded by all regions, there is clear evidence for stronger encoding of model-based information in the anterior cingulate cortex and caudate.

      Strengths:

      The analyses are thorough, the writing is clear, and the work is well-motivated by prior theory and empirical studies.

      Weaknesses:

      My comments here are quite minor.

      The correlation between transition and reward coefficients is interesting, but I'm a little worried that this might be an artifact. I suspect that reward probability is higher after common transitions, due to the fact that animals are choosing actions they think will lead to higher reward. This suggests that the coefficients might be inevitably correlated by virtue of the task design and the fact that all regions are sensitive to reward. Can the authors rule out this possibility (e.g., by simulation)?
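
      The simulation suggested here can be sketched very simply. The code below is only an illustration of the suspected confound under assumed values, not the authors' task or analysis: the function name `simulate_two_step`, the transition probability (0.7), the reward probabilities (0.8 and 0.2), and the agent's choice bias (0.9) are all hypothetical. It asks whether an agent that usually picks the first-stage action commonly leading to the better second-stage state is rewarded more often after common than after rare transitions.

```python
import random


def simulate_two_step(n_trials=100_000, p_common=0.7, p_reward_good=0.8,
                      p_reward_bad=0.2, p_choose_good=0.9, seed=0):
    """Return the empirical reward rate after common and rare transitions.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    counts = {"common": [0, 0], "rare": [0, 0]}  # [rewarded, total]
    for _ in range(n_trials):
        # Agent usually picks the action that commonly leads to the good state.
        chose_good = rng.random() < p_choose_good
        common = rng.random() < p_common
        # Good state is reached by (good action, common) or (bad action, rare).
        in_good_state = chose_good == common
        p_reward = p_reward_good if in_good_state else p_reward_bad
        rewarded = rng.random() < p_reward
        key = "common" if common else "rare"
        counts[key][0] += rewarded
        counts[key][1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}


rates = simulate_two_step()
```

      Under these assumed values the expected reward rates are 0.9 × 0.8 + 0.1 × 0.2 = 0.74 after common transitions and 0.9 × 0.2 + 0.1 × 0.8 = 0.26 after rare ones, so transition type and reward would be correlated by design. Whether this carries over to the neural regression coefficients would still need to be checked against the actual task parameters and data.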

      The explore/exploit section seems somewhat randomly tacked on. Is this really relevant? If yes, then I think it needs to be integrated more coherently.

    3. Reviewer #2 (Public review):

      Summary:

      The authors investigate single-neuron activity in rhesus macaques during model-based (MB) and model-free (MF) reinforcement learning (RL). Using a well-established two-step choice task, they analyze neural correlates of MB and MF learning across four brain regions: the anterior cingulate cortex (ACC), dorsolateral PFC (DLPFC), caudate, and putamen. The study provides strong evidence that these regions encode distinct RL-related signals, with ACC playing a dominant role in MB learning and caudate updating value representations after rare transitions. The authors apply rigorous statistical analyses to characterize neural encoding at both population and single-neuron levels.

      Strengths:

      (1) The research fills a gap in the literature, which has been limited in directly dissociating MB vs. MF learning at the single unit level and across brain areas known to be involved in reinforcement learning. This study advances our understanding of how different brain regions are involved in RL computations.

      (2) The study used a two-step choice task (Miranda et al., 2020), which was previously established for distinguishing MB and MF reinforcement learning strategies.

      (3) The use of multiple brain regions (ACC, DLPFC, caudate, and putamen) in the study enabled comparisons across cortical and subcortical structures.

      (4) The study used multiple GLMs, population-level encoding analyses, and decoding approaches. With each analysis, they conducted the appropriate controls for multiple comparisons and described their methods clearly.

      (5) They implemented control regressors to account for neural drift and temporal autocorrelation.

      (6) The authors showed evidence for three main findings:
        a) ACC as the strongest encoder of MB variables from the four areas, which emphasizes its role in tracking transition structures and reward-based learning. The ACC also showed sustained representation of feedback that went into the next trial.
        b) ACC was the only area to represent both MB and MF value representations.
        c) The caudate selectively updates value representations when rare transitions occur, supporting its role in MB updating.

      (7) The findings support the idea that MB and MF reinforcement learning operate in parallel rather than strictly competing.

      (8) The paper also discusses how MB computations could be an extension of sophisticated MF strategies.

      Weaknesses:

      (1) There is limited evidence for a causal relationship between neural activity and behavior. The authors cite previous lesion studies, but causality between neural encoding in ACC, caudate, and putamen and behavioral reliance on MB or MF learning is not established.

      (2) There is a heavy emphasis on ACC versus other areas, but it is unclear how much of this signal drives behavior relative to the caudate.

      (3) The role of the putamen is somewhat underexplored here.

      (4) The authors mention the monkeys were overtrained before recording, which might have led to a bias in the MB versus MF strategy.

      (5) The GLM3 model combines MB and MF value estimates but does not clearly mention how hyperparameters were optimized to prevent overfitting. While the hybrid model explains behavior well, it does not clarify whether MB/MF weighting changes dynamically over time.

      (6) It was unclear from the task description whether the images used changed periodically or how the transition effect (e.g., in Figure 3) could be disambiguated from a visual response to the pair of cues.
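
      For readers unfamiliar with the MB/MF weighting raised in weakness (5), one common formulation in the two-step task literature is a weighted mixture of the two value estimates. The sketch below assumes that form purely for illustration; whether the authors' hybrid model is parameterized this way is not stated here, and the function names, the weight `w`, and all numbers are hypothetical.

```python
def model_based_value(transition_probs, stage2_values):
    """Expected second-stage value of a first-stage action under a known
    transition model (assumed form, for illustration only)."""
    return sum(p * v for p, v in zip(transition_probs, stage2_values))


def hybrid_value(q_mb, q_mf, w):
    """Mix model-based and model-free values with a fixed weight w in [0, 1]."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie between 0 and 1")
    return w * q_mb + (1.0 - w) * q_mf


# Hypothetical numbers: 70/30 transitions to states worth 0.8 and 0.2.
q_mb = model_based_value([0.7, 0.3], [0.8, 0.2])
q_net = hybrid_value(q_mb, q_mf=0.4, w=0.5)
```

      In this form a single fixed `w` cannot capture a shift in strategy over time; addressing the reviewer's question would require letting the weight vary, for example by fitting it separately per session.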

    1. eLife Assessment

      This is an important study that connects the polymerase-associated factor 1 complex (Paf1C) with Histone 2B monoubiquitination and the expression of genes key to virulence in Cryptococcus neoformans. The provided information is convincing and has the potential to open several opportunities to further understand the basic biology of this significant human fungal pathogen.

    2. Reviewer #1 (Public review):

      In the manuscript entitled "Rtf1 HMD domain facilitates global histone H2B monoubiquitination and regulates morphogenesis and virulence in the meningitis-causing pathogen Cryptococcus neoformans" by Jiang et al., the authors employ a combination of molecular genetics and biochemical approaches, along with phenotypic evaluations and animal models, to identify the conserved subunit of the Paf1 complex (Paf1C), Rtf1, and functionally characterize its critical roles in mediating H2B monoubiquitination (H2Bub1) and the consequent regulation of gene expression, fungal development, and virulence traits in C. deneoformans or C. neoformans. Specifically, the authors found that the histone modification domain (HMD) of Rtf1 is sufficient to promote H2B monoubiquitination (H2Bub1) and the expression of genes related to fungal mating and filamentation, and restores the fungal morphogenesis and pathogenicity defects caused by RTF1 deletion. These findings highlight the critical contribution of Rtf1's HMD to epigenetic regulation and cryptococcal virulence. This work will be of interest to fungal biologists and medical mycologists, particularly those studying fungal epigenetic regulation and fungal morphogenesis.

      Comments on revisions:

      The revised manuscript addresses all my previous concerns satisfactorily.

    3. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In the manuscript entitled "Rtf1 HMD domain facilitates global histone H2B monoubiquitination and regulates morphogenesis and virulence in the meningitis-causing pathogen Cryptococcus neoformans" by Jiang et al., the authors employ a combination of molecular genetics and biochemical approaches, along with phenotypic evaluations and animal models, to identify the conserved subunit of the Paf1 complex (Paf1C), Rtf1, and functionally characterize its critical roles in mediating H2B monoubiquitination (H2Bub1) and the consequent regulation of gene expression, fungal development, and virulence traits in C. deneoformans or C. neoformans. Specifically, the authors found that the histone modification domain (HMD) of Rtf1 is sufficient to promote H2B monoubiquitination (H2Bub1) and the expression of genes related to fungal mating and filamentation, and restores the fungal morphogenesis and pathogenicity defects caused by RTF1 deletion.

      Strengths:

      The manuscript is well-written and presents the findings in a clear manner. The findings are interesting and contribute to a better understanding of Rtf1-mediated epigenetic regulation of fungal morphogenesis and pathogenicity in a major human fungal pathogen, and potentially in other fungal species, as well.

      Weaknesses:

      A major limitation of this study is the absence of genome-wide information on Rtf1-mediated H2B monoubiquitination (H2Bub1), as well as a lack of detail regarding the function of the Plus3 domain. Although overexpression of HMD in the rtf1Δ mutant restored global H2Bub1 levels, it did not rescue certain critical biological functions, such as growth at 39 °C and melanin production (Figure 4C-D). This suggests that the precise positioning of H2Bub1 is essential for Rtf1's function. A comprehensive epigenetic landscape of H2Bub1 in the presence of HMD or full-length Rtf1 would elucidate potential mechanisms and shed light on the function of the Plus3 domain.

      We thank the reviewer (and the other reviewers) for this excellent suggestion. We have conducted CUT&Tag assays with the WT, the rtf1Δ mutant, and complemented strains expressing either full-length Rtf1 or the HMD alone, cultured at 30 and 39 °C. We indeed found that the epigenetic landscape of H2Bub1 differs between strains expressing the HMD and those expressing full-length Rtf1. These results strongly suggest that the distribution of H2Bub1 is regulated by Rtf1, and that H2B modifications at specific loci in the chromosome may contribute to thermal tolerance in C. neoformans. These new findings from the CUT&Tag assays shed light on the mechanism of thermal tolerance, but we decided not to include them in the current manuscript.

      Reviewer #2 (Public Review):

      Summary:

      The authors set out to determine the role of Rtf1 in Cryptococcal biology, and demonstrate that Rtf1 acts independently of the Paf1 complex to exert regulation of Histone H2B monoubiquitylation (H2Bub1). The biological impact of the loss of H2Bub1 was observed in defects in morphogenesis, reduced production of virulence factors, and reduced pathogenic potential in animal models of cryptococcal infection.

      Strengths:

      The molecular data is quite compelling, demonstrating that the Rtf1-dependent functions require only this histone modifying domain of Rtf1, and are dependent on nuclear localization. A specific point mutation in a residue conserved with the Rtf1 protein in the model yeast demonstrates the conservation of that residue in H2Bub1 modification. Interestingly, whereas expression of the HMD alone suppressed the virulence defect of the rtf1 deletion mutant, it did not suppress defects in virulence factor production.

      Weaknesses:

      The authors use two different species of Cryptococcus to investigate the biological effect of Rtf1 deletion. The work on morphogenesis utilized C. deneoformans, which is well-known to be a robust mating strain. The virulence work was performed in the C. neoformans H99 background, which is a highly pathogenic isolate. The study would be more complete if each of these processes were assessed in the other strain to understand if these biological effects are conserved across the two species of Cryptococcus. H99 is not as robust in morphogenesis, but reproducible results assessing mating and filamentation in this strain have been performed. Similarly, C. deneoformans does produce capsule and melanin.

      We thank the reviewer for the suggestion. We have conducted assays to quantify both capsule and melanin production in both the C. neoformans and C. deneoformans strain backgrounds. We found that capsule production was affected in the same pattern in these two serotypes. Interestingly, we found that cell size was significantly affected by deletion of RTF1 in both serotypes. In addition, melanin production was reduced by the deletion of RTF1 in both serotypes; however, complementation with Plus3 or mutated alleles of HMD gave different phenotypes in these two serotypes. These new findings were included in Figure 4 in the revised manuscript.

      There are some concerns with the conclusions related to capsule induction. The images reported in Figure B are purported to be grown under capsule-inducing conditions, yet the H99 panel is not representative of the induced capsule for this strain. Given the lack of a baseline of induction, it is difficult to determine if any of the strains may be defective in capsule induction. Quantification of a population of cells with replicates will also help to visualize the capsular diversity in each strain population.

      We thank the reviewer for raising this concern. We have tested capsule production under capsule-inducing conditions on 10% fetal bovine serum (FBS) agar medium [1]. Under these conditions, the capsule layers surrounding the cells were obvious. We also included a non-capsule-producing control in our assay to help visualize the capsule. In addition, we quantified the ratio between the diameters of the capsule layer and the cell body to show the capsular diversity in each strain population. The results were included in Figure 4 in the revised manuscript.

      The authors demonstrate that for specific mating-related genes, the expression of the HMD recapitulated the wild-type expression pattern. The RNA-seq experiments were performed under mating conditions, suggesting specificity under this condition. The authors raise the point in the discussion that there may be differences in Rtf1 deposition on chromatin in H99, and under conditions of pathogenesis. The data that overexpression of HMD restores H2Bub1 by western is quite compelling, but does not address at which promoters H2Bub1 is modulating expression under pathogenesis conditions, and when full-length Rtf1 is present vs. only the HMD.

      We thank the reviewer for raising these concerns. Please see our response to Reviewer #1.

      Reviewer #3 (Public Review):

      Summary:

      In this very comprehensive study, the authors examine the effects of deletion and mutation of the Paf1C protein Rtf1 gene on chromatin structure, filamentation, and virulence in Cryptococcus.

      Strengths:

      The experiments are well presented and the interpretation of the data is convincing.

      Weaknesses:

      Yet, one can be frustrated by the lack of experiments that attempt to directly correlate the change in chromatin structure with the expression of a particular gene and the observed phenotype. For example, the authors observed a strong defect in the expression of ZNF2, a known regulator of filamentation, mating, and virulence, in the rtf1 mutant. Can this defect explain the observed phenotypes associated with the RTF1 mutation? Is the observed defect in melanin production associated with altered expression of laccase genes and altered chromatin structure at this locus?

      We completely agree with the reviewer. We have conducted CUT&Tag assays and checked Rtf1-mediated H2Bub1 at these particular gene loci. We found that the distribution of H2Bub1 at the promoter region of ZNF2 and the gene body of the laccase-encoding gene varied, possibly due to RTF1 mutation. We would like to save these preliminary findings for another story and not include them in this manuscript, as we mentioned in the response to Reviewer #1.

      (1) Jang, E.-H., et al., Unraveling Capsule Biosynthesis and Signaling Networks in Cryptococcus neoformans. Microbiology Spectrum, 2022. 10(6): p. e02866-22.

    1. eLife Assessment

      Goswami and colleagues used rod-specific Gls1 (the gene encoding glutaminase 1) knockout mice to investigate the role of GLS1 in photoreceptor health when GLS1 was deleted from developing or adult photoreceptor cells. This study is fundamental as it shows the critical role of glutamine catabolism in photoreceptor cell health using in vivo model systems. The evidence supporting the authors' claims is compelling. The studies add new insight into how specific metabolites support vision.

    2. Reviewer #1 (Public review):

      Summary:

      The authors show for the first time that deleting GLS from rod photoreceptors results in the rapid death of these cells. The death of photoreceptor cells could result from loss of synaptic activity because of a decrease in glutamate, as has been shown in neurons, changes in redox balance, or nutrient deprivation.

      Strengths:

      The strength of this manuscript is that the authors show a similar phenotype in the mice when Gls was knocked out early in rod development or in the adult rod. They showed that rapid cell death is through apoptosis, and there is an increase in the expression of genes responsive to oxidative stress.

      Comments on revisions:

      The authors addressed all of my concerns in their responses to reviewers.

    3. Reviewer #2 (Public review):

      Summary:

      Photoreceptor neurons are crucial for vision, and discovering pathways necessary for photoreceptor health and survival can open new avenues for therapeutics. Studies have shown that metabolic dysfunction can cause photoreceptor degeneration and vision loss, but the metabolic pathways maintaining photoreceptor health are not well understood. This is a fundamental study that shows that glutamine catabolism is critical for photoreceptor cell health using in vivo model systems.

      Strengths:

      The data are compelling, and the consideration of potential confounding factors (such as glutaminase 2 expression) and additional experiments to examine the synaptic connectivity and inner retina added strength to this work. The authors were also careful not to overstate their claims, but to provide solid conclusions that fit the results and data provided in their study. The findings linking asparagine supplementation and the inhibition of the integrated stress response to glutamine catabolism within the rod photoreceptor cell are intriguing and innovative. Overall, the authors provide convincing data to highlight that photoreceptors utilize various fuel sources to meet their metabolic needs, and that glutamine is critical to these cells for their biomass, redox balance, function and survival.

    4. Reviewer #3 (Public review):

      Summary:

      The authors explored the role of GLS, a glutaminase, which is an enzyme that catalyzes the conversion of glutamine to glutamate, in rod photoreceptor function and survival. The loss of GLS was found to cause rapid autonomous death of rod photoreceptors.

      Strengths:

      Interesting and novel phenotype. Two types of cre-lines were rigorously used to knock out the Gls gene in rods. Both of the conditional knockouts led to a similar phenotype, i.e. rod death. Histology and ERG were carefully done to characterize the loss of rods over specific ages. A necessary metabolomic study was performed and appreciated. Some rescue experiments were performed and revealed possible mechanisms.

      Weaknesses:

      No major weaknesses. Mechanism of GLS-loss induced rod death could be followed up in the future, and same for GLS's role in cones. Authors have addressed all minor points raised by this reviewer.

    5. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors show for the first time that deleting GLS from rod photoreceptors results in the rapid death of these cells. The death of photoreceptor cells could result from loss of synaptic activity because of a decrease in glutamate, as has been shown in neurons, changes in redox balance, or nutrient deprivation.

      Strengths:

      The strength of this manuscript is that the authors show a similar phenotype in the mice when Gls was knocked out early in rod development or in the adult rod. They showed that rapid cell death is through apoptosis, and there is an increase in the expression of genes responsive to oxidative stress.

      We thank the reviewer for their time reviewing the manuscript and their comments regarding the potential mechanism(s) by which rod photoreceptors rapidly degenerate upon knockout of GLS.

      Weaknesses:

      In this manuscript, the authors show a "metabolic dependency of photoreceptors on glutamine catabolism in vivo". However, there is a potential bias in their thinking that glutamine metabolism in rods is similar to cancer cells where it feeds into the TCA cycle. They should consider that as in neurons, GLS1 activity provides glutamate for synaptic transmission. The modest rescue shown by providing α-ketoglutarate in the drinking water suggests that glutamine isn't a key metabolic substrate for rods when glucose is plentiful. The ERG studies performed on the iCre-Glsflox/flox mice showed a large decrease in the scotopic b wave at saturating flashes which could indicate a decrease in glutamate at the rod synapse as stated by the authors. While EM micrographs of wt and iCre-Glsflox/flox mice were shown for the outer retina at p14, the synapse of the rods needs to be examined by EM.

      We agree with the reviewer that in the presence of sufficient glucose, it appears a lack of GLS-driven glutamine (Gln) catabolism does not drastically alter the levels of TCA cycle metabolites or mitochondrial function as we demonstrated in Figure 4, and supplementation with alpha-ketoglutarate improved outer nuclear layer thickness by only a small amount as observed in Figure 5e. Hence, as we stated in the Results and Discussion, at least in the mouse where Gls is selectively deleted from rod photoreceptors by crossing Gls<sup>fl/fl</sup> mice with Rho-Cre mice (Gls<sup>fl/fl</sup>; Rho-Cre<sup>+</sup>, cKO), Gln’s role in supporting the TCA cycle is not the major mechanism by which rod photoreceptors utilize Gln to suppress apoptosis.

      With regards to GLS-driven Gln catabolism providing glutamate (Glu) for synaptic transmission, we again agree with the reviewer that Glu is an important excitatory neurotransmitter, but it is also a key metabolite necessary for the synthesis of glutathione, amino acids, and proteins. As noted and discussed at length in the manuscript, a lack of GLS-driven Gln catabolism in rod photoreceptors leads to reduced levels of oxidized glutathione (Figure 4D) possibly signaling an overall reduction in the biosynthesis of glutathione as Glu is directly and indirectly responsible for its synthesis. Furthermore, Gln and GLS-derived Glu play a central role in the biosynthesis of several nonessential amino acids and proteins. To this end, we see a reduction in the level of Glu, which is the product of the GLS reaction and further confirms the loss of GLS function. We also noted a significant decrease in aspartate (Asp), which can be constructed from the carbons and nitrogens of Gln as discussed at length in the manuscript (Figure 6A). Finally, we noted a significant decrease in global protein synthesis in the cKO retina as compared to the wild-type animal as well (Figure 6E). Therefore, the data suggest that GLS-driven Gln catabolism is critical for amino acid metabolism and protein synthesis and to some degree redox balance; although, the small but statistically significant changes in oxidized glutathione, NADP/NADPH, and redox gene expression may not fully account for the rapid and complete photoreceptor degeneration observed. Future studies are necessary to shed light on the role of redox imbalance in this novel transgenic mouse model.

      Glu also plays a role in synaptic transmission, and we considered this scenario as described in Figure 1 – figure supplement 5. Here, the synaptic connectivity between photoreceptors and the inner retina did not demonstrate significant differences in the labeling of photoreceptor synaptic membranes in the outer plexiform layer nor alterations in the labeling of a key protein (Bassoon) in ribbon synapses. These data suggest that the synaptic connectivity between photoreceptors and second-order neurons was unaltered at P14 in the cKO retina, which is the time just prior to rapid photoreceptor degeneration when Glu was shown to be decreased (Figure 6A).

      With regard to the ERG changes noted in Figure 2, we agree with the reviewer that a large decrease was noted in the scotopic b-wave at P21 and P42 in the cKO. We also agree that, to obtain greater insight into these ERG changes, the ribbon synapse in EM images can be examined. The EM images shown in Figure 1 – figure supplement 4 are from P21, which coincides with the age at which the ERG changes were first noted and when significant photoreceptor degeneration has already occurred. These images were utilized to assess the ribbon synapse for the revised version of the manuscript. As now shown in Figure 1 – figure supplement 4D, ribbon synapses are intact in WT animals as denoted by the yellow boxes. Similarly, the ribbons (yellow arrows) appear structurally intact in the photoreceptors that remain in the P21 cKO retina. These results are in accordance with the lack of significant differences in the labeling of photoreceptor synaptic membranes in the outer plexiform layer as well as the lack of alterations in the labeling of a key protein (Bassoon) in ribbon synapses (Figure 1 – figure supplement 5A and B). While we cannot fully rule out that the decrease in glutamate is altering synaptic transmission, our structural data suggest the synapses remain intact. These data have been added to the revised manuscript.

      However, an even larger reduction in the scotopic a-wave was noted at these ages as well. In animal models that disrupt photoreceptor synaptic function (Dick et al. Neuron. 2003; Johnson et al. J Neuroscience. 2007; Haeseleer et al. Nature Neuroscience. 2004; Chang et al. Vis Neurosci. 2006), a more negative ERG pattern is typically observed with the b-wave altered to a much larger degree than the a-wave. Additionally, in these models that disrupt photoreceptor synaptic transmission, the overall structure of the retina with respect to thickness is maintained (Dick et al. Neuron. 2003) or noted to have modest changes in the outer plexiform layer within the first two months of age with the outer nuclear layer not significantly altered until 8-10 months of age (Haeseleer et al. Nature Neuroscience. 2004). In contrast, a rapid decline in the outer nuclear layer thickness was observed in the cKO retina after P14 likely contributing to the ERG changes noted in Figure 2. Also, Gln is catabolized to Glu primarily by GLS as suggested by the approximately 50% reduction in Glu levels in the cKO retina (Figure 6A), but other enzymes are also capable of catabolizing Gln to Glu, so Glu levels in the rod photoreceptors are unlikely to be zero. Coupling this with the fact that rods are equipped with a self-sufficient Glu recollecting system at their synaptic terminals (Hasegawa et al. Neuron. 2006; Winkler et al. Vis Neurosci. 1999) and that GLS activity is at least two-fold higher in the photoreceptor inner segments, which support energy production and metabolism, than any other layer in the retina (Ross et al. Brain Res. 1987) suggests that altered synaptic transmission secondary to reduced levels of Glu likely does not account in full for the rapid and robust photoreceptor degeneration observed in the cKO retina.

      The authors note that the outer segments are shorter but they do not address whether there is a decrease in the number of cones.

      We have adjusted Figure 2E by removing the GLS staining to better highlight the secondary degeneration of cone outer segments, the main point of the Figure, as we had already shown that GLS was cleanly knocked out of rod photoreceptors in Figure 1. Furthermore, qualitatively the number of cones appears the same at P14, P21, and P42 between the WT and cKO, which is consistent with other retinal degeneration models, like rd1 and rd10, where cones do not begin to die until all the rods have degenerated (Xue et al. eLife. 2021).

      Rod-specific Gls KO mice with an inducible promoter were generated by crossing Pde6g-CreERT2 mice with mice homozygous for either the WT or floxed Gls allele (IND-cKO). In Figure 3 the authors document by western blots and antibody labeling that GLS1 expression is lost in the IND-cKO 10 days post tamoxifen. OCT images show a decrease in the thickness of the outer nuclear layer between 17 and 38 days post-TAM. ERGs should be performed on the animals at 10 and 30 days post TAM, before and after major structural changes in rod photoreceptor cells, to determine if changes in light-stimulated responses are observed. These studies could help to parse out the cause of photoreceptor cell death.

      We agree with the reviewer that the IND-cKO is a useful tool to help parse out the cause of photoreceptor cell death in this model as well as shed light on the role of GLS-driven Gln catabolism in photoreceptor synaptic transmission as discussed at length above. Hence, ERG analyses were performed 10 days post TAM, before major structural changes in the ONL are observed. Interestingly, ERG demonstrated statistically significant reductions in the IND-cKO scotopic a- and b-waves as compared to the WT 10 days post TAM. Similarly, photopic ERG demonstrated statistically significant decreases in the b-wave of the IND-cKO retina. These data suggest that GLS-driven Gln catabolism plays a significant role not only in rod photoreceptor survival but in their function as well. These data have been added to Figure 3H-I and discussed in the corresponding manuscript text.

      To this end, as discussed below and added to Figure 6 – figure supplement 1, amino acid levels, including glutamate (Glu), are already reduced 10 days post TAM. Reductions in the level of Glu may impact synaptic transmission and, as a result, the scotopic b-wave. However, as noted above, altered synaptic transmission secondary to reduced levels of Glu likely does not account in full for the rapid and robust photoreceptor degeneration observed in the cKO retina, as the b-wave to a-wave ratio is not significantly altered in the IND-cKO retina as compared to the WT retina, suggesting that loss of GLS-driven Gln catabolism impairs both to a similar degree.
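      The b-wave to a-wave ratio argument can be illustrated with a small arithmetic sketch. The function name and all amplitudes below are hypothetical placeholders chosen for illustration; they are not measurements from this study. The sketch only shows why a proportional loss of both waves leaves the ratio unchanged, whereas a selective b-wave loss (the pattern expected from a primarily synaptic defect) lowers it.

```python
def b_to_a_ratio(a_wave_uv: float, b_wave_uv: float) -> float:
    """Return the b-wave / a-wave amplitude ratio, using magnitudes only."""
    if a_wave_uv == 0:
        raise ValueError("a-wave amplitude must be non-zero")
    return abs(b_wave_uv) / abs(a_wave_uv)


# Placeholder amplitudes in microvolts, invented for illustration only.
wt_ratio = b_to_a_ratio(a_wave_uv=200.0, b_wave_uv=400.0)

# Both waves halved: the ratio is preserved.
proportional_ratio = b_to_a_ratio(a_wave_uv=100.0, b_wave_uv=200.0)

# a-wave preserved, b-wave reduced: the ratio falls.
selective_b_loss_ratio = b_to_a_ratio(a_wave_uv=200.0, b_wave_uv=100.0)

print(wt_ratio, proportional_ratio, selective_b_loss_ratio)
```

      Under these assumed values, the first two ratios are equal and the third is lower, which is the distinction the ratio comparison is meant to capture.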

      Additionally, Pde6g is expressed by rods to a significant degree but also by cones (GSE63473, scRNAseq data). Therefore, the IND-cKO mouse likely knocks out GLS from both rods and cones, which is in accordance with the immunofluorescence image in Figure 3B where GLS is not observed in rod or cone inner segments, unlike in Figure 1B where GLS remains in cones. Hence, the reduction in the photopic b-wave may indicate that loss of GLS-driven Gln catabolism in cones impairs synaptic transmission. As noted in our reply to reviewer #3’s comments, we have generated mice lacking GLS in cone photoreceptors specifically and are currently elucidating the role of GLS in cone photoreceptor metabolism, function, and survival. These results will be published in a separate manuscript.

      The studies in Figure 4 were all performed on iCre-Glsflox/flox and control mice at p14, why weren't the IND-cKO mice used for these studies since the findings would not be confounded by development?

      To gain further insight into the role of GLS-driven Gln catabolism in the maintenance of rod photoreceptors as compared to their development/maturation, we conducted a targeted metabolomic analysis on IND-cKO and WT retinas 10 days post TAM. For the purpose of this manuscript, we have included data regarding changes in amino acid levels in Figure 6 – figure supplement 1. Specifically, levels of glutamate, aspartate and asparagine are all significantly decreased in the IND-cKO retina prior to PR degeneration, which demonstrates that similar to the GLS cKO mouse (i.e. iCre-Gls flox/flox), GLS-driven Gln catabolism is critical for amino acid biosynthesis in mature rod PRs as well.

      In all rescue studies, the endpoint was an ONL thickness, which only addressed rod cell death. The authors should also determine whether there are small improvements in the ERG, which would distinguish the role of GLS in preventing oxidative stress.

      Optical coherence tomography (OCT) provides a sensitive in vivo method to detect small changes in retinal thickness without potential artifacts incurred through histological processing. Considering the Gls cKO retina demonstrates significant and rapid photoreceptor degeneration, we wanted to assess pathways that may be critical to photoreceptor survival downstream of GLS-driven Gln catabolism using rescue experiments with pharmacologic treatment or metabolite supplementation. That said, disruption of GLS-driven Gln catabolism may also significantly alter rod photoreceptor function beyond that which is secondary to photoreceptor cell death as we have demonstrated in the IND-cKO animal for the revised version of this manuscript and discussed in a response above. Therefore, the IND-cKO model provides a unique tool to assess the impact of rescue studies on photoreceptor function as the functional changes occur prior to significant degeneration. Also, unlike the GLS cKO mouse (i.e. iCre-Gls flox/flox) where photoreceptor degeneration starts very early, impairing our ability to capture reliable and robust ERG measurements, the IND-cKO mice are older at the time of functional changes allowing for robust ERG measurements. While the rate of photoreceptor degeneration in both mouse models is similar and the levels of key amino acids are altered similarly in both models, the mechanisms of cell death in developing/maturing photoreceptors may be different than that in mature photoreceptors. Hence, before we can assess if similar rescue experiments impact photoreceptor function via ERG in the IND-cKO mouse, we need to thoroughly examine how these photoreceptors are dying. These experiments and results will be published in a separate manuscript in the future.

      Reviewer #2 (Public Review):

      Summary:

      Photoreceptor neurons are crucial for vision, and discovering pathways necessary for photoreceptor health and survival can open new avenues for therapeutics. Studies have shown that metabolic dysfunction can cause photoreceptor degeneration and vision loss, but the metabolic pathways maintaining photoreceptor health are not well understood. This is a fundamental study that shows that glutamine catabolism is critical for photoreceptor cell health using in vivo model systems.

      Strengths:

      The data are compelling, and the consideration of potential confounding factors (such as glutaminase 2 expression) and additional experiments to examine the synaptic connectivity and inner retina added strength to this work. The authors were also careful not to overstate their claims, but to provide solid conclusions that fit the results and data provided in their study. The findings linking asparagine supplementation and the inhibition of the integrated stress response to glutamine catabolism within the rod photoreceptor cell are intriguing and innovative. Overall, the authors provide convincing data to highlight that photoreceptors utilize various fuel sources to meet their metabolic needs, and that glutamine is critical to these cells for their biomass, redox balance, function, and survival.

      We greatly appreciate the reviewer’s thoughtful comments and time spent reviewing this manuscript.

      Weaknesses:

      Recent studies have explored the metabolic "crosstalk" that exists within the mammalian retina, where metabolites are transferred between the various retinal cells and the retinal pigment epithelium. It would be of interest to test whether the conditional knockout mice have changes in metabolism (via qPCR such as shown in Figure 4 - Supplemental Figure 1) within the retinal pigment epithelium that may be contributing to the authors' findings in the neural retina. Additionally, the authors have very compelling data to show that inhibition of eIF2a or supplementation with asparagine can delay photoreceptor death via OCT measurements in their conditional knockout mouse model (Figure 6G, H). However, does inhibition of eIF2a or asparagine adversely impact the WT retina? It would also be impactful to know whether this has a prolonged effect, or if it is short-term, as this would provide strength to potential therapeutic targeting of these pathways to maintain photoreceptor health.

      We agree with the reviewer that metabolic communication in the outer retina is crucial to the function and survival of both photoreceptors and RPE. Therefore, we have performed qRT-PCR on eyecups from cKO and WT mice at P14, prior to photoreceptor degeneration. These data, now included in Figure 4 – figure supplement 2, show no significant changes in genes related to glycolysis, pyruvate metabolism and the TCA cycle in eyecups from cKO mice compared to WT mice at P14. The only exception is a significant decrease in Pdk4 in cKO mouse eyecups compared to WT, which was not observed in retina samples.

      Additionally, we have added data demonstrating that systemic treatment with ISRIB does not adversely impact the anatomy of the wild-type retina. Specifically, we performed OCT after 21 days of ISRIB treatment via intraperitoneal delivery in WT mice and show that total retinal, ONL and inner segment/outer segment thickness is unchanged compared to vehicle. These data are now included in Figure 6 – figure supplement 2A. We have also included data to suggest that the effect of ISRIB extends beyond P21 in the cKO mouse. These data, presented in Figure 6 – figure supplement 2B, show that at P28, ISRIB-treated cKO animals still have a statistically significantly greater ONL thickness than vehicle-treated cKO animals.

      Reviewer #3 (Public Review):

      Summary:

      The authors explored the role of GLS, a glutaminase, which is an enzyme that catalyzes the conversion of glutamine to glutamate, in rod photoreceptor function and survival. The loss of GLS was found to cause rapid autonomous death of rod photoreceptors.

      Strengths:

      Interesting and novel phenotype. Two types of cre-lines were rigorously used to knockout the Gls gene in rods. Both of the conditional knockouts led to a similar phenotype, i.e. rod death. Histology and ERG were carefully done to characterize the loss of rods over specific ages. A necessary metabolomic study was performed and appreciated. Some rescue experiments were performed and revealed possible mechanisms.

      We thank the reviewer for their comments and appreciation of the methods utilized herein to address the role of GLS-driven Gln catabolism in rod photoreceptors.

      Weaknesses:

      No major weaknesses were identified. The mechanism of GLS-loss-induced rod death seems not fully elucidated by this study but could be followed up in the future, and the same for GLS's role in cones.

      We agree with the reviewer that the downstream metabolic and molecular mechanisms by which Gln catabolism impacts rod photoreceptor health are not fully elucidated. Defining these mechanisms will advance our understanding of photoreceptor metabolism and identify therapeutic targets promoting photoreceptor resistance to stress. Future studies are underway to uncover these mechanisms. Additionally, while outside the scope of the current manuscript, we have generated mice lacking GLS in cone photoreceptors specifically and are currently elucidating the role of GLS in cone photoreceptor metabolism, function, and survival. These results will be published in a separate manuscript.

      Reviewer #1 (Recommendations For The Authors):

      (1) The results could start at line 135, but the first paragraph isn't necessary. The data is published and could be referred to in the introduction.

      We appreciate the reviewer’s suggestion to shorten the beginning of the Results section; however, we believe the supplementary data described in these lines confirm the scRNAseq gene expression data while adding GLS expression and localization data within the retina. The scRNAseq data and their publication are noted in the Introduction, so we removed the sentence in lines 117-119 that restates these results to shorten this section. We also reduced redundancy by removing an introductory sentence to the second Results paragraph.

      (2) "However, like other metabolically-demanding cells, recent work has demonstrated that PRs have the flexibility to utilize fuel sources beyond glucose to meet their metabolic needs (Adler et al., 2014; Du, Cleghorn, Contreras, Linton, et al., 2013; Grenell et al., 2019; Joyal et al., 2016; Xu et al., 2020)." The paper by Daniele et al. demonstrated that glucose is essential for maintaining the viability of rod photoreceptor cells.

      We thank the reviewer for highlighting this published literature, which we regrettably overlooked. The reference for Daniele et al. has now been included.

      (3) "Single-cell RNA sequencing data has demonstrated that Gls is expressed throughout the human and mouse retina and much greater than Gls2 (Voigt et al., 2020)." The authors should indicate the specific databases searched in Spectacle.

      We appreciate the reviewer’s attention to detail and have now included the references in the Introduction for GSE63473 from Macosko et al. and GSE142449 from Voigt et al., which were the databases we used in Spectacle to assess Gls levels in the mouse and human retina, respectively.

      References:

      (1) Macosko EZ, Basu A, Satija R, Nemesh J, Shekhar K, Goldman M, Tirosh I, Bialas AR, Kamitaki N, Martersteck EM, Trombetta JJ, Weitz DA, Sanes JR, Shalek AK, Regev A, McCarroll SA. Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets. Cell. 2015 May 21;161(5):1202-1214. doi: 10.1016/j.cell.2015.05.002. PMID: 26000488; PMCID: PMC4481139.

      (2) Voigt AP, Binkley E, Flamme-Wiese MJ, Zeng S, DeLuca AP, Scheetz TE, Tucker BA, Mullins RF, Stone EM. Single-Cell RNA Sequencing in Human Retinal Degeneration Reveals Distinct Glial Cell Populations. Cells. 2020 Feb 13;9(2):438. doi: 10.3390/cells9020438. PMID: 32069977; PMCID: PMC7072666.

      (4) The immunolabeling in Figure 2 looks like the images are overexposed, and the Gls antibody is labeling the outer segment, not just the inner segment of photoreceptors.

      We thank the reviewer for their comments regarding our immunofluorescence data. There was background staining of the outer segment in both the WT and cKO retina, with decreased GLS staining in the inner segment of the cKO rod photoreceptors at P14 demonstrating loss of GLS in rod photoreceptors, similar to Figure 1B. For Figure 2E, we have provided adjusted images with PNA staining only that better represent the secondary cone degeneration that occurs in the rod photoreceptor-specific Gls cKO, which is the take-home point of Figure 2E.

      (5) The authors could use a glutamate antibody to compare it to Gls KO mice as done in Davanger, S., Ottersen, O.P. and Storm-Mathisen, J. (1991), Glutamate, GABA, and glycine in the human retina: An immunocytochemical investigation. J. Comp. Neurol., 311: 483-494. https://doi.org/10.1002/cne.903110404

      We appreciate the reviewer’s suggestion to assess glutamate levels in the wild-type and Gls KO retina via antibody labeling. Our targeted metabolomics studies in Figure 6A provide quantitative evidence that glutamate, the product of the GLS-catalyzed reaction, is decreased as one would expect in the Gls KO retina. The antibody would add to these data by providing the localization of glutamate in the retina. With a rod photoreceptor-specific genetic KO, we would expect glutamate levels to be decreased in these cells. The antibody may also show that glutamate is not only decreased in the rod photoreceptor inner segment, where GLS predominates, but also in the synaptic terminal in accordance with the reviewer’s concerns regarding the impact of GLS KO on synaptic transmission. We have addressed this concern at length above, adding TEM images of the ribbon synapses in the GLS KO retina, and ERG analyses from the IND-cKO animals prior to significant degeneration. In the end, we agree with the reviewer that reduced Glu levels in the GLS cKO retina may impact synaptic transmission to a degree, but the synapses remain intact based on immunofluorescence and TEM analyses and a negative ERG pattern is not observed in the GLS cKO (i.e. iCre-Gls flox/flox) or IND-cKO mouse. As noted above, the structure of the retina in models that disrupt photoreceptor synaptic transmission is maintained (Dick et al. Neuron. 2003) or noted to have modest changes within the first two months of age with the outer nuclear layer not significantly altered until 8-10 months of age (Haeseleer et al. Nature Neuroscience. 2004). So, the impact of the reduced Glu levels on synaptic transmission in the GLS KO retina is unlikely to account in full for the rapid and profound photoreceptor degeneration observed. That said, the IND-cKO mouse, which allows us to assess photoreceptor function prior to significant degeneration unlike the GLS cKO mouse (i.e. iCre-Gls flox/flox), demonstrates GLS-driven Gln catabolism plays a significant role in photoreceptor function but still does not demonstrate a negative ERG pattern. Therefore, assessing Glu localization in this mouse model 10 days post TAM will be informative as to how GLS-driven Gln catabolism impacts photoreceptor function prior to degeneration. The IND-cKO mouse model is currently being extensively characterized for future publication.

      Reviewer #2 (Recommendations For The Authors):

      Main Concerns:

      (1) The authors checked for Gls2 compensation at P14 in the mouse retina. However, this data would be more compelling with an additional timepoint, particularly at P21 which is used in many of their figures throughout the study.

      We thank the reviewer for their suggestion. Figure 1-figure supplement 1D demonstrates no change in Gls2 gene expression at P14 between the WT and cKO retina. With regards to the reviewer’s concern, in Figure 1-figure supplement 1E of the original submission, we demonstrate that the expression of GLS2 is not increased in the cKO retina at P21 via immunofluorescence.

      (2) Recent studies have explored the metabolic "crosstalk" that exists within the mammalian retina, where metabolites are transferred between the various retinal cells and the retinal pigment epithelium. It would be compelling to see whether the cKO mice have changes in metabolism (via qPCR such as shown in Supplementary Figure 1 for Figure 4) within the RPE that may be contributing to their findings in the neural retina. Additionally, mention of this crosstalk and how it may impact their results should be added to the discussion.

      We appreciate the reviewer’s concern for metabolism changes in the RPE of Gls cKO mice. In agreement with reviewer 2, we performed qRT-PCR on eyecups from cKO and WT mice at P14, prior to photoreceptor degeneration. These data, now included in Figure 4 – figure supplement 2, show no significant changes in genes related to glycolysis, pyruvate metabolism and the TCA cycle in eyecups from cKO mice compared to WT mice at P14. The only exception is a significant decrease in Pdk4 in cKO mouse eyecups compared to WT, which was not observed in retina samples.

      (3) The authors use a tamoxifen-inducible cKO model to support their findings in developed rods. However, in Figure 3A it appears that this model has a greater reduction in GLS compared to the Rho-cre mouse model. Can the authors discuss this? Is this cre more efficient at targeting rods or is it leaky and may have affected other retinal cells?

      We thank the reviewer for pointing out this interesting result associated with using the Pde6g-Cre-ERT2 mouse line. Pde6g is expressed by rods to a significant degree but also by cones (GSE63473, scRNAseq data). Therefore, the IND-cKO mouse likely knocks out GLS from both rods and cones upon TAM induction. To this end, the immunofluorescence image in Figure 3B shows GLS is knocked out in both rod and cone inner segments, unlike in Figure 1B where GLS remains in cones when using the rod photoreceptor-specific, Gls<sup>fl/fl</sup> Rho-Cre<sup>+</sup> mouse. As such, as the astute reviewer noted, the greater reduction in GLS protein content seen by Western blot is consistent with the protein being knocked out of both rods and cones. We have added this note about the mouse model in the corresponding text.

      (4) The authors have very compelling data to show that inhibition of eIF2a can delay photoreceptor death via OCT measurements in their cKO mouse model (Figure 6G). However, does ISRIB adversely impact the WT retina? WT vehicle and ISRIB should be shown. It would also be compelling to know whether this has a prolonged effect, or if it is short-term (i.e. would the effect still be present at P42)?

      We appreciate the reviewer’s comments regarding antagonizing the effects of p-eIF2a to prolong photoreceptor survival in the Gls cKO retina. As described above, we have data demonstrating systemic treatment with ISRIB does not adversely impact the anatomy of the wild-type retina (Figure 6-figure supplement 2A). Specifically, we treated WT animals with daily intraperitoneal ISRIB starting at P5 and performed OCT at P21 to show that total retinal, ONL, and inner segment/outer segment thicknesses are unchanged compared to vehicle-treated WT animals. Additionally, we have included data demonstrating that the photoreceptor neuroprotective effect of ISRIB treatment in the Gls cKO mouse extends beyond P21 (Figure 6-figure supplement 2B).

      (5) For Figure 6H, same as point #4.

      While we have not specifically assessed potential retinal toxicity secondary to systemic Asn supplementation, oral Asn supplementation (up to 100mg/kg/day) was provided to patients for 24 months and found to be well-tolerated (PMID:31123592). Allometric scaling of this dose to the mouse would yield a mouse dose of 1234 mg/kg/day, which is much greater than the 200mg/kg/day dose provided here (PMID: 27057123). Additionally, a 90-day toxicity study of Asn in rats demonstrated a no observed adverse effect level of 1.62g/kg bodyweight/day in males and 1.73g/kg bodyweight/day in females (PMID: 18508175). The lower dose in that study equates to a mouse dose of 3.2g/kg bodyweight/day, well above the mouse dose utilized in this report. As such, future studies should focus on a dose-response relationship with Asn supplementation, and as the reviewer suggested, determining the duration of effect with Asn supplementation.
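      As a rough illustration of the dose-scaling arithmetic in the paragraph above, the sketch below applies body-surface-area conversion between species. It assumes the commonly tabulated Km factors (human 37, rat 6, mouse 3); the function and constant names are illustrative and are not taken from the cited references.

```python
# Body-surface-area (allometric) dose conversion between species.
# ASSUMPTION: Km factors are the commonly tabulated values
# (human 37, rat 6, mouse 3); all names here are illustrative only.
KM = {"human": 37.0, "rat": 6.0, "mouse": 3.0}


def convert_dose(dose_mg_per_kg, source, target):
    """Scale a mg/kg dose from the `source` species to the `target` species."""
    return dose_mg_per_kg * KM[source] / KM[target]


# Doses quoted in the response: 100 mg/kg/day in patients,
# and a 1.62 g/kg/day NOAEL in male rats.
human_to_mouse = convert_dose(100.0, "human", "mouse")
rat_to_mouse = convert_dose(1620.0, "rat", "mouse")

print(f"human 100 mg/kg/day -> mouse {human_to_mouse:.0f} mg/kg/day")
print(f"rat 1620 mg/kg/day  -> mouse {rat_to_mouse:.0f} mg/kg/day")
```

      Under these assumed factors, both converted values fall close to the figures quoted above and remain well above the 200 mg/kg/day mouse dose used in the study.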

      (6) Some of the results section belongs in the introduction or discussion and can be moved.

      We have addressed the reviewer’s concern by moving some of the results to the discussion and removing statements in the results that were either noted in the Introduction or conferred in the Discussion.

      Minor Concerns:

      (1) Scale bar mentions in the figure legends use plural when only one is present, or in some cases are missing. A scale bar should be added to the OCT images if possible.

      We appreciate the reviewer’s attention to detail, and information regarding scale bars has been updated in the figure legends.

      (2) For Figures 1I and J, the sample size changes when J is a quantification of I. Please correct.

      We have corrected the sample size to be consistent between Figures 1I and J.

      (3) In Figure 1 - Figure Supplement 3 the P42 timepoint is not mentioned in the legend. Please correct.

      We have now included the P42 timepoint in the legend for Figure 1 – Figure Supplement 3 as well as the manuscript text.

      (4) In Figure 1 - Figure Supplement 5 the wrong P value is mentioned in the legend. Please correct.

      We have corrected the P value in the legend for Figure 1 – Figure Supplement 5.

      (5) Can the authors double-check their ERG light intensity settings? They seem high. Please confirm if they are correct.

      We appreciate the reviewer’s concern for ERG light intensity settings and have confirmed the settings used in the study were 32 cd*s/m<sup>2</sup> and 100 cd*s/m<sup>2</sup> for scotopic and photopic ERG recordings, respectively.

      (6) The legend key in Figure 2A would be more helpful if the axis were present by the representative traces.

      We thank the reviewer for the suggestion of adding axes to the ERG traces. Figure 2A has been updated to reflect this modification.

      (7) Can the authors check that the error bars are present in Figure 5E?

      We appreciate the reviewer’s concern for error bars in Figure 5E, which are included in the figure. The standard error in this experiment is so small that the symbols overlap with the error bars.

      Reviewer #3 (Recommendations For The Authors):

      Suggestions for improved or additional experiments, data, or analyses.

      (1) Figure 6: ISRIB seems to give the most dramatic rescue of cKO GLS in P21 rods. Does it completely prevent rod death? i.e. What's the ONL thickness of P21 WT control? What's the ISRIB rescue of an older cKO animal, say P35?

      The ONL thickness of P21 WT control is on average 0.06 mm (Figure 1E), while the ONL thickness of the Gls cKO retina with ISRIB treatment at P21 is on average 0.044 mm. Therefore, rod death is not completely prevented with ISRIB but rather, rod photoreceptor survival is prolonged. As noted above, we have provided data to demonstrate that the photoreceptor neuroprotective effect of ISRIB lasts beyond P21 (Figure 6-figure supplement 2B).

      (2) What's the mechanistic link between ISR and GLS beyond current speculation? Does GLS have other unknown functions beyond converting glutamine to glutamate? Any novel insights from GLS protein structure?

      We thank the reviewer for this thoughtful question. It is certainly possible that GLS has other functions outside of its role in glutaminolysis. It is well known that other metabolic enzymes have moonlighting functions including hexokinase 2, which has been shown to be important in preventing intrinsic apoptosis through blocking the binding of pro-apoptotic proteins to the mitochondria. While not directly related to ISR, a single report suggests GLS functions non-canonically in Gln-deprived states, promoting mitochondrial fusion to suppress ROS production (PMID: 29934617). Investigating the moonlighting functions of metabolic enzymes is part of our ongoing research program and GLS is included in these studies.

      (3) Just curious about GLS cKO in cones. Any similar phenotype?

      We appreciate the reviewer’s curiosity regarding Gls cKO in cones and this study is currently ongoing with a poster presented at ARVO 2024 (Subramanya et al; Glutaminase-driven glutamine catabolism supports cone photoreceptor metabolism, function, and structure. Invest. Ophthalmol. Vis. Sci. 2024;65(7):193) and a manuscript in preparation. As discussed above, GLS knock out in cones likely impacts their function, in accordance with the data presented at ARVO 2024.

      Recommendations for improving the writing and presentation.

      (1) In the Discussion, lines 458-466, it's incorrect to compare the importance of glucose metabolism to GLS-dependent pathway to photoreceptors in this way. An alternative explanation: glucose metabolism is so important that the system has many redundancies, e.g. HK1 exists in addition to HK2, thus single gene KO leads to no phenotype. The only fair comparison is nutrient deprivation, e.g. taking out glucose or glutamine from retina explants (Punzo et al., 2009).

      The reviewer makes an excellent point. While we do not see an upregulation of GLS2 in the retina or rod PRs upon GLS knockout (Figure 1-figure supplement 1 D and E), loss of Gls in rod PRs does alter the expression of many metabolism-related genes (Figure 4-figure supplement 1).  We alluded to these data and the reviewer’s point in the second paragraph of the discussion: “In any of these transgenic mouse models, PRs may use other transporters to take up fatty acids or glucose or rewire their metabolism to maintain metabolic homeostasis and stave off degeneration (Subramanya et al., 2023; Wubben et al., 2017). Our data show that any metabolic reprogramming that is occurring in the cKO mouse retina appears unable to significantly circumvent the significant and rapid PR degeneration suggesting the importance of Gln catabolism in rod PRs. Furthermore, inducing GLS knockdown in mature PRs also demonstrated rapid PR degeneration (Figure 3).”

      In the revised article, we have amended these sentences to include the importance of metabolic redundancies. “In any of these transgenic mouse models, PRs may use other transporters to take up fatty acids or glucose, rewire their metabolism, or utilize metabolic redundancies to maintain metabolic homeostasis and stave off degeneration (Subramanya et al., 2023; Wubben et al., 2017). Our data show that any metabolic reprogramming that is occurring in the cKO mouse retina appears unable to significantly circumvent the significant and rapid PR degeneration suggesting the importance of Gln catabolism in rod PRs. Furthermore, inducing GLS knockdown in mature PRs also demonstrated rapid PR degeneration (Figure 3).”

      (2) Please discuss the mosaic activity of Rho-cre used in this study, as described in the original study (Le et al 2006). Line 221 (Li et al 2005) seems to be a different Rho-Cre created by a different group. Please make sure the citation is correct and consistent.

      We apologize for the confusion and have corrected the reference on line 221 to Le et al, 2006. The reviewer is correct that the original report (Le et al. 2006) demonstrated a mosaic of Cre-mediated recombination in rod photoreceptors and rod bipolar cells in the mouse line that had the shorter (0.2 kb) mouse opsin promoter-controlled Cre. In contrast, this same report showed only Cre-mediated recombination in rod photoreceptors in another line that utilized a long (4.1 kb) mouse opsin promoter-controlled Cre. We have published using this latter promoter-controlled Cre recombinase in at least 5 different mouse models (Wubben et al. 2017; Weh et al. 2020; Weh et al. 2023; Subramanya et al. 2023; the current report), and in all these models, we observe clear and consistent knockout by immunofluorescence only in rod photoreceptors with residual protein in cones and no significant change in protein expression in the INL where bipolar cells reside. Western blots confirm the reduction in protein expression.

      (3) The authors should provide representative images of retina cross-sections for key rescue data (Figure 6G&H).

      As requested by Reviewer 3, representative histology images of retina cross-sections for the ISRIB and Asn rescue experiments in Gls cKO mice at P21 are now included in the manuscript in Figure 6 – figure supplement 3.

      Minor corrections to the text and figures.

      (1) Spell out Gln in the Abstract when used for the first time.

      We have included glutamine (Gln) in the abstract upon first use.

      (2) Line 433, Figure 6G should be 6H.

      Thank you for the correction, the manuscript has been updated.

    1. eLife Assessment

      This fundamental study provides a critical challenge to a great many studies of the neural correlates of consciousness that were based on post hoc sorting of reported awareness experience. The evidence supporting this criticism is compelling, based on simulations and decoding analysis of EEG data. The results will be of interest not only to psychologists and neuroscientists but also to philosophers who work on addressing mind-body relationships.

    2. Reviewer #2 (Public review):

      Summary:

      The study investigates the potential influence of the response criterion on neural decoding accuracy for conscious and unconscious processing, using either simulated data or a reanalysis of experimental data with post hoc sorting.

      Strengths:

      When comparing the neural decoding performance of Target versus NonTarget with or without post-hoc sorting based on subject reports, it is evident that response criterion can influence the results. This was observed in simulated data as well as in two experiments that manipulated the subject response criterion to be either more liberal or more conservative. One experiment involved a two-level response (seen vs unseen), while the other included a more detailed four-level response (ranging from 0 for no experience to 3 for a clear experience). The findings consistently indicated that adopting a more conservative response criterion could enhance neural decoding performance, whether in conscious or unconscious states, depending on the sensitivity or overall response threshold.

      The uneven distribution of trials for Target (75%) and NonTarget (25%) was identified as a potential weakness in the initial review of this study. Nevertheless, we support the authors' assertion that their analysis methodology validates comparing liberal and conservative approaches. Future investigations could further explore differences between liberal and conservative criteria at different ratios of Target vs NonTarget, particularly when the proportion of Target matches or falls below that of NonTarget.

    3. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #1 (Public review):

      Summary:

      The study aimed to investigate the significant impact of criterion placement on the validity of neural measures of consciousness, examining how different standards for classifying a stimulus as 'seen' or 'unseen' can influence the interpretation of neural data. They conducted simulations and EEG experiments to demonstrate that the Perceptual Awareness Scale, a widely used tool in consciousness research, may not effectively mitigate criterion-related confounds, suggesting that even with the PAS, neural measures can be compromised by how criteria are set. Their study challenged existing paradigms by showing that the construct validity of neural measures of conscious and unconscious processing is threatened by criterion placement, and they provided practical recommendations for improving experimental designs in the field. The authors' work contributes to a deeper understanding of the nature of conscious and unconscious processing and addresses methodological concerns by exploring the pervasive influence of criterion placement on neural measures of consciousness and discussing alternative paradigms that might offer solutions to the criterion problem.

      The study effectively demonstrates that the placement of criteria for determining whether a stimulus is 'seen' or 'unseen' significantly impacts the validity of neural measures of consciousness. The authors found that conservative criteria tend to inflate effect sizes, while liberal criteria reduce them, leading to potentially misleading conclusions about conscious and unconscious processing. The authors employed robust simulations and EEG experiments to demonstrate the effects of criterion placement, ensuring that the findings are well-supported by empirical evidence. The results from both experiments confirm the predicted confounding effects of criterion placement on neural measures of unconscious and conscious processing.

      The results are consistent with their hypotheses and contribute meaningfully to the field of consciousness research.

      We would like to thank reviewer 1 for their positive words and for taking the time to evaluate our manuscript.

      Reviewer #2 (Public review):

      Summary:

      The study investigates the potential influence of the response criterion on neural decoding accuracy for conscious and unconscious processing, using either simulated data or a reanalysis of experimental data with post hoc sorting.

      Strengths:

      When comparing the neural decoding performance of Target versus NonTarget with or without post-hoc sorting based on subject reports, it is evident that response criterion can influence the results. This was observed in simulated data as well as in two experiments that manipulated subject response criterion to be either more liberal or more conservative. One experiment involved a two-level response (seen vs unseen), while the other included a more detailed four-level response (ranging from 0 for no experience to 3 for a clear experience). The findings consistently indicated that adopting a more conservative response criterion could enhance neural decoding performance, whether in conscious or unconscious states, depending on the sensitivity or overall response threshold.

      Weaknesses:

      (1) In the realm of research methodology, conducting post-hoc sorting based on subject reports raises an issue. This operation leads to an imbalance in the number of trials between the two conditions (Target and NonTarget) during the decoding process. Such trial number disparity introduces bias during decoding, likely contributing to fluctuations in neural decoding performance. This potential confounding factor significantly impacts the interpretation of research findings. The trial number imbalance may cause models to exhibit a bias towards the category with more trials during the learning process, leading to misjudgments of neural signal differences between the two conditions and failing to accurately reflect the distinctions in brain neural activity between target and non-target states. Therefore, it is recommended that the authors extensively discuss this confounding factor in their paper. They should analyze in detail how this factor could influence the interpretation of results, such as potentially exaggerating or diminishing certain effects, and whether measures are necessary to correct the bias induced by this imbalance to ensure the reliability and validity of the research conclusions.

      We would like to thank reviewer 2 for their positive words and for taking the time to evaluate our manuscript. In response to this asserted weakness, we would like to point out that the issue of trial imbalances was already comprehensively addressed in the manuscript. No trial imbalances are present in the analyzed data for any of the conditions, so that none of our reported results could have been impacted by this. This was done through the following set of measures:

      (1) Training data (method section): “a linear discriminant analytic (LDA) classifier was trained for each participant using all trials from all sessions (3 sessions in Experiment 1, 2 sessions in Experiment 2) to discriminate target from no-target trials based on EEG data, irrespective of seen/unseen responses and irrespective of the response criterion. To maximize signal-to-noise ratio, we applied a leave-one-person-out cross validated decoding scheme by using all classifiers from all participants except the participant that was being tested (separately for Experiment 1 and for Experiment 2). This leave-one-person-out cross validation procedure maximized the available data for training without requiring k-folding on subsets of cells with low response counts, so that all test sets were classified by the same fully independent classifiers. A single time series of classification performance across time was obtained for every participant (every testing set) by averaging classification performance across all classifiers that tested that set (see Methods and supplementary Figure S2 for details).”

      This leave-one-person-out cross validation scheme made sure that no trial selection needed to be performed to analyze conservative or liberal conditions. Both conditions were classified using the same classifier, consisting of all data from the other participants.

      (2) Testing data (methods section): “To ensure that differences resulting from post hoc sorting could not be explained by differences in signal-to-noise ratio resulting from disparities in trial counts in the testing set, we equated trial counts between the liberal and conservative condition within each participant by randomly selecting the same number of trials from overrepresented cells (for Experiment 1, this was done at the level of ‘seen’ and ‘unseen’ responses, for Experiment 2 the trial counts were equated at each of the PAS levels, see methods for details). As a result, response-contingent conditions in the liberal and conservative conditions had identical input for all classification analyses. Although different trial counts in the testing set might affect the precision with which AUC is estimated in a decoding analysis, it does not affect the size of AUC itself. Trial count equation was merely performed to make sure the liberal and conservative condition were as comparable as possible.”

      Indeed, we also report at the end of this section that running the same analyses without selecting trials in the test set yielded qualitatively identical results: “Analyzing the data without equating trial counts resulted in qualitatively identical results.”
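      The trial-count equating described in point (2) can be sketched as below. This is a minimal illustration and not the authors' analysis code: the function name, the dictionary layout, and the example trial counts are all hypothetical, and each response cell (‘seen’/‘unseen’, or a PAS level) is simply subsampled at random down to the smaller of the two conditions.

```python
import random


def equate_trial_counts(cells_a, cells_b, seed=0):
    """Subsample trials so two conditions have equal counts in every response cell.

    cells_a / cells_b map a response label (e.g. 'seen', 'unseen', or a PAS
    level) to the list of trial indices given that response in each condition.
    HYPOTHETICAL helper for illustration, not taken from the manuscript.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    out_a, out_b = {}, {}
    for label in sorted(set(cells_a) & set(cells_b)):
        # Keep as many trials as the smaller cell has, drawn at random.
        n = min(len(cells_a[label]), len(cells_b[label]))
        out_a[label] = sorted(rng.sample(cells_a[label], n))
        out_b[label] = sorted(rng.sample(cells_b[label], n))
    return out_a, out_b


# Made-up trial indices: the liberal condition yields more 'seen' responses,
# the conservative condition more 'unseen' responses.
liberal = {"seen": list(range(0, 70)), "unseen": list(range(70, 100))}
conservative = {"seen": list(range(0, 40)), "unseen": list(range(40, 100))}

lib_eq, con_eq = equate_trial_counts(liberal, conservative)
print({label: len(trials) for label, trials in lib_eq.items()})
print({label: len(trials) for label, trials in con_eq.items()})
```

      After equating, every response cell contributes the same number of test trials in both criterion conditions, which is the property the quoted passage relies on.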

      To remove any lack of clarity about this, we now also briefly report in the beginning of the discussion section that the results cannot be explained by unequal trial counts:

      “We found that in both experiments, criterion shifts modulated effect size in neural measures of ‘unconscious’ (unseen) and/or ‘conscious’ (seen) processing, and that this happens even though the conservative and liberal condition used the same independent training data (identical classifiers), and even though the trial counts in the test sets were equated for the conservative and liberal condition.”

      Reviewer #3 (Public review):

      Summary:

      Fahrenfort et al. investigate how liberal or conservative criterion placement in a detection task affects the construct validity of neural measures of unconscious cognition and conscious processing. Participants identified instances of "seen" or "unseen" in a detection task, a method known as post hoc sorting. Simulation data convincingly demonstrate that, counterintuitively, a conservative criterion inflates effect sizes of neural measures compared to a liberal criterion. While the impact of criterion shifts on effect size is suggested by signal detection theory, this study is the first to address this explicitly within the consciousness literature. Decoding analysis of data from two EEG experiments further shows that different criteria lead to differential effects on classifier performance in post hoc sorting. The findings underscore the pervasive influence of experimental design and participant reports on neural measures of consciousness, revealing that criterion placement poses a critical challenge for researchers.
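      The counterintuitive effect of criterion placement on ‘unseen’ trials can be illustrated with a small signal detection calculation. The sketch below is not the authors' simulation: it assumes an equal-variance Gaussian model, a neural measure proportional to the internal signal, and arbitrary values for sensitivity and for the two criteria. It covers only trials sorted as ‘unseen’ (internal signal below the criterion).

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for pdf/cdf lookups


def mean_below_criterion(mu, criterion):
    """Mean of a N(mu, 1) internal signal on trials that fall below the criterion."""
    z = criterion - mu
    return mu - STD_NORMAL.pdf(z) / STD_NORMAL.cdf(z)


def unseen_effect(d_prime, criterion):
    """Target minus no-target mean signal among trials sorted as 'unseen'."""
    return mean_below_criterion(d_prime, criterion) - mean_below_criterion(0.0, criterion)


# ASSUMED values: sensitivity d' = 1, liberal criterion at 0, conservative at 1.5.
liberal = unseen_effect(d_prime=1.0, criterion=0.0)
conservative = unseen_effect(d_prime=1.0, criterion=1.5)

print(f"unseen effect, liberal criterion:      {liberal:.3f}")
print(f"unseen effect, conservative criterion: {conservative:.3f}")
```

      Under these assumptions, the target versus no-target difference among ‘unseen’ trials is larger with the conservative criterion, because more trials carrying a strong internal signal are reported as unseen.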

      Strengths and Weaknesses

      One of the strengths of this study is the inclusion of the Perceptual Awareness Scale (PAS), which allows participants to provide more nuanced responses regarding their perceptual experiences. This approach ensures that responses at the lowest awareness level (selection 0) are made only when trials are genuinely unseen. This methodological choice is important as it helps prevent the overestimation of unconscious processing, enhancing the validity of the findings.

      The authors also do a commendable job in the discussion by addressing alternative paradigms, such as wagering paradigms, as a possible remedy to the criterion problem (Peters & Lau, 2015; Dienes & Seth, 2010). Their consideration of these alternatives provides a balanced view and strengthens the overall discussion.

      Our initial review identified a lack of measures of variance as one potential weakness of this work. However, we agree with the authors' response that plotting individual datapoints for each condition is indeed a good visualization of variance within a dataset.

      Impact of the Work:

      This study effectively demonstrates a phenomenon that, while understood within the context of signal detection theory, has been largely unexplored within the consciousness literature. Subjective measures may not reliably capture the construct they aim to measure due to criterion confounds. Future research on neural measures of consciousness should account for this issue, and no-report measures may be necessary until the criterion problem is resolved.

      We thank reviewer 3 for their positive words and for taking the time to evaluate our manuscript.

    1. eLife Assessment

      This is a valuable study describing how rhabdomyosarcoma fusion-oncogenes, VGLL2-NCOA2 and TEAD1-NCOA2, function at the genomic, transcriptional, and proteomic levels in multiple systems. The experimental data is convincing, supporting a model in which these fusion-oncogenes leverage TEAD transcriptional signatures independent of YAP/TAZ. This work offers new mechanistic insights into oncogenic gene fusion events and reveals potential therapeutic strategies for the treatment of rhabdomyosarcomas.

    2. Reviewer #1 (Public review):

      Guo, Hue et al. focus on understanding the epigenetic activity and functional dependencies for two different fusions found in spindle cell rhabdomyosarcoma, VGLL2::NCOA2 and TEAD1::NCOA2. They use a variety of models and methods; specifically, ectopic expression of the fusions in human 293T cells to perform RNAseq (both fusions), CUT&RUN (VGLL2::NCOA2) and BioID mass spec (both fusions). These data identify that the VGLL2::NCOA2 fusion has peaks that are enriched for TEAD motifs. Further, CBP/p300 CUT&RUN data support an enrichment of binding sites and three TEAD targets in VGLL2::NCOA2 and TEAD1::NCOA2 expressing cells. They also functionally evaluate genetic and chemical dependencies (TEAD inhibition), and find that this was only effective for the VGLL2::NCOA2 fusion, and not for TEAD1::NCOA2. Using complementary biochemical approaches, they suggest (with other supporting data) that the fusions regulate TEAD transcriptional outputs via a YAP/TAZ independent mechanism. Further, they expand into a C2C12 myoblast model and show that TEAD1::NCOA2 is transforming in colony formation assays and in mouse allograft. These strategies for TEAD1-NCOA2 are consistent with previously published strategies using VGLL2::NCOA2. Importantly, they show that a small molecule inhibitor of CBP/p300 (a binding partner found in their BioID mass spec) suppresses tumor formation using this mouse allograft model, and that the tumors are less proliferative and have a reduction in transcription of three TEAD target genes. They complement in vivo data with biochemical approaches, and suggest this interface with p300 (for VGLL2::NCOA2) is through the NCOA2 fusion partner, as Co-IP in HEK293T with a mutant fusion that does not contain NCOA2 loses the association with endogenous p300. The data is interesting and suggests new biology for these fusion-oncogenes. However, the choice of 293T may limit the broad applicability of the findings. Strikingly, in 293T the VGLL2-NCOA2 fusion showed more transcriptional overlap with the YAP5SA mutant than with TEAD1-NCOA2. Further, there is an additional opportunity to directly compare transcriptional profiles in 293T to the human disease and in the mouse allograft system to directly compare and discuss VGLL2-NCOA2 and TEAD1-NCOA2 histological differences or how A485 treatment may change the histology. Overall, the breadth of methods used in this study, and the comparison of the two fusion-oncogenes' biology, is of interest to the fusion-oncogene, pediatric sarcoma, and epigenetic therapeutic targeting fields.

    3. Reviewer #2 (Public review):

      In the manuscript entitled "VGLL2 and TEAD1 fusion proteins drive YAP/TAZ-independent transcription and tumorigenesis by engaging p300", Gu et al. investigated two Hippo pathway-related gene fusion events (i.e., VGLL2-NCOA2, TEAD1-NCOA2) in spindle cell rhabdomyosarcoma (scRMS). They demonstrate that these fusion proteins activate Hippo downstream gene transcription independently of YAP/TAZ. Using BioID-based mass spectrometry analysis, the authors identify histone acetyltransferase CBP/p300 as a specific binding protein for VGLL2-NCOA2 and TEAD1-NCOA2 fusion proteins. Pharmacologically targeting p300 inhibits the fusion proteins-induced Hippo downstream gene transcription and tumorigenesis.

      Overall, this work provides novel mechanistic insights into scRMS-associated gene fusions in tumorigenesis and reveals potential therapeutic targets for cancer treatment. The manuscript is well-written and easy to follow. Below are a few comments based on the revised study.

      (1) While the study focuses mainly on Hippo downstream gene transcription, a significant portion of genes regulated by the VGLL2-NCOA2 and TEAD1-NCOA2 fusion proteins are non-Hippo downstream genes (Fig. 3). Further characterization of how both Hippo and non-Hippo downstream genes contribute to fusion protein-induced oncogenesis would enhance our understanding of scRMS etiology.

      (2) A potential limitation of this study is the reliance on overexpression approaches to investigate VGLL2-NCOA2 and TEAD1-NCOA2 fusion genes, which may not fully reflect pathological conditions in scRMS patients. Despite this, the study offers valuable mechanistic insights into fusion gene-induced scRMS and provides a molecular foundation for developing targeted therapies.

    4. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public Review):

      (1) The rationale for performing genomics, transcriptional, and proteomics work in 293T cells is not discussed. Further, there are no functional readouts mentioned in the 293T cells with expression of the fusion-oncogenes. Did these cells have any phenotypes associated with fusion-oncogene expression (proliferation differences, morphological changes, colony formation capacity)? Further, how similar are the gene expression signatures from RNA-seq to rhabdomyosarcoma? This would help the reader interpret how similar these cell models are to human disease.

      We appreciate the reviewer’s comments and understand the limitation of HEK293T cell culture. HEK293T cells were used as a surrogate system that enabled us to systematically examine and compare the transcriptional activation mechanisms between VGLL2-NCOA2/TEAD1-NCOA2 and YAP/TAZ. HEK293T cells have previously been used as a model system to study the signaling and transcriptional mechanisms of the Hippo/YAP pathway (1,2). Our data also showed that the ectopic expression of VGLL2-NCOA2 and TEAD1-NCOA2 in HEK293 cells can promote proliferation (Figure 1-figure supplement 1B), consistent with their potential oncogenic function.

      (2) TEAD1::NCOA2 fusion-oncogene model was not credentialed past H&E, and expression of Desmin. Is the transcriptional signature in C2C12 or 293T similar to a rhabdomyosarcoma gene signature?

      We understand the reviewer’s concern. VGLL2-NCOA2 in vivo tumorigenesis model generated by C2C12 cell orthotopic transplantation has recently been reported, and it exhibits similar characteristics with zebrafish transgenic tumors as well as human scRMS samples that carry the VGLL2-NCOA2 fusion (3). Due to the similar transcriptional and oncogenic mechanisms employed by both VGLL2-NCOA2 and TEAD1-NCOA2 fusion proteins, we expect that the TEAD1-NCOA2 dependent C2C12 transplantation model will closely resemble that induced by VGLL2-NCOA2.

      (3) For the fusion-oncogenes, did the HA, FLAG, or V5 tag impact fusion-oncogene activity? Was the tag on the 3' or 5' of the fusion? This was not discussed in the methods.

      To address the reviewer’s concern, we carefully compared the transcriptional activity of the fusion proteins with the HA tag at the 5’ end or FLAG and V5 tag at the 3’ end. We found that neither the tag type nor its location significantly affects the ability of VGLL2-NCOA2 and TEAD1-NCOA2 to induce downstream gene transcription, measured by qPCR. The data is summarized in Figure 1-figure supplement 1 G-H.

      (4) Generally, the lack of details in the figures, figure legends, and methods make the data difficult to interpret. A few examples are below:

      a. Individual data points are not shown for figure bar plots (how many technical or biological replicates are present and how many times was the experiment repeated?).

      As requested, we have added the individual data points to the bar plots. The Methods section now includes information on the number of biological replicates and the number of times the experiments were repeated.

      b. What exons were included in the fusion-oncogenes from VGLL2 and NCOA2 or TEAD1 and NCOA2?

      We have now included the exon structure organization of VGLL2-NCOA2 or TEAD1-NCOA2 fusions in Figure 1-figure supplement 1A.

      c. For how long were the colony formation experiments performed? Two weeks?

      We have included more detailed information about the colony formation assay in the Methods section.

      d. In Figure 2D, what concentration of CP1 was used and for how long?

      The CP1 concentration and treatment duration information has now been included in the figure legend and Methods section.

      e. How was A485 resuspended for cell culture and mouse experiments, what is the percentage of DMSO?

      The Methods section now includes detailed information on how A485 is prepared for in vitro and in vivo experiments.

      f. How many replicates were done for RNA-seq, CUT&RUN, and ATACseq experiments?

      RNA-seq was done with three biological replicates and CUT&RUN and ATAC-seq were performed with two biological replicates. This information is now included in the Methods section for clarification.

      Reviewer #2 (Public Review):

      In the manuscript entitled "VGLL2 and TEAD1 fusion proteins drive YAP/TAZ-independent transcription and tumorigenesis by engaging p300", Gu et al. studied two Hippo pathway-related gene fusion events (i.e., VGLL2-NCOA2, TEAD1-NCOA2) in spindle cell rhabdomyosarcoma (scRMS) and showed that their fusion proteins can activate Hippo downstream gene transcription independent of YAP/TAZ. Using the BioID-based mass spectrometry analysis, the authors revealed histone acetyltransferase CBP/p300 as specific binding proteins for VGLL2-NCOA2 and TEAD1-NCOA2 fusion proteins. Pharmacologically targeting p300 inhibited the fusion proteins-induced Hippo downstream gene transcription and tumorigenic events.

      Overall, this study provides mechanistic insights into the scRMS-associated gene fusions in tumorigenesis and reveals potential therapeutic targets for cancer treatment. The manuscript is well-written and easy to follow.

      Here, several suggestions are made for the authors to improve their study.

      Main points

      (1) The authors majorly focused on the Hippo downstream gene transcription in this study, while a significant portion of genes regulated by the VGLL2-NCOA2 and TEAD1-NCOA2 fusion proteins are non-Hippo downstream genes (Figure 3). The authors should investigate whether the altered Hippo pathway transcription is essential for VGLL2-NCOA2 and TEAD1-NCOA2-induced cell transformation and tumorigenesis. Specifically, they should test if treatment with the TEAD inhibitor can reverse the cell transformation and tumorigenesis caused by VGLL2-NCOA2 but not TEAD1-NCOA2. In addition, it is important to examine whether YAP-5SA expression can rescue the inhibitory effects of A485 on VGLL2-NCOA2 and TEAD1-NCOA2-induced colony formation and tumor growth. This will help clarify whether Hippo downstream gene transcription is important for the oncogenic activities of these two fusion proteins.

      We thank the reviewer for the comments. Although we have not tested the small-molecule TEAD inhibitor on VGLL2-NCOA2- or TEAD1-NCOA2-induced cell transformation and tumorigenesis, we expect that TEAD inhibition will block VGLL2-NCOA2- but not TEAD1-NCOA2-induced oncogenic activity. This is because TEAD1-NCOA2 does not contain the auto-palmitoylation sites and the hydrophobic pocket in the C-terminal YAP-binding domain of TEAD1 that the TEAD small-molecule inhibitor occupies (4). We also appreciate the reviewer’s suggestion of YAP5SA rescue experiments. However, due to its strong oncogenic activity, YAP5SA itself can induce robust downstream transcription and cell transformation with or without A485 treatment, as shown in Figure 5. Thus, this experiment is unlikely to address whether non-Hippo downstream genes induced by the fusions are important for cell transformation and tumorigenesis. Because of the distinct nature of transcriptional and chromatin landscapes controlled by VGLL2-NCOA2/TEAD1-NCOA2 and YAP, we speculate that both Hippo and non-Hippo-related downstream genes contribute to the oncogenic activation and tumor phenotypes induced by the fusion proteins.

      (2) Rationale for selecting CBP/p300 for functional studies needs to be provided. The BioID-MS experiment identified many interacting proteins for VGLL2-NCOA2 and TEAD1-NCOA2 fusion proteins (Table S4). The authors should explain the scoring system used to identify the high-interacting proteins for VGLL2-NCOA2 and TEAD1-NCOA2 fusion proteins. Were CBP/p300 the top candidates on the list? Providing this information will help justify the focus on CBP/p300 and validate their importance in this study.

      We appreciate the reviewer’s point. CBP/P300 is among the top hits in our proteomics screens of both VGLL2-NCOA2 and TEAD1-NCOA2. Our focus on CBP/P300 is mainly due to the well-established interactions between CBP/P300 and the NCOA family transcriptional co-activators, in which the CBP/P300-NCOA complex plays a central role in mediating nuclear receptor-induced transcriptional activation (5). In addition, our data is consistent with another recurrent VGLL2 fusion identified in scRMS, VGLL2-CITED2 (6), which has a C-terminal fusion partner from CITED2, a known CBP/P300-interacting protein (7).

      (3) p300 was revealed as a key driver for the VGLL2-NCOA2 and TEAD1-NCOA2 fusion proteins-induced transcriptome alteration and tumorigenesis. To strengthen the point, the authors should identify the p300 binding region on VGLL2-NCOA2 and TEAD1-NCOA2 fusion proteins. Mutants with defects in p300 binding/recruitment should be generated and included as a control in the related q-PCR and tumorigenic studies. This work will help confirm the crucial role of p300 in mediating the oncogenic effects of these two fusion proteins.

      We thank the reviewer for the suggestion. We have performed additional co-immunoprecipitation experiments using a deletion mutant of VGLL2-NCOA2 and demonstrated that the C-terminal NCOA2 part of the fusion is responsible for mediating the interaction between the fusion protein and CBP/P300. These results are now included in the new Figure 5A and are consistent with the reported structural analysis of the CBP/P300-NCOA complex (8). In addition, our new data showed the inability of the VGLL2-NCOA2 ∆NCOA2 mutant to induce gene transcription (Figure 1-figure supplement 1D). Furthermore, our data using the small-molecule CBP/P300 inhibitor clearly demonstrated that CBP/P300 is required to mediate cell transformation and tumorigenesis induced by the two fusion proteins in vitro and in vivo (Figures 5 and 6).

      (4) Another major issue is the overexpression system extensively used in this study. It is important to determine whether the VGLL2-NCOA2 and TEAD1-NCOA2 fusion genes are also amplified in cancer. If not, the expression levels of the VGLL2-NCOA2 and TEAD1-NCOA2 fusion proteins should be adjusted to endogenous levels to assess their oncogenic effects on gene transcription and tumorigenesis. This approach would make the study more relevant to the pathological conditions observed in scRMS cancer patients.

      We appreciate the reviewer’s input and acknowledge the limitation of the HEK293T and C2C12 cell-based models that rely on ectopic expression of VGLL2-NCOA2 and TEAD1-NCOA2 fusion proteins. It is currently unclear whether the VGLL2-NCOA2 and TEAD1-NCOA2 fusion genes are also amplified in sarcoma. As mentioned before, these surrogate cell culture systems allowed us to systematically compare the transcriptional regulation by the fusion proteins and YAP/TAZ and elucidate the molecular mechanism underlying the Hippo/YAP-independent oncogenic transformation induced by VGLL2-NCOA2 and TEAD1-NCOA2.

      References:

      (1) Genes Dev. 2007 Nov 1;21(21):2747-61. doi: 10.1101/gad.1602907. Inactivation of YAP oncoprotein by the Hippo pathway is involved in cell contact inhibition and tissue growth control

      (2) Genes Dev. 2010 Jan 1;24(1):72-85. doi: 10.1101/gad.1843810. A coordinated phosphorylation by Lats and CK1 regulates YAP stability through SCF(beta-TRCP)

      (3) VGLL2-NCOA2 leverages developmental programs for pediatric sarcomagenesis. Watson S, LaVigne CA, Xu L, Surdez D, Cyrta J, Calderon D, Cannon MV, Kent MR, Cell Rep. 2023 Jan 31;42(1):112013.

      (4) Lats1/2 Sustain Intestinal Stem Cells and Wnt Activation through TEAD-Dependent and Independent Transcription. Cell Stem Cell. 2020 May 7;26(5):675-692.e8.

      (5) Yi, P., Yu, X., Wang, Z., and O’Malley, B.W. (2021). Steroid receptor-coregulator transcriptional complexes: new insights from CryoEM. Essays Biochem. 65, 857–866.

      (6) A Molecular Study of Pediatric Spindle and Sclerosing Rhabdomyosarcoma: Identification of Novel and Recurrent VGLL2-related Fusions in Infantile Cases. Am J Surg Pathol. 2016 Feb;40(2):224-35. doi: 10.1097/

      (7) CITED2 and the modulation of the hypoxic response in cancer. Fernandes MT, Calado SM, Mendes-Silva L, Bragança J. World J Clin Oncol. 2020 May 24;11(5):260-274.

      (8) Yu, X., Yi, P., Hamilton, R.A., Shen, H., Chen, M., Foulds, C.E., Mancini, M.A., Ludtke, S.J., Wang, Z., and O’Malley, B.W. (2020). Structural insights of transcriptionally active, full-length Androgen receptor coactivator complexes. Mol. Cell 79, 812–823.e4.

    1. eLife Assessment

      This important study substantially expands observations of HERV expression in the clinical settings. The evidence provided by the authors that HERV activity is an underlying etiological factor in ME/CFS and fibromyalgia is compelling and suggests further investigation into mechanisms. This work will be of broad interest to clinicians and researchers alike.

    2. Reviewer #1 (Public review):

      Summary:

      Giménez-Orenga et al. investigate the origin and pathophysiology of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) and fibromyalgia (FM). Using RNA microarrays, the authors compare the expression profiles and evaluate the biomarker potential of human endogenous retroviruses (HERV) in these two conditions. Altogether, the authors show that HERV expression is distinct between ME/CFS and FM patients, and HERV dysregulation is associated with higher symptom intensity in ME/CFS. HERV expression in ME/CFS patients is associated with impaired immune function and higher estimated levels of plasma cells and resting CD4 memory T cells. This work provides interesting insights into the pathophysiology of ME/CFS and FM, creating opportunities for several follow-up studies.

      Strengths:

      (1) Overall, the data is convincing and supports the authors' claims. The manuscript is clear and easy to understand, and the methods are generally well-detailed. It was quite enjoyable to read.

      (2) The authors combined several unbiased approaches to analyse HERV expression in ME/CFS and FM. The tools, thresholds, and statistical models used all seem appropriate to answer their biological questions.

      (3) The authors propose an interesting alternative to diagnosing these two conditions. Transcriptomic analysis of blood samples using an RNA microarray could allow a minimally invasive and reproducible way of diagnosing ME/CFS and FM.

      Weakness:

      (1) While this work makes several intriguing observations, some results will need to be validated in future studies using experimental approaches.

    3. Reviewer #2 (Public review):

      Summary:

      Giménez-Orenga carried out this study to assess whether human endogenous retroviruses (HERVs) could be used to improve the diagnosis of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) and Fibromyalgia (FM). To this end, they used the HERV-V3 array developed previously, to characterize the genome-wide changes in expression of HERVs in patients suffering from ME/CFS, FM or both, compared to controls. In turn, they present a useful repertoire of HERVs that might characterize ME/CFS and FM. For the most part, the paper is written in a manner that allows a natural understanding of the workflow and analyses carried out, making it compelling. The figures and additional tables present solid support for the findings. However, some statements made by the authors seem incomplete and would benefit from a more thorough literature review. Overall, this work will be of interest to the medical community seeking a better understanding of the co-occurrence of these pathologies, hinting at a novel angle by integrating HERVs, which are often overlooked, into their assessment.

      Strengths:

      - The work is well-presented, allowing the reader to understand the overall workflow and how the specific aims contribute to filling the knowledge gap in the field.

      - The analyses carried out to understand the potential impact on gene expression mediated by HERVs are in line with previous works, making it solid and robust in the context of this study.

      Weaknesses:

      - The authors claim to obtain genome-wide HERV expression profiles. However, the array used was developed using hg19, while the genomic analyses of this work are carried out using a liftover to hg38. It would improve the statement and findings to include a comparison of the differences in HERVs available in hg38, and how this could impact the "genome-wide" findings.

      - The authors in some points are not thorough with the cited literature. Two examples are:

      (1) Lines 396-397 the authors say "the MLT1, usually found enriched near DE genes (Bogdan et al., 2020)". I checked the work by Bogdan, and they studied bacterial infection. A single work in a specific topic is not sufficient to support the statement that MLT1 is "usually" in close vicinity to differentially expressed genes. More works are needed to support this.

      (2) After the previous statement, the authors go on to mention "contributing to the coding of conserved lncRNAs (Ramsay et al., 2017)". First, lnc = long non-coding, so this doesn't make sense. Second, in the work by Ramsay they mention "that contributed a significant amount of sequence to primate lncRNAs whose expression was conserved", which is different from what the authors in this study are trying to convey. Again, additional work and a rephrasing might help to support this idea.

      - When presenting the clusters, the authors overlook the fact that cluster 4 is clearly control-specific, and fail to discuss what this means. Could this subset of HERV be used as bona fide markers of healthy individuals in the context of these diseases? Are they associated with DE genes? What could be the impact of such associations?

      Appraisals on aims:

      The authors set specific questions and presented the results to successfully answer them. The evidence is solid, with some weaknesses discussed above that will methodologically strengthen the work.

      Likely impact of work on the field:

      This work will be of interest to the medical community looking for novel ways to improve clinical diagnosis. Although future works with a greater population size, and more robust techniques such as RNA-Seq, are needed, this is the first step in presenting a novel way to distinguish these pathologies.

      It would be of great benefit to the community to provide a table/spreadsheet indicating the specific genomic locations of the HERVs specific to each condition. This will allow proper provenance for future researchers interested in expanding on this knowledge, as these genomic coordinates will be independent of the technique used (as was the array used here).

      Comments on revisions:

      When addressing the comments made in the previous round, there are some answers that lack substance and don't seem to be incorporated in the manuscript. For example, the authors say:

      Authors' response: This is an important point. However, the low number of probes (less than 100) that were excluded from our analysis by lack of correspondence with hg38 among the 1,290,800 probesets was interpreted as insignificant for "genome-wide" claims. An aspect that will be explained in the revised version of this manuscript.

      I checked the revised manuscript with tracked changes, and there doesn't seem to be an updated explanation to this. In which lines is this explained?

      For the other response:

      Authors' response: Using control DE HERV as bona fide markers of healthy individuals seems like an interesting possibility worth exploring. Control DE HERV (cluster 4) associate with DE genes involved in apoptosis, T cell activation and cell-cell adhesion (modules 1 and 6). The impact of which deserves further study.

      I couldn't find an updated mention of this in the discussion.

      Another point that I raised was regarding the decision of using an FDR of 0.1 instead of 0.05. The authors only speculate about the impacts in their answer, while I believe that this could have been rigorously addressed. Since this was done in R, and DE analyses are relatively fast, I don't see a reason as to why this part was not repeated and discussed accordingly.

      For other analyses, there doesn't seem to be a problem with using 0.05 as threshold. Examples of this are the "Overrepresentation functional analysis", or the "Statistical analysis" part of the methods, where they say "we used a Fisher exact test to calculate p-value, considering enriched in the provided list if an adjusted p-value (FDR) was less than 0.05".

      Just to make this point clear: I'm not asking the authors to repeat all the work using the 0.05 FDR threshold, but rather that they are aware and conscious about the impact of this, and give an idea to the audience on how it would change the DE numbers. This would put in perspective the findings to any future reader.
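      To make the threshold point concrete, the following is a minimal sketch of how the number of features called differentially expressed can be compared at FDR 0.05 and 0.1. It uses Benjamini-Hochberg adjustment on purely synthetic p-values; none of the numbers come from the study, and the function names, the simulated mixture, and the seed are illustrative assumptions rather than the authors' pipeline.

```python
import random


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def count_significant(p_values, fdr):
    """Number of features whose adjusted p-value is below the FDR threshold."""
    return sum(q < fdr for q in benjamini_hochberg(p_values))


def simulate_p_values(n_null=9000, n_signal=1000, seed=0):
    """Synthetic p-values (assumption): uniform nulls plus a block skewed toward zero."""
    rng = random.Random(seed)
    nulls = [rng.random() for _ in range(n_null)]
    signals = [rng.random() ** 6 for _ in range(n_signal)]
    return nulls + signals


if __name__ == "__main__":
    p_values = simulate_p_values()
    for fdr in (0.05, 0.1):
        print(f"FDR < {fdr}: {count_significant(p_values, fdr)} features")
```

      Reporting both counts side by side, as this sketch does for simulated data, is the kind of summary that would let readers judge how sensitive the DE numbers are to the chosen threshold.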

      I think that most of the other answers to both my previous concerns and the other reviewer's concerns are ok. My last outstanding concern is that the probe coordinates apparently can't be shared, which considerably undermines this study's reproducibility and its use by future researchers, who won't be able to compare their results to this study.

    4. Reviewer #3 (Public review):

      Summary:

      The authors find that HERV expression patterns can be used as new criteria for differential diagnosis of FM and ME/CFS and patient subtyping. The data are based on transcriptome analysis by microarray for HERVs using patient blood samples, followed by differential expression of ERVs and bioinformatic analyses. This is a standard and solid data processing pipeline, and the results are well presented and support the authors' claim.

      Strengths:

      It provides an innovative diagnostic approach using ERV profiles to subtype patients and distinguish FM and ME/CFS.

      Comments on revisions:

      This is a revised manuscript which addresses the comments well.

    5. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Giménez-Orenga et al. investigate the origin and pathophysiology of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) and fibromyalgia (FM). Using RNA microarrays, the authors compare the expression profiles and evaluate the biomarker potential of human endogenous retroviruses (HERV) in these two conditions. Altogether, the authors show that HERV expression is distinct between ME/CFS and FM patients, and HERV dysregulation is associated with higher symptom intensity in ME/CFS. HERV expression in ME/CFS patients is associated with impaired immune function and higher estimated levels of plasma cells and resting CD4 memory T cells. This work provides interesting insights into the pathophysiology of ME/CFS and FM, creating opportunities for several follow-up studies.

      Strengths:

      (1) Overall, the data is convincing and supports the authors' claims. The manuscript is clear and easy to understand, and the methods are generally well-detailed. It was quite enjoyable to read.

      (2) The authors combined several unbiased approaches to analyse HERV expression in ME/CFS and FM. The tools, thresholds, and statistical models used all seem appropriate to answer their biological questions.

      (3) The authors propose an interesting alternative to diagnosing these two conditions. Transcriptomic analysis of blood samples using an RNA microarray could allow a minimally invasive and reproducible way of diagnosing ME/CFS and FM.

      Weaknesses:

      (1) The cohort analysed in this study was phenotyped by a single clinician. As ME/CFS and FM are diagnosed based on unspecific symptoms and are frequently misdiagnosed, this raises the question of whether the results can be generalised to external cohorts.

      Thank you for your comment. Surely the study of larger cohorts will determine the external validity of these results in a clinical scenario. However, this pilot study, first of its kind, was designed to maximize homogeneity across participants which seemed primarily ensured by the study of females only and diagnosis by a single experienced observer.

      (2) The analyses performed to unravel the causes and effects of HERV expression in ME/CFS and FM are solely based on sequencing data. Experimental approaches could be used to validate some of the transcriptomic observations.

      Certainly, experimental approaches may add robustness to the implication of HERVs in ME/CFS. We indeed consider taking this avenue in future work to explore the findings presented here in more depth. However, the limited knowledge of HERV-mediated physiological functions may hamper obtaining prompt results towards revealing causes and effects of HERV expression in ME/CFS and FM.

      Reviewer #2 (Public review):

      Summary:

      Giménez-Orenga carried out this study to assess whether human endogenous retroviruses (HERVs) could be used to improve the diagnosis of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) and Fibromyalgia (FM). To this end, they used the HERV-V3 array developed previously, to characterize the genome-wide changes in the expression of HERVs in patients suffering from ME/CFS, FM, or both, compared to controls. In turn, they present a useful repertoire of HERVs that might characterize ME/CFS and FM. For the most part, the paper is written in a manner that allows a natural understanding of the workflow and analyses carried out, making it compelling. The figures and additional tables present solid support for the findings. However, some statements made by the authors seem incomplete and would benefit from a more thorough literature review. Overall, this work will be of interest to the medical community seeking a better understanding of the co-occurrence of these pathologies, hinting at a novel angle by integrating HERVs, which are often overlooked, into their assessment.

      Strengths:

      (1) The work is well-presented, allowing the reader to understand the overall workflow and how the specific aims contribute to filling the knowledge gap in the field.

      (2) The analyses carried out to understand the potential impact on gene expression mediated by HERVs are in line with previous works, making it solid and robust in the context of this study.

      Weaknesses:

      (1) The authors claim to obtain genome-wide HERV expression profiles. However, the array used was developed using hg19, while the genomic analyses of this work are carried out using a liftover to hg38. It would improve the statement and findings to include a comparison of the differences in HERVs available in hg38, and how this could impact the "genome-wide" findings.

      This is an important point. However, the low number of probes (less than 100) that were excluded from our analysis by lack of correspondence with hg38 among the 1,290,800 probesets was interpreted as insignificant for "genome-wide" claims. This aspect will be explained in the revised version of this manuscript.
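      As a quick check of the scale involved, the short sketch below computes the upper bound on the excluded fraction from the two figures quoted in the response (fewer than 100 probes out of 1,290,800 probesets). Treating "less than 100" as exactly 100 is an assumption made only to obtain a bound, and the variable and function names are illustrative.

```python
# Figures quoted in the response; "less than 100" is used as an upper bound (assumption).
TOTAL_PROBESETS = 1_290_800
MAX_EXCLUDED = 100


def excluded_percentage(excluded, total):
    """Percentage of probesets dropped for lack of an hg38 counterpart."""
    return 100.0 * excluded / total


pct = excluded_percentage(MAX_EXCLUDED, TOTAL_PROBESETS)
print(f"At most {pct:.4f}% of probesets were excluded")
```

      Under that assumption the excluded share stays below roughly 0.008% of the array. This bounds only the probes lost in the liftover; it does not by itself say anything about HERVs annotated in hg38 that the hg19-based array never covered.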

      (2) The authors in some points are not thorough with the cited literature. Two examples are:

      a) Lines 396-397 the authors say "the MLT1, usually found enriched near DE genes (Bogdan et al., 2020)". I checked the work by Bogdan, and they studied bacterial infection. A single work in a specific topic is not sufficient to support the statement that MLT1 is "usually" in close vicinity to differentially expressed genes. More works are needed to support this.

      b) After the previous statement, the authors go on to mention "contributing to the coding of conserved lncRNAs (Ramsay et al., 2017)". First, lnc = long non-coding, so this doesn't make sense. Second, in the work by Ramsay they mention "that contributed a significant amount of sequence to primate lncRNAs whose expression was conserved", which is different from what the authors in this study are trying to convey. Again, additional work and a rephrasing might help to support this idea.

      Certainly, these two sentences need rephrasing to better adjust to current evidence.

      Revised sentences can now be found in lines 397-402.

      (3) When presenting the clusters, the authors overlook the fact that cluster 4 is clearly control-specific, and fail to discuss what this means. Could this subset of HERV be used as bona fide markers of healthy individuals in the context of these diseases? Are they associated with DE genes? What could be the impact of such associations?

      Using control DE HERV as bona fide markers of healthy individuals seems like an interesting possibility worth exploring. Control DE HERV (cluster 4) associate with DE genes involved in apoptosis, T cell activation and cell-cell adhesion (modules 1 and 6), and the impact of these associations deserves further study.

      Appraisals on aims:

      The authors set specific questions and presented the results to successfully answer them. The evidence is solid, with some weaknesses discussed above that will methodologically strengthen the work.

      Likely impact of work on the field:

      This work will be of interest to the medical community looking for novel ways to improve clinical diagnosis. Although future works with a greater population size, and more robust techniques such as RNA-Seq, are needed, this is the first step in presenting a novel way to distinguish these pathologies.

      It would be of great benefit to the community to provide a table/spreadsheet indicating the specific genomic locations of the HERVs specific to each condition. This will allow proper provenance for future researchers interested in expanding on this knowledge, as these genomic coordinates will be independent of the technique used (as was the array used here).

      We agree with the reviewer that sharing genomic locations of DE HERVs in these pathologies would contribute to the development of these findings. Unfortunately, we do not hold the rights to share probe coordinates from this custom HERV-V3 microarray, which we used under a material transfer agreement (MTA) with its developer.

      Reviewer #3 (Public review):

      The authors find that HERV expression patterns can be used as new criteria for differential diagnosis of FM and ME/CFS and patient subtyping. The data are based on transcriptome analysis by microarray for HERVs using patient blood samples, followed by differential expression of ERVs and bioinformatic analyses. This is a standard and solid data processing pipeline, and the results are well presented and support the authors' claim.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Recommendations/questions:

      (1) The authors point towards the biomarker potential of HERV expression signatures. In line with this, it would be important to test if they can predict the correct pathology for patients using the expression of DE HERVs. Additionally, as a single clinician annotated the cohort analysed in this study, it would be interesting to validate the signatures identified in this work by reanalysing publicly available transcriptomic data from independent studies.

      Thank you for the suggestion. We plan to conduct this analysis and have added the following statement to the manuscript (lines 482-483): “Given the limited sample size in our cohort, validation of the findings in extended cohorts is a must.”

      (2) The authors suggest that an epigenetic mechanism causes the dysregulated HERV expression in ME/CFS patients. However, in Fig.1A, HERV expression profiles of co-diagnosed patients are more similar to healthy controls than patients with either condition. How could the co-morbidity of FM "rescue" the phenotype of ME/CFS?

      Thank you for the insightful comment. It is notable that co-diagnosed patients exhibit HERV expression profiles more similar to those of healthy controls than to those of patients with FM or ME/CFS alone. These findings may suggest a distinct underlying pathomechanism for this patient group, supporting the identification of a novel nosologic entity, as discussed in lines 372-374 of the manuscript.

      (3) Abundant evidence in the literature links HERV dysregulation with the production of RNA:DNA hybrids and dsRNAs and viral mimicry. The authors found that ME/CFS subgroup 2, which exhibits the most important HERV dysregulation, is also associated with decreased signatures of pathogen detection. It would be interesting to quantify the abundance of DNA:RNA hybrids and dsRNAs in PBMCs of ME/CFS and FM patients as well as healthy controls. It would be interesting to discuss how downregulation of pathogen detection pathways could be a mechanism in ME/CFS patients to avoid viral mimicry and potential links with inflammation in this disease.

      Certainly, HERVs can influence disease pathophysiology by generating RNA:DNA hybrids and dsRNA. However, microarray data do not allow this analysis. Future work on the mechanisms underlying differentially expressed HERVs could explore this interesting possibility.

      (4) Another intriguing result is how overexpression of Module 3 in ME/CFS subgroup 2 is associated with higher levels of plasma cells. The authors hypothesize that the changes in immune cell abundances reflect previous viral infections, but another possibility would be immune activation against HERVs. Are there protein-coding sequences (gag, pro, pol, env) amongst the HERV sequences of module 3? If so, it would be interesting to validate HERV protein expression in these samples. Additionally, blood samples of ME/CFS patients and healthy controls should be analysed in flow cytometry to describe the abundance and phenotype of immune cells precisely.

      Thank you for your insightful comments. In fact, we identified three HERV elements with protein-coding regions whose functional relevance remains uncertain. They present an interesting avenue for future investigation, particularly regarding immune activation.

      Minor comments:

      (1) On lines 170-172, it is unclear to me how Figure 1E is linked to the text.

      We have added a line better explaining Fig. 1E: “Top 10 contributing HERVs to principal components PC1 and PC2 are shown” (lines 171-172).

      (2) Figure S2: grouping or colouring the plots based on the cluster to which HERVs were assigned could facilitate the understanding of the figure.

      We appreciate the suggestion to enhance the clarity of the figures. However, this color-coding cannot be implemented, as a family is not exclusively assigned to a single cluster.

      (3) How are the 4 HERV clusters of Figure 2 and the 8 modules of Figure 3 related to the clusters identified by hierarchical clustering in Figure 1? More details should be provided in the text (Results and Methods sections), and figures to illustrate the clustering strategy should be added if needed.

      To enhance clarity, we have included the following explanation in the results section (lines 244-251): “To uncover potentially affected physiologic functions linked to DE HERV, we examined how DE HERVs and DE genes with similar expression patterns grouped together in modules based on their intrinsic relationships by their hierarchical co-clustering (Fig. 3). Then, the functional significance of these modules was assessed by gene ontology (GO) analysis of the DE genes within each module. The hierarchical clustering analysis resulted in the identification of eight distinct modules, each characterized by unique combinations of DE HERV and DE gene patterns across all four study groups (Fig. 3)”.

      (4) Related to Figure 4, are there HERV sequences in module 3 located near genes important for plasma cells and/or resting CD4 memory T cells?

      Thank you for your insightful comment. However, gene relevance for plasma cells and/or resting CD4 memory T cells may depend on multiple factors in addition to cell type and subtypes and, therefore, the analysis may not be straightforward.

      Reviewer #2 (Recommendations for the authors):

      In Figure 1, the heatmap scale goes from -4 to 4. This should reflect at least the numbers on the lowest and highest end of the scale.

      Thank you for bringing this to our attention. The scale was correct; however, when arranging the panels, the numbers were not properly positioned. The figure has now been updated with the corrected version.

      Figure 2F and G, percentages are shown as decimal numbers up to 1.00, while it should be 100%, and so on.

      We also replaced this figure, changing the numbers to fit percentages.

      It would be interesting to know how the results change using FDR of 0.05. I'm not familiar with microarray thresholds, but in RNA-Seq, 0.1 is rarely used, with 0.05 being the standard. Could it be that a more stringent result better distinguishes the pathologies?

      Applying a more stringent threshold, such as FDR 0.05, may remove sequences that, while not strongly differentially expressed, may still be important for distinguishing between these pathologies. Therefore, we decided to also include DE tendencies (FDR<0.1) in this first-of-its-kind study. Findings will need validation in enlarged cohorts.
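
      To make the trade-off in this exchange concrete, the sketch below shows how the number of features called differentially expressed can change between FDR cut-offs of 0.05 and 0.1. It assumes a Benjamini-Hochberg adjustment, which the response does not specify, and the p-values are hypothetical toy numbers, not data from the study. The function names are illustrative only.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: the study's FDR values come from a BH-style adjustment;
    the source text does not state which procedure was used.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping adjusted values monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def count_calls(pvalues, fdr):
    """Count features whose adjusted p-value falls below the FDR cut-off."""
    return sum(q < fdr for q in benjamini_hochberg(pvalues))


# Hypothetical raw p-values for eight features (illustration only).
toy_pvalues = [0.001, 0.008, 0.012, 0.030, 0.045, 0.200, 0.600, 0.900]

strict_calls = count_calls(toy_pvalues, 0.05)
lenient_calls = count_calls(toy_pvalues, 0.10)
print(strict_calls, lenient_calls)
```

      With these toy values, three features pass at FDR<0.05 and five pass at FDR<0.1. The two extra features are the "DE tendencies" that the looser threshold retains and that would need validation in a larger cohort.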

    1. eLife Assessment

      The study by Power and colleagues is important as elucidating the dynamic immune responses to photoreceptor damage in vivo potentiates future work in the field to better understand the disease process. The evidence supporting the authors' claims is compelling. The current manuscript would further benefit from including limitations/future improvements in the discussion or conclusion, and from exploring neutrophil recruitment under different degrees of photoreceptor loss (mild to severe).

    2. Reviewer #2 (Public review):

      Summary:

      This study uses in vivo multimodal high-resolution imaging to track how microglia and neutrophils respond to light-induced retinal injury from soon after injury to 2 months post-injury. The in vivo imaging finding was subsequently verified by an ex vivo study. The results suggest that despite the highly active microglia at the injury site, neutrophils were not recruited in response to acute light-induced retinal injury.

      Strengths:

      An extremely thorough examination of the cellular-level immune activity at the injury site. In vivo imaging observations being verified using ex vivo techniques is a strong plus.

      Weaknesses:

      This paper is extremely long and, from the perspective of this reviewer, needs to be better organized. Update: Modifications have been made throughout, which has made the manuscript easier to follow.

      Study weakness: though the finding prompts more questions and future studies, the findings discussed in this paper are potentially important for us to understand how the immune cells respond differently to different severity levels of injury. The study also demonstrated an imaging technology which may help us better understand cellular activity in living tissue during earlier time points.

      Comments on revisions:

      I appreciate the thorough clarification and re-organization by the authors, and the messages in the manuscript are now more apparent. I recommend also briefly discussing limitations/future improvements in the discussion or conclusion.

    3. Reviewer #3 (Public review):

      Summary

      This work investigated the immune response in the murine retina after focal laser lesions. These lesions are made with close to 2 orders of magnitude lower laser power than the more prevalent choroidal neovascularization model of laser ablation. Histology and OCT together show that the laser insult is localized to the photoreceptors and spares the inner retina, the vasculature and the pigment epithelium. As early as 1 day after injury, a loss of cell bodies in the outer nuclear layer is observed. This is accompanied by strong microglial proliferation at the site of injury in the outer retina where microglia do not typically reside. The injury did not seem to result in the extravasation of neutrophils from the capillary network, constituting one of the main findings of the paper. The demonstrated paradigm of studying the immune response and potentially retinal remodeling in the future in vivo is valuable and would appeal to a broad audience in visual neuroscience.

      Strengths

      Adaptive optics imaging of murine retina is cutting edge and enables non-destructive visualization of fluorescently labeled cells in the milieu of retinal injury. As may be obvious, this in vivo approach is a benefit for studying fast and dynamic immune processes on a local time scale - minutes and hours, and also for the longer days-to-months follow-up of retinal remodeling as demonstrated in the article. In certain cases, the in vivo findings are corroborated with histology.

      The analysis is sound and accompanied by stunning video and static imagery. A few different sets of mouse models are used, a) two different mouse lines, each with a fluorescent tag for neutrophils and microglia, b) two different models of inflammation - endotoxin-induced uveitis (EIU) and laser ablation are used to study differences in the immune interaction.

      One of the major advances in this article is the development of the laser ablation model for 'mild' retinal damage as an alternative to the more severe neovascularization models. This model would potentially allow for controlling the size, depth and severity of the laser injury, opening interesting avenues for future study.

      The time-course, 2D and 3D spatial activation pattern of microglial activation are striking and provide an unprecedented view of the retinal response to mild injury.

      Weaknesses

      Generalization of the (lack of) neutrophil response to photoreceptor loss - there is ample evidence in the literature that neutrophils are heavily recruited in response to severe retinal damage that includes photoreceptor loss. Why the same was not observed here in this article remains an open question. One could hypothesize that neutrophil recruitment might indeed occur under conditions that are more in line with the more extreme damage models, for example, with a stronger and global ablation (substantially more photoreceptor loss over a larger area). This parameter space is unwieldy and too large to address the question conclusively in the current article, i.e. how much photoreceptor loss leads to neutrophil recruitment? By the same token, the strong and general conclusion in the title - Photoreceptor loss does not recruit neutrophils - cannot be made until an exhaustive exploration of the same parameter space is made. A scaling back may help here, to reflect the specific, mild form of laser damage explored here, for instance - Mild photoreceptor loss does not recruit neutrophils despite...

      EIU model - The EIU model was used as a positive control for neutrophil extravasation. Prior work with flow cytometry has shown a substantial increase in neutrophil counts in the EIU model. Yet, in all, the entire article shows exactly 2 examples in vivo and 3 ex vivo (Figure 7) of extravasated neutrophils from the EIU model (n = 2 mice). The general conclusion made about neutrophil recruitment (or lack thereof) is built partly upon this positive control experiment. But these limited examples, especially in the case where literature reports a preponderance of extravasated neutrophils, raise a question on the paradigm(s) used to evaluate this effect in the mild laser damage model.

      Overall, the strengths outweigh the weaknesses, provided the conclusions/interpretations are reconsidered.

    4. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #1 (Public review):

      Summary:

      The authors aimed to investigate the interaction between tissue-resident immune cells (microglia) and circulating systemic neutrophils in response to acute, focal retinal injury. They induced retinal lesions using 488 nm light to ablate photoreceptor (PR) outer segments, then utilized various imaging techniques (AOSLO, SLO, and OCT) to study the dynamics of fluorescent microglia and neutrophils in mice over time. Their findings revealed that while microglia showed a dynamic response and migrated to the injury site within a day, neutrophils were not recruited to the area despite being nearby. Post-mortem confocal microscopy confirmed these in vivo results. The study concluded that microglial activation does not recruit neutrophils in response to acute, focal photoreceptor loss, a scenario common in many retinal diseases.

      Strengths:

      The primary strength of this manuscript lies in the techniques employed.

      In this study, the authors utilized advanced Adaptive Optics Scanning Laser Ophthalmoscopy (AOSLO) to document immune cell interactions in the retina accurately. AOSLO's micron-level resolution and enhanced contrast, achieved through near-infrared (NIR) light and phase-contrast techniques, allowed visualization of individual immune cells without extrinsic dyes. This method combined confocal reflectance, phase-contrast, and fluorescence modalities to reveal various cell types simultaneously. Confocal AOSLO tracked cellular changes with less than 6 μm axial resolution, while phase-contrast AOSLO provided detailed views of vascular walls, blood cells, and immune cells. Fluorescence imaging enabled the study of labeled cells and dyes throughout the retina. These techniques, integrated with conventional histology and Optical Coherence Tomography (OCT), offered a comprehensive platform to visualize immune cell dynamics during retinal inflammation and injury.

      Thank you!

      Weaknesses:

      One significant weakness of the manuscript is the use of Cx3cr1GFP mice to specifically track GFP-expressing microglia. While this model is valuable for identifying resident phagocytic cells when the blood-retinal barrier (BRB) is intact, it is important to note that recruited macrophages also express the same marker following BRB breakdown. This overlap complicates the interpretation of results and makes it difficult to distinguish between the contributions of microglia and infiltrating macrophages, a point that is not addressed in the manuscript.

      We agree that greater emphasis is required that CX3CR1 mice exhibit fluorescence in not only microglia, but also other cells of macrophage origin including monocytes, perivascular macrophages and some hyalocytes.

      Through the advantages of in vivo AOSLO, however, we are able to establish that CX3CR1 cells are present within the tissue before the laser lesion is placed. This suggests they are tissue-resident. We agree that it is possible that at later time points (days-weeks), systemic macrophages and/or monocytes may participate. Lack of rolling/crawling cells suggests they are not systemic. We elaborate on this point in a new section in the discussion:

      P29 L534-541:

      “CX3CR1-GFP mice exhibit fluorescence not only in microglia

      We recognize that the CX3CR1-GFP model can also label systemic cells such as monocytes/macrophages77. While it is possible these cells could infiltrate the retina in response to the lesion, we find it unlikely since there was no indication of the leukocyte extravasation cascade (rolling/crawling/stalled cells) within the nearest retinal vasculature. In addition to microglia, retinal perivascular macrophages and hyalocytes also exhibit GFP fluorescence and thus that these cells may also contribute toward damage resolution.”

      Another major concern is the time point chosen for analyzing the neutrophil response. The authors assess neutrophil activity 24 hours after injury, which may be too late to capture the initial inflammatory response. This delayed assessment could overlook crucial early dynamics that occur shortly after injury, potentially impacting the overall findings and conclusions of the study.

      The power of in vivo imaging makes these early assessments possible. We have therefore taken the reviewer's concern on board and conducted an additional experiment that examines whether neutrophils are seen in the window of time between the lesion and 24 hours. In a newly examined mouse, we find that within 3.5 hours post-lesion, neutrophils do not extravasate adjacent to the lesion site (see new “figure 8 – figure supplement 1”).

      Also see accompanying video (new “figure 8 – video 3”) for an example of nearby neutrophils flowing through OPL capillaries just microns away from the lesion site. Neutrophils are clearly contained within the vasculature and exhibit dynamics consistent with healthy retinal tissue. While it remains possible that the lesion may increase leukocyte stalling within the nearest capillaries, we are unable to confirm or deny this with a single experiment. We now submit this evidence as a new supplementary figure following the reviewer’s suggestion.

      Reviewer #2 (Public review):

      Summary:

      This study uses in vivo multimodal high-resolution imaging to track how microglia and neutrophils respond to light-induced retinal injury from soon after injury to 2 months post-injury. The in vivo imaging finding was subsequently verified by an ex vivo study. The results suggest that despite the highly active microglia at the injury site, neutrophils were not recruited in response to acute light-induced retinal injury.

      Strengths:

      An extremely thorough examination of the cellular-level immune activity at the injury site. In vivo imaging observations being verified using ex vivo techniques is a strong plus.

      We appreciate this recognition and hope that the reviewer considers the weaknesses below in the context of the paper's identified strengths.

      Weaknesses:

      This paper is extremely long and, from the perspective of this reviewer, needs to be better organized.

      We agree and have taken the following steps to address this:

      (1) Paper has been shortened overall by 8%

      (2) We reorganized the following sections:

      a. Introduction: shortened

      b. Methods: merged section “Ex vivo confocal image processing” with “Ex vivo confocal imaging”.

      c. Results: most sections shortened, others simplified for concision

      d. Discussion: most sections shortened, removed “Microglial/neutrophil discrimination using label-free phase contrast”

      e. Figure references reorganized in order of their appearance.

      Study weakness: though the finding prompts more questions and future studies, the findings discussed in this paper are potentially important for us to understand how the immune cells respond differently to different severity levels of injury.

      On the heels of this burgeoning technology, we consider this report among the first studies of its kind. We are hopeful that it forms the foundation of many further investigations to come. We expect a rich parameter space to be explored with future studies including investigation of other time points, other injuries of varying degree and other immune cell populations (along with their interactions with each other). Each has the potential to reveal the complexities of the ocular immune system in action.

      Reviewer #3 (Public review):

      Summary:

      This work investigated the immune response in the murine retina after focal laser lesions. These lesions are made with close to 2 orders of magnitude lower laser power than the more prevalent choroidal neovascularization model of laser ablation. Histology and OCT together show that the laser insult is localized to the photoreceptors and spares the inner retina, the vasculature, and the pigment epithelium. As early as 1-day after injury, a loss of cell bodies in the outer nuclear layer is observed. This is accompanied by strong microglial proliferation at the site of injury in the outer retina where microglia do not typically reside. The injury did not seem to result in the extravasation of neutrophils from the capillary network constituting one of the main findings of the paper. The demonstrated paradigm of studying the immune response and potentially retinal remodeling in the future in vivo is valuable and would appeal to a broad audience in visual neuroscience. However, there are some issues with the conclusions drawn from the data and analysis that can be addressed to further bolster the manuscript.

      Strengths:

      Adaptive optics imaging of the murine retina is cutting edge and enables non-destructive visualization of fluorescently labeled cells in the milieu of retinal injury. As may be obvious, this in vivo approach is beneficial for studying fast and dynamic immune processes on a local time scale - minutes and hours, and also for the longer days-to-months follow-up of retinal remodeling as demonstrated in the article. In certain cases, the in vivo findings are corroborated with histology.

      Thank you!

      The analysis is sound and accompanied by stunning video and static imagery. A few different sets of mouse models are used, (a) two different mouse lines, each with a fluorescent tag for neutrophils and microglia, (b) two different models of inflammation - endotoxin-induced uveitis (EIU) and laser ablation are used to study differences in the immune interaction.

      Thank you!

      One of the major advances in this article is the development of the laser ablation model for 'mild' retinal damage as an alternative to the more severe neovascularization models. While not directly shown in the article, this model would potentially allow for controlling the size, depth, and severity of the laser injury, opening interesting avenues for future study.

      We agree that there is an established community that is invested in developing titrated dosimetry for light damage models. As the reviewer recognizes, this parameter space is exceptionally large; we therefore controlled it by choosing a single wavelength that is commonly used in ophthalmoscopy (488 nm), with a fixed duration and exposure regime that created reproducible, mild damage to photoreceptors. At this titration, we created a mild lesion that spares the retina above and below.

      Weaknesses:

      (1) It is unclear based on the current data/study to what extent the mild laser damage phenotype is generalizable to disease phenotypes. The outer nuclear cell loss of 28% and a complete recovery in 2 months would seem quite mild, thus the generalizability in terms of immune-mediated response in the face of retinal remodeling is not certain, specifically whether the key finding regarding the lack of neutrophil recruitment will be maintained with a stronger laser ablation.

      It seems the concern here is whether our finding is generalizable to other damage regimes, especially more severe ones. While speculative, we would suspect that it is not generalizable across different lesions of greater severity. For example, puncturing Bruch’s membrane is an example of a more severe phenotype that is often encountered in laser damage. However, this creates a complicated model that not only induces inflammation, but also compromises BRB integrity and promotes CNV. The parameter space to be tested in the reviewer’s question is quite vast, and we have therefore tried to summarize the generalizability within our manuscript in

      P31 L586-588 “There are limitations on how generalizable this mild damage to more severe damage or disease phenotypes, but this acute damage model can begin to provide clues about how immune cells interact in response to PR loss. In this laser lesion model, we ablate 27% of the PRs in a 50 µm region.”

      (2) Mice numbers and associated statistics are insufficient to draw strong conclusions in the paper on the activity of neutrophils, some examples are below:

      a) 2 catchup mice and 2 positive control EIU mice are used to draw inferences about immune-mediated activity in response to injury. If the goal was to show 'feasibility' of imaging these mouse models for the purposes of tracking specific cell type behavior, the case is sufficiently made and already published by the authors earlier. It is possible that a larger sample size would alter the conclusion.

      We would like to highlight that the total number of mice studied in this report was 28 (18 in-vivo imaging, 10 ex-vivo histology, >40 lesions total). While power analysis is challenging as these are the first studies of their kind, we underscore that in vivo imaging allows those same mice to be studied multiple times longitudinally. This is not possible with traditional histology. Therefore, in vivo imaging not only reveals the temporal progression (unlike histology), but also increases the number of observations beyond a simple count of the “number of mice”.

      The goal of the study was not one of feasibility. The goal was to address a specific question in ocular biology: “do resident CX3CR1 cells recruit neutrophils in early, regional retinal injury?”

      The low numbers that the reviewer points to are not the primary data of the paper but rather supportive control data. Moreover, we refocus attention on the fact that our study was performed on 28 mice across multiple modalities, each of which corroborates a common finding: neutrophils do not appear to be recruited despite a strong microglial response, a central finding of the paper.

      b) There are only 2 examples of extravasated neutrophils in the entire article, shown in the positive control EIU model. With the rare extravasation events of these cells and their high-speed motility, the chance of observing their exit from the vasculature is likely low overall, therefore the general conclusions made about their recruitment or lack thereof are not justified by these limited examples shown.

      The spirit of the challenge raised is that the fact that nothing was seen is not proof that nothing occurred. Said more commonly, “absence of evidence is not evidence of absence”, a quote often attributed to Carl Sagan. Yet we push back on this conjecture, as we have shown, not only with cutting-edge in vivo imaging but also with ample histological controls and multiple transgenic animals (and corroborating IHC antibodies), that in none of these imaging modalities, at none of the time points we evaluated, did neutrophils aggregate or extravasate in response to photoreceptor ablation.

      Reviewer adds: “the chance of observing their exit from the vasculature is likely low overall…”

      This is the reason that we specifically chose a focal lesion model to increase any possible chance of imaging a rare event. The focal lesion provides both a time and a location for “where” to look. Small 50 micrometer lesions were sufficient to drive a strong local microglial response (figures 5,6,9). This was evidence that local inflammatory cues were present. Yet despite this activation, neutrophils were not recruited to this location. We emphasize that this is a strength of our approach over other pan-retinal damage models that may indeed miss the rare extravasation events that are geographically sparse and happen over hours.

      c) In Figure 3, the 3-day time point post laser injury shows an 18% reduction in the density of ONL nuclei (p-value of 0.17 compared to baseline). In the case of neutrophils, it is noted that "Control locations (n = 2 mice, 4 z-stacks) had 15 ± 8 neutrophils per sq.mm of retina whereas lesioned locations (n = 2 mice, 4 z-stacks) had 23 ± 5 neutrophils per sq.mm of retina (Figure 10b). The difference between control and lesioned groups was not statistically significant (p = 0.19)." These data both come from histology. While the p-values - 0.17 and 0.19 - are similar, in the first case a reduction in ONL cell density is concluded while in the latter, no difference in neutrophil density is inferred in the lesioned case compared to control. Why is there a difference in the interpretation where the same statistical test and methodology are used in both cases? Besides this statistical nuance, is there an alternate possibility that there is an increased, albeit statistically insignificant, concentration of circulating neutrophils in the lesioned model? The increase is nearly 50% (15 ± 8 vs. 23 ± 5 neutrophils per sq.mm) and the reader may wonder if a larger animal number might skew the statistic towards significance.

      The statistics and p-values depend on the analysis strategy. As described in the methods, we counted cells within a predetermined 50 micron diameter cylinder, a circular window chosen to roughly approximate the average lesion size. However, recall that the damage is created along a single axis (a line projected on the retina); it is therefore possible that the analysis region is too generous to capture the exceptionally local damage.
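
      To make the arithmetic behind the reviewer's question concrete, the following is a minimal sketch that recomputes a two-sample statistic from the reported neutrophil densities alone. It rests on assumptions not stated in this exchange: that "±" denotes a standard deviation, that the four z-stacks per group are treated as independent samples, and that a Welch's t-test is an acceptable stand-in for whichever test was actually used. The function name `welch_t` is illustrative only.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom,
    computed from summary statistics (mean, SD, sample size) of two groups."""
    var_a = sd_a ** 2 / n_a  # squared standard error of group A
    var_b = sd_b ** 2 / n_b  # squared standard error of group B
    t = (mean_b - mean_a) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t, df


# Reported values: control 15 +/- 8, lesioned 23 +/- 5 neutrophils per sq.mm.
# ASSUMPTION: +/- is a standard deviation and n = 4 z-stacks per group.
t_stat, dof = welch_t(15, 8, 4, 23, 5, 4)
relative_increase = (23 - 15) / 15

print(f"t = {t_stat:.2f}, df = {dof:.1f}, increase = {relative_increase:.0%}")
```

      Under these assumptions the statistic is roughly 1.7 on about 5 degrees of freedom, below the two-sided 5% critical value of about 2.57, so a relative increase of roughly 50% can remain non-significant at this sample size. With SciPy available, `scipy.stats.ttest_ind_from_stats(..., equal_var=False)` performs the same calculation and also returns a p-value.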

      While the reviewer is focused on the nuance of statistics, we would like to refocus the conversation on our data, which show that very few neutrophils were observed at all (105 cells from 8 locations, P value reported). Missed in the above critique, however, is that all neutrophils were contained within capillaries (Fig 10). We found no examples of extravasated neutrophils. This is the major finding and is supported by both our in vivo and our ex vivo data.

      (2) The conclusions on the relative activity of neutrophils and microglia come from separate animals. The reader may wonder why simultaneous imaging of microglia and neutrophils is not shown in either the EAU mice or the fluorescently labeled catchup mice where the non-labeled cell type could possibly be imaged with phase-contrast as has been shown by the authors previously. One might suspect that the microglia dynamics are not substantially altered in these mice compared to the CX3CR1-GFP mice subjected to laser lesions, but for future applicability of this paradigm of in vivo imaging assessment of the laser damage model, including documenting the repeatability of the laser damage model and the immune cell behavior, acquiring these data in the same animals would be critical.

      A double fluorescent mouse (neutrophils and microglia) is a logical next step of this research. In fact, we have now crossed these transgenic mice and are studying this double-labeled mouse in a second manuscript in preparation. However, for this study, it was imperative that the fluorescent imaging light be kept at low levels so as not to contribute to or alter the lesion phenotype and accompanying immune response. Imaging two fluorescent channels to simultaneously view neutrophils and microglia in the same animal would therefore have required at least 2X the visible light exposure. The imaging light levels used in the current study were carefully examined in our previous publications so as not to create additional light damage (Joseph et al 2021).

      (3) Along the same lines as above, the phase contrast ONL images at time points from 3-day to 2-month post laser injury are not shown and the absence of this data is not addressed. The missing data pertain only to the in vivo imaging mouse model; the same time points were examined in histology, which adequately conveys the time-course of cell loss in the ONL.

      The ocular preparation used for the phase contrast data in figure 2 unfortunately developed an anesthesia-induced cataract that precluded adequate image quality. This is not uncommon in long-term mouse ocular imaging preparations (Feng et al 2023). Instead, we chose to include the phase-contrast data for baseline and 1 day, which show the visually compelling difference between intact and disrupted ONL and demonstrate that the damage is not only focal but also clearly disrupts the somatic layers of the photoreceptors.

      It is suggested that the reason be elaborated for the exclusion of this data and the simultaneous imaging of microglia and neutrophils mentioned above.

      We agree and we have included the reason for the “not acquired” data within the figure 2 legend:

      “Phase contrast data was not acquired for time points 3 days-2 months due to development of cataract which obscured the phase contrast signal”

      Also, it would be valuable to further qualify and check the claims in the Discussion that "ex vivo analysis confirms in vivo findings" and "Microglial/neutrophil discrimination using label-free phase contrast"

      We maintain that ex vivo analysis both corroborates and, in many cases, confirms our in vivo findings. We feel this is a strength of our manuscript rather than a qualifier. A) Damage localization is visible with OCT and confocal/phase contrast AOSLO in a region that matches the DAPI loss we see ex vivo. B) Disruption of the ONL seen with in vivo AOSLO is of the same size, shape and location as the ONL damage quantified ex vivo. C) No damage or disruption was seen in locations above the lesion with OCT or AOSLO, which matches our finding that only the ONL shows loss of nuclei whereas other more superficial layers are spared. D) Microglial localization is found both in vivo and ex vivo. E) Neutrophil aggregation or extravasation was seen neither in vivo nor ex vivo. Given the evidence above, we contend that this complementary approach corroborates the experimental data through two independent ways of studying this tissue.

      We agree that the claims made in the section entitled “Microglial/neutrophil discrimination using label-free phase contrast” are not strongly supported by the phase-contrast imaging presented in this paper. Accordingly, we have since removed this section based on reviewer suggestion.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Based on the title and abstract, the main focus of the manuscript appears to be the immune response. However, most of the manuscript is dedicated to the authors' imaging technique. Additionally, several important concerns regarding the investigation of the immune response in the retina need to be addressed.

      We understand that the emphasis may appear to be on the imaging technique. However, because AOSLO is not a widely used technology, we are committed to explaining the technique in a way that builds both awareness of and confidence in how these new data are acquired.

      (2) The authors indicate '1 day post-injury' as a timeframe spanning between 18 and 28 hours post-injury. This is a rather wide window of time, which could potentially affect the analysis. It is necessary to demonstrate that there is no significant difference in the immune response, particularly in terms of microglial morphology and branch orientation, between 18 and 28 hours post-injury.

      We agree that a finer time scale may offer even greater insight into the natural history of the inflammatory response. However, we feel that our chosen time points go above and beyond the temporal precision offered by other investigations, especially considering the novel multi-modal imaging performed here. Finer temporal sampling is well suited to future investigation.

      (3) The authors should consider using additional markers or complementary techniques to differentiate between microglia and recruited macrophages, such as incorporating immunohistochemistry with P2RY12, a specific marker for microglia that helps distinguish them from macrophages, and CD68 or F4/80, markers for recruited macrophages. It is also crucial for the authors to include a discussion addressing the limitations of using Cx3cr1GFP mice and the potential impact on result interpretation. It is fundamental to validate the findings and clarify the roles of microglia and macrophages.

      The wonder of current IHC is that there are myriad antibodies and labels that “could” be used. We used those we felt were the most compelling for this stage of early investigation. We look forward to studies that employ this wider range of labels. See our response to Reviewer #1’s first comment above, which addresses the limitations of using Cx3CR1 mice.

      (4) Analyzing neutrophil responses at 24 hours post-injury may be too late to capture the critical early dynamics of inflammation. By this time, the initial recruitment and activation phases of neutrophils may have already peaked or begun to resolve, potentially missing key insights into the immediate immune response. The authors should conduct additional analysis of neutrophil responses at earlier time points post-injury, such as 6 or 12 hours. Including these time points would provide a more comprehensive and conclusive analysis of the neutrophil response, helping to delineate the progression of inflammation and its implications for subsequent healing processes.

      This point has been addressed above. Briefly, we have now included a new experiment (and figure + video) that shows no neutrophil extravasation at earlier time points. We thank the reviewer for this helpful suggestion.

      Reviewer #2 (Recommendations for the authors):

      This paper is extremely long, and in the perspective of this reviewer, needs to be better organized.

      (1) There was a lengthy description and verification of light-induced injury and longitudinal tracking of healing, which I believe can be further cleaned up and made more succinct.

      We have cleaned up and reorganized the manuscript, reducing its length by 8% (see the response above for details).

      (2) The intention/goal of the paper can be further strengthened. On page 33: "to what extent do neutrophils respond to acute neural loss in the retina?" This particular statement is so clear and really brings out the purpose of this study, and it will be great to see something like this in the opening statement.

      We thank the reviewer for this excellent suggestion. We have modified the final paragraph of the introduction to strengthen our study’s intention.

      P4 L45-47: Here, we ask the question: “To what extent do microglia/neutrophils respond to acute neural loss in the retina?” To begin unraveling the complexities in this response, we deploy a deep retinal laser ablation model.

      (3) The figures are not mentioned in the manuscript in the order they were numbered. It makes it extremely challenging to follow along. The methods/results sections started with Figure 1, then on to Figure 4, then back to Figures 2 and 3, etc. This reviewer recommends re-organizing figures and their order of appearance so the contents of the figures are referred to in the paragraph in the most efficient and clear manner.

      We have re-organized the appearance of figure references throughout the paper.

      (4) Figure 2: phase contrast was not acquired on days 3, 7, and 2 months. Please briefly explain the reason in the caption.

      Addressed above.

      (5) Figure 4 OPL layer, the area highlighted in a dashed circle was meant to demonstrate that perfusion was intact, but I cannot see the flow in the highlighted area very well at day 7 and 2 months (especially 2 months). Please explain.

      Perfusion maps are often difficult to interpret as a static image. Therefore, we have additionally provided the raw video data (“OPL_vasculature_7d” and “OPL_vasculature_2mo”) which helps visualize active perfusion. To the reviewer’s point, videos reveal that RBC motion is maintained in the capillaries of this location.

      (6) While there's a thorough discussion of the biological impact of the finding, the uniqueness of the imaging technique can be better highlighted. Immune response toward injury is highly dynamic and is often the first step of wound healing. To observe such dynamic events longitudinally in the living eye at the cellular level, it requires a special imaging technique such as the type addressed here. The author can better address the technical uniqueness of studying this type of biological event for readers less familiar with AOSLO.

      We agree and following the reviewer’s suggestion have further emphasized the advance in the current manuscript in two additional places:

      (1) Within the introduction

      P3-4 L21-42: “A missed window of interaction is highly problematic in histological study where a single time point reveals a snapshot of the temporally complex immune response, which changes dynamically over time. Here, we use in vivo imaging to overcome these constraints.

      Documenting immune cell interactions in the retina over time has been challenged by insufficient resolution and contrast to visualize single cells in the living eye. The microscopic size of immune cells requires exceptional resolution for detection. Recently, advances in AOSLO imaging have provided micron-level resolution and enhanced contrast for imaging individual immune cells in the retina and without requiring extrinsic dyes(7,23). AOSLO provides multi-modal information from confocal reflectance, phase-contrast and fluorescence modalities, which can reveal a variety of cell types simultaneously in the living eye. Here, we used confocal AOSLO to track changes in reflectance at cellular scale. Phase-contrast AOSLO provides detail on highly translucent retinal structures such as vascular wall, single blood cells(27–29), PR somata(30), and is well-suited to image resident and systemic immune cells.(7,23) Fluorescence AOSLO provides the ability to study fluorescently-labeled cells(25,31,32) and exogenous dyes(27,33) throughout the living retina. These modalities used in combination have recently provided detailed images of the retinal response to a model of human uveitis.(23,34) Together, these innovations now provide a platform to visualize, for the first time, the dynamic interplay between many immune cell types, each with a unique role in tissue inflammation.”

      (2) Within the discussion

      P34-35 L656-662 “Beyond the context of this specific finding, we share this work with the excitement that AOSLO cellular level imaging may reveal the interaction of multiple immune cell types in the living retina. By using fluorophores associated with specific immune cell populations, the complex dynamics that orchestrate the immune response may be examined in this specialized tissue. This work and future studies may reveal further insights to the interactions of single immune cells in the living body in a non-invasive way.”

      Reviewer #3 (Recommendations for the authors):

      Some other comments:

      (1) The reader may wonder why if all findings are confirmed by histology would an in vivo imaging model be needed. This does not need a generalized explanation given the typical virtues of an in vivo model, but perhaps the authors may want to amplify their findings in the current context, for example, those on the shorter minutes to hours timescales (Figure 2, Supplement 1) that would have been resource and time intensive, and likely impossible, to gather via histology alone.

      The reviewer appropriately underscores the utility of in vivo imaging over histology-only investigation. In response, we have added text in the introduction to emphasize the nuanced but important value of both longitudinal imaging and dynamic imaging, which is not possible with conventional histology (e.g. blood perfusion status, immune cell interactions, etc.).

      P3-4 L21-42 (these points also addressed in response to reviewer #2 above)

      (2) A few questions and comments on the laser ablation model

      - It is alluded to in the Discussion in Lines 519-521 that the procedure is highly reproducible (95%) but the associated data for this repeatability metric is not shown.

      We agree that the criterion for determining a “successful lesion” requires further elaboration. Therefore, we have now included the criteria for successful lesions in the methods as well as discussion (in bullet below):

      Methods:

      P9-10 L129-133: “This protocol produced a hyper-reflective phenotype in the >40 locations across 28 mice. In rare cases, the exposure yielded no hyper-reflective lesion and were often in mice with high retinal motion, where the light dosage was spread over a larger retinal area. These locations were not included in the in-vivo or histological analysis.”

      - The methods state that a 24 x 1-micron line is focused on the retina, but all lesions seem to appear elliptical where the major to minor axis ratio is a lot smaller than this intended size. One wonders what leads to this discrepancy.

      We expect that this observation is related to the response above. We have added the following:

      Discussion:

      P27 L497-505: “The damage took on an elliptical form, likely due to: 1) Eye motion from respiration and heart rate which spreads the light over a larger integrative area (rather than line). 2) The impact of focal light scatter. 3) A micron-thin line imparting damage on cells that are many microns across manifesting as an ellipse. The majority of light exposures produced lesions of this elliptical shape. In a few conditions, for the reasons described above, the exposure failed to produce a strong, focal damage phenotype. To improve lesion reproducibility, future experiments should control for subtle eye motion affecting light damage, especially for long exposures.”

      (3) Lastly, a thickening is noted in the ONL after laser injury that seems to cause a thinning of the INL as well (Figure 3) which may increase the apparent INL nuclei density.

      The reviewer’s careful eye finds local swelling after injury. However, despite swelling, the segregation between INL and ONL was maintained in all days we examined. Thus, no ONL cells were included in INL counts (see figure 3A & 3D).

      Also, the ONL - inner (panel B) seems to show a little reduction in cell density in the same elliptical shape as the outer ONL in panel C.

      We agree with this observation, and it was one of the reasons we included this detailed analysis of both the inner and outer halves of the ONL. Our finding is that there is more prominent loss of nuclei in the outer half of the ONL. While the mechanism for this is not understood, we felt it was an important finding to include, and it further shows the axial specificity of the light damage we are inducing (especially at the day 1 observation).

      Lastly, the reduction in nuclear density is visually obvious in the ONL at the 1 and 3-day time points but the p-statistic does not seem to convey this. One may consider performing the analysis on panel F on a smaller region surrounding the lesion to more reliably reveal these effects.

      Related to the response above, the ONL shows a persistence of nuclei in the inner half of that layer, whereas the outer half shows a visible reduction. Therefore, we expect that the reviewer is correct that a statistical analysis that considers just the outer half of the ONL would likely show strong statistical significance. The challenge, however, is that our analysis strategy counted all cells within a 50 micron diameter cylinder through the entirety of the ONL (meaning strong loss in the outer half was attenuated by weak loss in the inner half). A more detailed sub-layer analysis is difficult because the notable retinal remodeling over days to weeks makes it hard to use layers within the ONL as reliable landmarks for the requested analysis.

      (4) In Figure 6, the NIR confocal image and fluorescent microglia seem to share the same shape, starting from the OPL and posterior to it. This is particularly evident in the 3 and 7-day time points in the ONL and ONL/IS images. This departs from lines 567-577 where the claim is made that the hyperreflective phenotype in NIR images does not emerge from the microglia and neutrophils. This discrepancy should be clarified. It may be so that the hyperreflective phenotype as observed by Figure 2 at shorter timescales is not related to the microglia but the locus of hyper-reflections changes at longer time scales to involve the microglia as well as in Figure 6. One potential clue/speculation of the common shapes/size in confocal hyper-reflectance and fluorescent microglia of Figure 6 comes from Figure 9 where the microglia seem to engulf the photoreceptor phagosomes in the DAPI stains. It is possible that the hyper-reflections arise from the phagosomes but their co-localization with microglia seems to demonstrate a shared size/shape. As an addendum to the first point, such correlations are a power of the in vivo model and impossible to achieve in histology.

      The reviewer shows a deep understanding of our data. We agree with many of the points, but for the purpose of the paper many of the above suggestions are speculative, and we have chosen not to elaborate on them because the data do not settle them definitively. Instead, we direct the reader to an important finding: within hours, the hyper-reflective phenotype is seen in both OCT and AOSLO, whereas microglial somas/processes have not yet migrated into the hyper-reflective region. We have now emphasized this point in the discussion section:

      P29-30 L543-552: “A common speculation is that the increased backscatter may arise from local inflammatory cells that activate or move into the damage location. In our data, confocal AOSLO and OCT revealed a hyperreflective band at the OPL and ONL after 488 nm light exposure (Figure 2a, b). We found that the hyperreflective bands appeared within 30 minutes after the laser injury, preceding any detectable microglial migration toward the damage location (Figure 2 – figure supplement 1 and Figure 6 – figure supplement 1). We thus conclude that the initial hyperreflective phenotype is not caused by microglial cell activity or aggregation.”

    1. eLife Assessment

      This important work presents a self-supervised method for the segmentation of 3D cells in fluorescent microscopy images, conveniently packaged as a Napari plugin and tested on an annotated dataset. The segmentation method is solid and compares favorably to other learning-based methods and Otsu thresholding on four datasets, offering the possibility of eliminating time-consuming data labeling to speed up quantitative analysis. This work will be of interest to a wide variety of laboratories analysing fluorescently labeled images.

    2. Reviewer #1 (Public review):

      The manuscript now compares the WNet3D quantitatively against other methods on all four datasets:

      Figure 1b shows results on the mouse cortex dataset, comparing StarDist, CellPose, SegResNet, SwinUNetR against the self-supervised (or learning-free) methods WNet3D and Otsu thresholding.

      Figure 2b shows results on an unnamed dataset (presumably the mouse cortex dataset), comparing StarDist, CellPose, SegResNet, SwinUNetR with different levels of training data against WNet3D.

      Figure 3 shows results on three datasets (Platynereis-ISH-Nuclei-CBG, Platynereis-Nuclei-CBG, and Mouse-Skull-Nuclei-CBG), comparing StarDist, CellPose against WNet3D and Otsu thresholding.

      It is unclear whether the Otsu thresholding baseline was given the same post-processing as the WNet3D. Figure 1b shows two versions for WNet3D ("WNet3D - No artifacts" and "WNet3D"), but only one for Otsu thresholding. Given that post-processing (or artifact removal) seems to have a substantial impact on accuracy, the authors should clarify whether the Otsu thresholding results were treated in the same way or were left without post-processing. Figure 2a would also benefit from including the thresholding results (with and without artifact removal).
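
      To illustrate why the same artifact removal can be applied to any binary mask regardless of how it was produced, here is a minimal, dependency-free sketch of size-based filtering on a 3D mask. It is not the plugin's implementation: the function name `label_and_filter`, the 6-connectivity, and the `min_size`/`max_size` values are assumptions chosen for illustration, following Reviewer #2's description of the step as removing small and very large objects.

```python
from collections import deque


def label_and_filter(mask, min_size, max_size):
    """Group foreground voxels of a 3D binary mask (nested lists indexed
    [z][y][x]) into 6-connected components and keep only components whose
    voxel count lies within [min_size, max_size]."""
    nz, ny, nx = len(mask), len(mask[0]), len(mask[0][0])
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    seen = set()
    kept = []
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if not mask[z][y][x] or (z, y, x) in seen:
                    continue
                # Breadth-first flood fill from an unvisited foreground voxel.
                queue = deque([(z, y, x)])
                seen.add((z, y, x))
                component = []
                while queue:
                    cz, cy, cx = queue.popleft()
                    component.append((cz, cy, cx))
                    for dz, dy, dx in steps:
                        n = (cz + dz, cy + dy, cx + dx)
                        inside = 0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
                        if inside and mask[n[0]][n[1]][n[2]] and n not in seen:
                            seen.add(n)
                            queue.append(n)
                # Size filter: drop components that are too small or too large.
                if min_size <= len(component) <= max_size:
                    kept.append(component)
    return kept


# Toy single-slice volume: a 4-voxel object, a 1-voxel speck, a 6-voxel blob.
toy_mask = [[
    [1, 1, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]]
instances = label_and_filter(toy_mask, min_size=2, max_size=5)
print(len(instances), [len(c) for c in instances])
```

      Because the function only sees a binary mask, it would treat a thresholded volume and a network's semantic output identically, which is the comparison requested here. On real volumes, `scipy.ndimage.label` is a faster way to obtain the components.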

    3. Reviewer #2 (Public review):

      The authors have now addressed the most important points, and they include more comprehensive evaluation of their method and comparisons to other approaches for multiple datasets.

      Some points would benefit from clarification:

      - Figure 1B now compares "Otsu thresholding", "WNet 3D - No artifacts" and "WNet 3d". Why don't you also report the score for "Otsu thresholding - No Artifacts"? To my understanding this is a post-processing operation to remove small and very large objects, so it could easily be applied to the Otsu thresholding. Given the good results for Otsu thresholding alone (quite close F1-score to WNet 3d), it seems like DL might not really be necessary at all for this dataset and including "Otsu thresholding - No artifacts" would enable evaluating this point.

      - CellPose and StarDist perform poorly in all the experiments performed by the authors. In almost all cases they underperform Otsu thresholding, which is in most cases on par with the WNet results (except for "Mouse Skull Nuclei CBG"). This is surprising and contradicts the collective expertise of the community: good CellPose and StarDist models can be trained for the 3D instance segmentation tasks studied here. Perhaps these methods were not trained in an optimal way. It seems unlikely that much better CellPose or StarDist models cannot be obtained for these tasks (current versions are on par or much worse than Otsu!), as I have applied both of these models successfully in similar settings. Specifically, it seems unlikely that the developers of CellPose or StarDist would obtain similarly poor scores on the same data (note I am not one of the developers).

      The current experiments still highlight an interesting aspect: the problem of training / fine-tuning these methods correctly on new data and the technical challenges associated with this. But the reported results should by no means be taken as a fair assessment of the capabilities of StarDist or CellPose.

      Please note that I did not have time to test the Napari plugin again, so I did not evaluate whether it improved in usability.

    4. Author response:

      The following is the authors’ response to the previous reviews

      eLife Assessment

      This work presents a valuable self-supervised method for the segmentation of 3D cells in microscopy images, alongside an implementation as a Napari plugin and an annotated dataset. While the Napari plugin is readily applicable and promises to eliminate time consuming data labeling to speed up quantitative analysis, there is incomplete evidence to support the claim that the segmentation method generalizes to other light-sheet microscopy image datasets beyond the two specific ones used here.

      Technical Note: We showed the utility of CellSeg3D in the first submission and in our revision on 5 distinct datasets; 4 of which we showed F1-Score performance on. We do not know which “two datasets” are referenced. We also already showed this is not limited to LSM, but was used on confocal images; we already limited our scope and changed the title in the last rebuttal, but just so it’s clear, we also benchmark on two non-LSM datasets.

      In this revision, we have now additionally extended our benchmarking of Cellpose and StarDist to all 4 benchmark datasets, where our WNet3D (our novel contribution of a self-supervised model) outperforms or matches these supervised baselines. Moreover, we perform rigorous testing of our model’s generalization by training on one dataset and testing generalization to the other 3; we believe this is on par with (or beyond) what most cell segmentation papers do, thus we hope that “incomplete” can now be updated.
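
      For readers less familiar with how an F1-score is obtained for instance segmentation, the sketch below shows one common definition: a predicted instance counts as a true positive when it overlaps a ground-truth instance with an intersection-over-union (IoU) at or above a threshold. This is an illustration only; the IoU threshold of 0.5, the greedy matching, and the function name `f1_at_iou` are assumptions, and the exact metric used in the manuscript may differ.

```python
def f1_at_iou(gt_instances, pred_instances, iou_threshold=0.5):
    """F1-score for instance segmentation. Each instance is a set of voxel
    coordinates; a ground-truth instance is matched to at most one unused
    prediction whose IoU with it is >= iou_threshold (greedy matching)."""
    used = set()
    true_pos = 0
    for gt in gt_instances:
        for idx, pred in enumerate(pred_instances):
            if idx in used:
                continue
            intersection = len(gt & pred)
            if intersection == 0:
                continue
            iou = intersection / len(gt | pred)
            if iou >= iou_threshold:
                used.add(idx)
                true_pos += 1
                break
    false_pos = len(pred_instances) - true_pos  # unmatched predictions
    false_neg = len(gt_instances) - true_pos    # missed ground-truth cells
    denom = 2 * true_pos + false_pos + false_neg
    return 2 * true_pos / denom if denom else 1.0


# Toy example: one cell found (IoU = 2/3), one cell missed, one spurious object.
gt_cells = [{(0, 0, 0), (0, 0, 1)}, {(5, 5, 5)}]
pred_cells = [{(0, 0, 0), (0, 0, 1), (0, 0, 2)}, {(9, 9, 9)}]
score = f1_at_iou(gt_cells, pred_cells)
print(score)
```

      In this toy case there is one true positive, one false positive and one false negative, giving F1 = 2·1 / (2·1 + 1 + 1) = 0.5.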

      Public Reviews:

      Reviewer #1 (Public review):

      This work presents a self-supervised method for the segmentation of 3D cells in microscopy images, an annotated dataset, as well as a napari plugin. While the napari plugin is potentially useful, there is insufficient evidence in the manuscript to support the claim that the proposed method is able to segment cells in other light-sheet microscopy image datasets than the two specific ones used here.

      Thank you again for your time. We had already benchmarked the performance of WNet3D (our 3D SSL contribution) on four datasets; thus, we do not know which two you refer to. Moreover, we have now additionally benchmarked Cellpose and StarDist on all four so readers can see that on all datasets, WNet3D outperforms or matches these supervised methods.

      I acknowledge that the revision is now more upfront about the scope of this work. However, my main point still stands: even with the slight modifications to the title, this paper suggests to present a general method for self-supervised 3D cell segmentation in light-sheet microscopy data. This claim is simply not backed up.

      We respectfully disagree; we benchmark on four 3D datasets: three curated by others and used in leading ML conference proceedings, and one that we provide, a new ground truth 3D dataset (the first of its kind) on mesoSPIM-acquired brain data. We believe benchmarking on four datasets is on par with (or beyond) current best practices in the field. For example, Cellpose curated one dataset and tested on held-out test data from this one dataset (https://www.nature.com/articles/s41592-020-01018-x) and benchmarked against StarDist and Mask R-CNN (two models). StarDist (Star-convex Polyhedra for 3D Object Detection and Segmentation in Microscopy) benchmarked on two datasets and against two models, IFT-Watershed and 3D U-Net. Thus, we feel our benchmarking on more models and more datasets is sufficient to claim our model and associated code are of interest to readers and to support our claims (for comparison, Cellpose’s title is “Cellpose: a generalist algorithm for cellular segmentation”, which is much broader than our claim).

      I still think the authors should spell out the assumptions that underlie their method early on (cells need to be well separated and clearly distinguishable from background). A subordinate clause like "often in cleared neural tissue" does not serve this purpose. First, it implies that the method is also suitable for non-cleared tissue (which would have to be shown). Second, this statement does not convey the crucial assumptions of well separated cells and clear foreground/background differences that the method is presumably relying on.

      We have now expanded the manuscript quite significantly. To be clear, we did show our method works on non-cleared tissue; the Mouse Skull, 3D platynereis-Nuclei, and 3D platynereis-ISH-Nuclei datasets are not cleared tissue, and they were not all acquired with LSM, but rather with confocal microscopy. We attempted to make that clearer in the main text.

      Additionally, we do not believe cells need to be well separated or the background perfectly clean. We removed statements like "often in cleared neural tissue", expanded the benchmarking, and added a new demo figure for the readers to judge. As in the last rebuttal, we provide video evidence (https://www.youtube.com/watch?v=U2a9IbiO7nE) of the WNet3D working on the Mouse Skull dataset, which is densely packed and hard for a human to segment, and we linked this directly in the figure caption.

      We have re-written the main manuscript in an attempt to clarify the limitations, including a dedicated “limitations” section. Thank you for the suggestion.

      It does appear that the proposed method works very well on the two investigated datasets, compared to other pre-trained or fine-tuned models. However, it still remains unclear whether this is because of the proposed method or the properties of those specific datasets (namely: well isolated cells that are easily distinguished from the background). I disagree with the authors that a comparison to non-learning methods "is unnecessary and beyond the scope of this work". In my opinion, this is exactly what is needed to prove that CellSeg3D's performance cannot be matched with simple image processing.

      We want to stress again that we benchmarked WNet3D on four datasets, not two. We have now additionally added benchmarking with Cellpose, StarDist and a non-deep learning method as requested (see new Figures 1 and 3).

      As I mentioned in the original review, it appears that thresholding followed by connected component analysis already produces competitive segmentations. I am confused about the authors' reply stating that "[this] is not the case, as all the other leading methods we fairly benchmark cannot solve the task without deep learning". The methods against which CellSeg3D is compared are CellPose and StarDist, both are deep-learning based methods.

      That those methods do not perform well on this dataset does not imply that a simpler method (like thresholding) would not lead to competitive results. Again, I strongly suggest the authors include a simple, non-learning based baseline method in their analysis, e.g.:

      -  comparison to thresholding (with the same post-processing as the proposed method)

      -  comparison to a normalized cut segmentation (with the same post-processing as the proposed method)

      We added a non-deep-learning-based approach, namely, comparing directly to thresholding with the same post hoc approach we use to go from semantic to instance segmentation. WNet3D (and other deep learning approaches) perform favorably (see Figures 2 and 3).
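
      The non-learning baseline discussed above (thresholding followed by connected-component labeling) can be sketched as follows. This is only a minimal illustration, not the CellSeg3D post-processing: the function name `threshold_and_label`, the fixed threshold, and the choice of 6-connectivity are assumptions made for the example.

```python
from collections import deque

import numpy as np


def threshold_and_label(volume, threshold):
    """Binarize a 3D volume, then label 6-connected foreground components.

    Hypothetical helper for illustration; not part of CellSeg3D.
    """
    foreground = volume > threshold
    labels = np.zeros(volume.shape, dtype=np.int32)
    # Face-sharing (6-connected) neighbours in 3D.
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    n_objects = 0
    for seed in zip(*np.nonzero(foreground)):
        if labels[seed]:
            continue  # voxel already belongs to a labelled object
        n_objects += 1
        labels[seed] = n_objects
        queue = deque([seed])
        while queue:  # breadth-first flood fill from the seed voxel
            z, y, x = queue.popleft()
            for dz, dy, dx in offsets:
                nb = (z + dz, y + dy, x + dx)
                inside = all(0 <= c < s for c, s in zip(nb, volume.shape))
                if inside and foreground[nb] and not labels[nb]:
                    labels[nb] = n_objects
                    queue.append(nb)
    return labels, n_objects


# Toy volume with two well-separated cubes standing in for nuclei.
volume = np.zeros((5, 5, 5))
volume[0:2, 0:2, 0:2] = 1.0
volume[3:5, 3:5, 3:5] = 1.0
labels, n_objects = threshold_and_label(volume, threshold=0.5)
```

      On touching or crowded objects, plain connected components would merge neighbours into a single label, which is the failure case the reviewers raise.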

      Regarding my feedback about the napari plugin, I apologize if I was not clear. The plugin "works" as far as I tested it (i.e., it can be installed and used without errors). However, I was not able to recreate a segmentation on the provided dataset using the plugin alone (see my comments in the original review). I used the current master as available at the time of the original review and default settings in the plugin.

      We updated the plugin and code for the revision at your request to make this possible directly in the napari GUI, in addition to our scripts and Jupyter Notebooks (please see main and/or `pip install --upgrade napari-cellseg3d`; the current version is 0.2.1). Of course, this means the original submission code (May 2024) does not have this in the GUI, so you would need to update in order to test it. Alternatively, you can see the demo video we now provide for ease: https://www.youtube.com/watch?v=U2a9IbiO7nE (we understand testing code takes a lot of time and commitment).

      We greatly thank the reviewer for their time, and we hope our clarifications, new benchmarking, and re-write of the paper now enable them to change their assessment from incomplete to a more favorable and reflective eLife adjective.

      Reviewer #2 (Public review):

      Summary:

      The authors propose a new method for self-supervised learning of 3d semantic segmentation for fluorescence microscopy. It is based on a WNet architecture (Encoder / Decoder using a UNet for each of these components) that reconstructs the image data after binarization in the bottleneck with a soft n-cuts clustering. They annotate a new dataset for nucleus segmentation in mesoSPIM imaging and train their model on this dataset. They create a napari plugin that provides access to this model and provides additional functionality for training of own models (both supervised and self-supervised), data labeling and instance segmentation via post-processing of the semantic model predictions. This plugin also provides access to models trained on the contributed dataset in a supervised fashion.

      Strengths:

      -  The idea behind the self-supervised learning loss is interesting.

      -  It provides a new annotated dataset for an important segmentation problem.

      -  The paper addresses an important challenge. Data annotation is very time-consuming for 3d microscopy data, so a self-supervised method that yields similar results to supervised segmentation would provide massive benefits.

      -  The comparison to other methods on the provided dataset is extensive and experiments are reproducible via public notebooks.

      Weaknesses:

      The experiments presented by the authors support the core claims made in the paper. However, they do not convincingly prove that the method is applicable to segmentation problems with more complex morphologies or more crowded cells/nuclei.

      Major weaknesses:

      (1) The method only provides functionality for semantic segmentation outputs and instance segmentation is obtained by morphological post-processing. This approach is well known to be of limited use for segmentation of crowded objects with complex morphology. This is the main reason for prediction of additional channels such as in StarDist or CellPose. The experiments do not convincingly show that this limitation can be overcome as model comparisons are only done on a single dataset with well separated nuclei with simple morphology. Note that the method and dataset are still a valuable contribution with this limitation, which is somewhat addressed in the conclusion. However, I find that the presentation is still too favorable in terms of the presentation of practical applications of the method, see next points for details.

      Thank you for noting the method's strengths and core features. Regarding weaknesses, we have revised the manuscript again and added direct benchmarking, now on four datasets, and a fifth “worked example” (https://www.youtube.com/watch?v=3UOvvpKxEAo&t=4s) in a new Figure 4.

      We also re-wrote the paper to more thoroughly present the work (previously we adhered to the “Brief Communication” eLife format), and added an explicit note in the results about model assumptions.

      (2) The experimental set-up for the additional datasets seems to be unrealistic as hyperparameters for instance segmentation are derived from a grid search and it is unclear how a new user could find good parameters in the plugin without having access to already annotated ground-truth data or an extensive knowledge of the underlying implementations.

      We agree that, with any self-supervised method, the user will of course need a sense of what a good outcome looks like; that is why we provide Google Colab Notebooks (https://github.com/AdaptiveMotorControlLab/CellSeg3D/tree/main/notebooks) and the napari-plugin GUI for extensive visualization and even the ability to manually correct small subsets of the data and refine the WNet3D model.

      We attempted to make this clearer with a new Figure 2 and with additional functionality added directly to the plugin (such as the grid search). However, we believe this “trade-off” for SSL approaches over very labor-intensive 3D labeling is often worth it; annotators are also biased, so extensive checking of any GT data is equally required.

      We also added the “grid search” functionality in the GUI (please `pip install --upgrade napari-cellseg3d`; the latest is v0.2.1) to supplement the previously shared Notebook (https://github.com/C-Achard/cellseg3d-figures/blob/main/thresholds_opti/find_best_thresholds.ipynb) and added a new YouTube video: https://www.youtube.com/watch?v=xYbYqL1KDYE.

      (3) Obtaining segmentation results of similar quality as reported in the experiments within the napari plugin was not possible for me. I tried this on the "MouseSkull" dataset that was also used for the additional results in the paper.

      Again, we are sorry this did not work for you. We added new functionality in the GUI and made a demo video (https://www.youtube.com/watch?v=U2a9IbiO7nE); you can either update your CellSeg3D code or watch the video to see how we obtained these results.

      Here, I could not find settings in the "Utilities->Convert to instance labels" widget that yielded good segmentation quality and it is unclear to me how a new user could find good parameter settings. In more detail, I cannot use the "Voronoi-Otsu" method due to installation issues that are prohibitive for a non expert user and the "Watershed" segmentation method yields a strong oversegmentation.

      Sorry to hear of the installation issue with Voronoi-Otsu; we updated the documentation and the GUI to hopefully make this easier to install. While we do not claim this code is for beginners, we do aim to be a welcoming community, thus we provide support on GitHub, extensive docs, videos, the GUI, and Google Colab Notebooks to help users get started.

      Comments on revised version

      Many of my comments were addressed well:

      -  It is now clear that the results are reproducible as they are well documented in the provided notebooks, which are now much more prominently referenced in the text.

      Thanks!

      -  My concerns about an unfair evaluation compared to CellPose and StarDist were addressed. It is now clear that the experiments on the mesoSPIM dataset are extensive and give an adequate comparison of the methods.

      Thank you. Note that we additionally added benchmarking of Cellpose and StarDist on the three additional datasets (for R1); hopefully this also serves to increase your confidence in our approach.

      -  Several other minor points like reporting of the evaluation metric are addressed.

      I have changed my assessment of the experimental evidence to incomplete/solid and updated the review accordingly. Note that some of my main concerns with the usability of the method for segmentation tasks with more complex morphology / more crowded cells and with the napari plugin still persist. The main points are (also mentioned in Weaknesses, but here with reference to the rebuttal letter):

      - Method comparisons on datasets with more complex morphology etc. are missing. I disagree that it is enough to do this on one dataset for a good method comparison.

      We benchmarked WNet3D (our contribution) on four datasets, and to aid the readers we additionally now added Cellpose and StarDist benchmarking on all four. WNet3D performs favorably, even on the crowded and complex Mouse Skull data. See the new Figure 3 as well as the associated video: https://www.youtube.com/watch?v=U2a9IbiO7nE&t=1s.

      -  The current presentation still implies that CellSeg3d **and the napari plugin** work well for a dataset with complex nucleus morphology like the Mouse Skull dataset. But I could not get this to work with the napari plugin, see next points.

      - First, deriving hyperparameters via grid search may lead to over-optimistic evaluation results. How would a user find these parameters without having access to ground-truth? Did you do any experiments on the robustness of the parameters?

      -  In my own experiments I could not do this with the plugin. I tried this again, but ran into the same problems as last time: pyClesperanto does not work for me. The solution you link requires updating openCL drivers and the accepted solution in the forum post is "switch to a different workstation".

      We apologize for the confusion here; the accepted solution (not accepted by us) was user specific: that user switched workstations and it worked, so that was their solution. Other comments in the thread actually solved the issue as well. For ease, this package can be installed on Google Colab (here is the link from our repo: https://colab.research.google.com/github/AdaptiveMotorControlLab/CellSeg3d/blob/main/notebooks/Colab_inference_demo.ipynb), where pyClesperanto can be installed without issue via `!pip install pyclesperanto-prototype`.

      This a) goes beyond the time I can invest for a review and b) is unrealistic to expect computationally inexperienced users to manage. Then I tried with the "watershed" segmentation, but this yields a strong oversegmentation no matter what I try, which is consistent with the predictions that look like a slightly denoised version of the input images and not like a proper foreground-background segmentation. With respect to the video you provide: I would like to see how a user can do this in the plugin without having prior knowledge of good parameters or just pasting code, which is again not what you would expect a computationally inexperienced user to do.

      We agree with the reviewer that the user needs domain knowledge, but we never claimed our method was for inexperienced users. Our main goal was to show a new computer vision method with self-supervised learning (WNet3D) that works on LSM and confocal data for cell nuclei. To this end, we made a demo video to show how a user can visually perform a thresholding check (https://www.youtube.com/watch?v=xYbYqL1KDYE&t=5s), and we added all of these new utilities to the GUI; thanks for the suggestion. Otherwise, the thresholding can also be done in a Notebook (as previously noted).

      I acknowledge that some of these points are addressed in the limitations, but the text still implies that it is possible to get good segmentation results for such segmentation problems: "we believe that our self-supervised semantic segmentation model could be applied to more challenging data as long as the above limitations are taken into account." From my point of view the evidence for this is still lacking and would need to be provided by addressing the points raised above for me to further raise the Incomplete/solid rating, especially showing how this can be done with the napari plugin. As an alternative, I would also consider raising it if the claims are further reduced and acknowledge that the current version of the method is only a good method for well separated nuclei.

      We hope our new benchmarking and clear demo on four datasets help improve your confidence in the evidence for our approach. We also refined our overall text and hope our contributions, the limitations, and the advantages are now clearer.

      I understand that this may be frustrating, but please put yourself in the role of a new reader of this work: the impression that is made is that this is a method that can solve 3D segmentation tasks in light-sheet microscopy with unsupervised learning. This would be a really big achievement! The wording in the limitation section sounds like strategic disclaimers that imply that it is still possible to do this, just that it wasn't tested enough.

      But, to the best of my assessment, the current version of the method only enables the more narrow case of well separated nuclei with a simple morphology. This is still a quite meaningful achievement, but more limited than the initial impression. So either the experimental evidence needs to be improved, including a demonstration how to achieve this in practice, including without deriving parameters via grid-search and in the plugin, or the claim needs to be meaningfully toned down.

      Thanks for raising this point; we do think that WNet3D and the associated CellSeg3D package, which aims to continue integrating state-of-the-art models, are a non-trivial step forward. Have we completely solved the problem? Certainly not. But given the limited 3D cell segmentation tools that exist, we hope this, coupled with our novel 3D dataset, pushes the field forward. We do not show that it works only on the narrow, well-separated use case; rather, we show that it works even better than supervised models on the very challenging Mouse Skull benchmark. Given that we now show evidence that we outperform or match supervised algorithms with an unsupervised approach, we respectfully do think this is a noteworthy achievement. Thank you for your time in assessing our work.

    1. eLife Assessment

      This important work advances our understanding of the aging trajectory and heterogeneity of hippocampal microglia. The authors provide an in-depth characterization of microglia in young and old mice as well as at intermediate time points, which reveals the existence of intermediate states characterized by a distinct transcriptional signature. The experimental approach is solid, especially with the validation of scRNA-seq findings with other methods. The study should be of interest to neuroimmunologists and biologists interested in aging.

    2. Reviewer #2 (Public review):

      Summary:

      The goal of the paper was to trace the transitions hippocampal microglia undergo during aging. ScRNA-seq analysis allowed the authors to predict a trajectory and hypothesize about possible molecular checkpoints, which set the pace of microglial aging. For example, TGFβ1 was predicted as a molecule slowing down the microglial aging path, and indeed, loss of TGFβ1 in microglia led to premature microglia aging, which was associated with premature loss of cognitive ability. The authors also used the parabiosis model to show how peripheral, blood-derived signals from the old organism can "push" microglia forward on the aging path.

      Strengths:

      A major strength and unique feature of this work is the in-depth single-cell dataset, which may be a useful resource for the community, as well as the data showing what happens to young microglia in a heterochronic parabiosis setting and upon loss of TGFβ in their environment.

      Weaknesses:

      All weaknesses were addressed during revision.

      Overall:

      In general, I think the authors did a good job following the initial observations and devised clever ways to test the emerging hypotheses. The resulting data are an important addition to what we know about microglial aging and can be fruitfully used by other researchers, e.g. those working on microglia in a disease context.

      Comments on revisions:

      All my comments were addressed.

    1. eLife Assessment

      This useful study examines the neural activity in the motor cortex as a monkey reaches to intercept moving targets, focusing on how tuned single neurons contribute to an interesting overall population geometry. The presented results and analyses are solid, though the investigation of this novel task could be strengthened by clarifying the assumptions behind the single neuron analyses, and further analyses of the neural population activity and its relation to different features of behaviour.

    2. Reviewer #1 (Public review):

      Summary:

      This study addresses the question of how task-relevant sensory information affects activity in motor cortex. The authors use various approaches to address this question, looking at single units and population activity. They find that there are three subtypes of modulation by sensory information at the single unit level. Population analyses reveal that sensory information affects the neural activity orthogonally to motor output. The authors then compare both single unit and population activity to computational models to investigate how encoding of sensory information at the single-unit level is coordinated in a network. They find that an RNN that displays similar orbital dynamics and sensory modulation to motor cortex also contains nodes that are modulated similarly to the three subtypes identified by the single unit analysis.

      Strengths:

      The strengths of this study lie in the population analyses and the approach of comparing single-unit encoding to population dynamics. In particular, the analysis in Figure 3 is very elegant and informative about the effect of sensory information on motor cortical activity. The task is also well designed to suit the questions being asked and well controlled.

      It is commendable that the authors compare single-unit to population modulation. The addition of the RNN model and perturbations strengthen the conclusion that the subtypes of individual units all contribute to the population dynamics.

      Weaknesses:

      The main weaknesses of the study lie in the categorization of the single units into PD shift, gain and addition types. The single units exhibit clear mixed selectivity, as the authors highlight. Therefore, the subsequent analyses looking only at the individual classes in the RNN are a little limited. Another weakness of the paper is that the choice of windows for analyses is not properly justified and the dependence of the results on the time windows chosen for single unit analyses is not assessed. This is particularly pertinent because tuning curves are known to rotate during movements (Sergio et al. 2005 Journal of Neurophysiology).

      This study uses insights from single-unit analysis to inform mechanistic models of these population dynamics, which is a powerful approach, but is dependent on the validity of the single-cell analysis, which I have expanded on below.

      I have clarified some of the areas that would benefit from further analysis below:

      Task:

      The task is well designed, although it would have benefited from perhaps one more target speed (for each direction). One monkey appears to have experienced one more target speed than the others (seen in Figure 3C). It would have been nice to have this data for all monkeys, although, of course, unfeasible given that the study has been concluded.

      Single unit analyses:

      The choice of the three categories (PD shift, gain, addition) is not justified in a completely satisfactory way. It would be nice to see whether these three main categories are confirmed by unsupervised methods.

      The decoder analyses in Figure 2 provide evidence that target speed modulation may change over the trial. Therefore, it is important to see how the window considered for the firing rate in Figure 1 (currently 100 ms before to 100 ms after movement onset) affects the results. Whilst it is of course understandable that a window must be chosen and will always be slightly arbitrary, comparing the results across two or three windows of different sizes or timings would be more convincing evidence that the results do not depend on this particular window.

      RNN:

      Mixed selectivity is not analysed in the RNN, which would help to compare the model to the real data where mixed selectivity is common. The CCA and Procrustes analysis are a good start to validate the claim of similarity between RNN and neural dynamics, rather than allowing comparisons to be dominated by geometric similarities that may be features of the task. However, some of the disparity values for the Procrustes analysis are quite high, albeit below that of the shuffle. Maybe a comment about this in the text should be included. There is also an absence of alternate models to compare the perturbation model results to.
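
      For readers unfamiliar with the disparity values mentioned above, the following is a hedged sketch of one standard way a Procrustes disparity is computed: both configurations are centered, scaled to unit norm, and optimally rotated onto each other. Whether the manuscript uses exactly this normalization is an assumption, and the function name `procrustes_disparity` and the toy ring data are hypothetical.

```python
import numpy as np


def procrustes_disparity(reference, target):
    """Residual between two (n_points, n_dims) configurations after optimal
    translation, uniform scaling, and orthogonal alignment.

    Returns a value in [0, 1]; 0 means identical shape.
    """
    a = reference - reference.mean(axis=0)
    b = target - target.mean(axis=0)
    a = a / np.linalg.norm(a)  # unit Frobenius norm
    b = b / np.linalg.norm(b)
    # With both configurations at unit norm, the residual after the best
    # orthogonal alignment and rescaling is 1 - (sum of singular values)^2.
    singular_values = np.linalg.svd(a.T @ b, compute_uv=False)
    return 1.0 - singular_values.sum() ** 2


rng = np.random.default_rng(0)
angles = np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False)
ring = np.column_stack([np.cos(angles), np.sin(angles), np.zeros_like(angles)])

# A rotated, rescaled, and shifted copy of the ring has the same shape.
q, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # random orthogonal matrix
rotated = 2.5 * ring @ q + 1.0

# Shuffling the point order destroys the correspondence (a shuffle control).
shuffled = rotated[rng.permutation(len(rotated))]

d_same = procrustes_disparity(ring, rotated)
d_shuffled = procrustes_disparity(ring, shuffled)
```

      A disparity near 0 indicates matching geometry, and the shuffled comparison gives a reference level for how large the value becomes when the point correspondence is lost.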

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Zhang et al. examine neural activity in motor cortex as monkeys make reaches in a novel target interception task. Zhang et al. begin by examining the single neuron tuning properties across different moving target conditions, finding several classes of neurons: those that shift their preferred direction, those that change their modulation gain, and those that shift their baseline firing rates. The authors go on to find an interesting, tilted ring structure of the neural population activity, depending on the target speed, and find that 1) the reach direction has consistent positioning around the ring, and 2) the tilt of the ring is highly predictive of the target movement speed. The authors then model the neural activity with a single neuron representational model and a recurrent neural network model, concluding that this population structure requires a mixture of the three types of single neurons described at the beginning of the manuscript.

      Strengths:

      I find the task the authors present here to be novel and exciting. It slots nicely into an overall trend to break away from simple reach-to-static-target tasks to better characterize the breadth of how motor cortex generates movements. I also appreciate the movement from single neuron characterization to population activity exploration, which generally serves to anchor the results and make them concrete. Further, the orbital ring structure of population activity is fascinating, and the modeling work at the end serves as a useful baseline control to see how it might arise.

      Weaknesses:

      While I find the behavioral task presented here to be excitingly novel, I find the presented analyses and results to be far less interesting than they could be. Key to this, I think, is that the authors are examining this task and related neural activity primarily with a single-neuron representational lens. This would be fine as an initial analysis, since the population activity is of course composed of individual neurons, but the field seems to have largely moved towards a more abstract "computation through dynamics" framework that has, in the last several years, provided much more understanding of motor control than the representational framework has. As the manuscript stands now, I'm not entirely sure what interpretation to take away from the representational conclusions the authors made (i.e. the fact that the orbital population geometry arises from a mixture of different tuning types). As such, by the end of the manuscript, I'm not sure I understand any better how motor cortex or its neural geometry might be contributing to the execution of this novel task.

      Main Comments:

      My main suggestions to the authors revolve around bringing in the computation through dynamics framework to strengthen their population results. The authors cite the Vyas et al. review paper on the subject, so I believe they are aware of this framework. I have three suggestions for improving or adding to the population results:

      (1) Examination of delay period activity: one of the most interesting aspects of the task was the fact that the monkey had a random-length delay period before he could move to intercept the target. Presumably, the monkey had to prepare to intercept at any time between 400 and 800 ms, which means that there may be some interesting preparatory activity dynamics during this period. For example, after 400ms, does the preparatory activity rotate with the target such that once the go cue happens, the correct interception can be executed? There is some analysis of the delay period population activity in the supplement, but it doesn't quite get at the question of how the interception movement is prepared. This is perhaps the most interesting question that can be asked with this experiment, and it's one that I think may be quite novel for the field--it is a shame that it isn't discussed.

      (2) Supervised examination of population structure via potent and null spaces: simply examining the first three principal components revealed an orbital structure, with a seemingly conserved motor output space and a dimension orthogonal to it that relates to the visual input. However, the authors don't push this insight any further. One way to do that would be to find the "potent space" of motor cortical activity by regression to the arm movement and examine how the tilted rings look in that space. Presumably, then, the null space should contain information about the target movement. The ring tilt will likely be evident if the authors look at the highest variance neural dimension orthogonal to the potent space (the "null space")--this is akin to PC3 in the current figures, but it would be nice to see what comes out when you look in the data for it.

      The authors attempt this sort of analysis in the supplement, alongside their dPCA results, but the results seem misinterpreted. The authors do identify one kind of output-potent space using the reach direction components of dPCA, and the reach directions are indeed aligned here. However, they then go on to interpret the target-velocity space as the output-null space, orthogonal to the potent space. There are two problems with this. 1) The target-velocity space is not necessarily orthogonal to the reach-direction space. This is a key aspect of dPCA--while the individual components within a particular marginalization space are orthogonal, the marginalization spaces themselves are not necessarily orthogonal unless they are forced to be (which the authors don't mention doing). 2) Even if the target-velocity space were orthogonal to the reach-direction space, it would not comprise the whole output-null space--such a null space would also include dimensions of neural population activity that have target-velocity/reach-direction interaction, which the authors show is a major component of neural population variance. Incidentally, the dPCA analysis the authors present shows what I would expect from their unsupervised results, but as it is written, the dPCA results are interpreted in a strange or potentially misleading way.

      (3) RNN perturbations: as it's currently written, the RNN modeling has promise, but the perturbations performed don't provide me with much insight. I think this is because the authors are trying to use the RNN to interpret the single neuron tuning, but it's unclear to me what was learned from perturbing the connectivity between what seems to me almost arbitrary groups of neurons. It seems to me that a better perturbation might be to move the neural state before the movement onset to see how it changes the output. For example, the authors could move the neural state from one tilted ring to another to see if the virtual hand then reaches a completely different (yet predictable) target. Moreover, if the authors can more clearly characterize the preparatory movement, perhaps perturbations in the delay period would provide even more insight into how the interception might be prepared.
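
      The potent/null-space split proposed in suggestion (2) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis: the linear readout is fit by ordinary least squares, the function name `potent_null_split` is hypothetical, and the data are synthetic.

```python
import numpy as np


def potent_null_split(neural, behavior):
    """Split neural state space into output-potent and output-null subspaces.

    neural:   (T, N) array, T time points by N neurons (or PCs)
    behavior: (T, K) array, e.g. K = 2 hand-velocity components
    Returns orthonormal bases (N, rank) and (N, N - rank).
    """
    neural = neural - neural.mean(axis=0)
    behavior = behavior - behavior.mean(axis=0)
    # Linear readout: behavior ~ neural @ weights, weights has shape (N, K).
    weights, *_ = np.linalg.lstsq(neural, behavior, rcond=None)
    # Left singular vectors of the readout form an orthonormal basis of R^N;
    # the first `rank` columns span the potent space, the rest the null space.
    u, s, _ = np.linalg.svd(weights, full_matrices=True)
    rank = int(np.sum(s > 1e-10 * s.max()))
    return u[:, :rank], u[:, rank:]


# Synthetic example: 6 "neurons" driving a 2D output through a linear readout.
rng = np.random.default_rng(0)
neural = rng.standard_normal((200, 6))
true_readout = rng.standard_normal((6, 2))
behavior = neural @ true_readout

potent, null = potent_null_split(neural, behavior)

# Highest-variance direction within the null space (the analogue of looking
# for the ring tilt in a dimension orthogonal to the motor output).
centered = neural - neural.mean(axis=0)
null_activity = centered @ null
_, _, vt = np.linalg.svd(null_activity, full_matrices=False)
top_null_axis = null @ vt[0]
```

      By construction, activity along `top_null_axis` does not change the fitted linear readout, so any target-speed information found there is output-null with respect to that readout.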

    4. Reviewer #3 (Public review):

      Summary:

      This experimental study investigates the influence of sensory information on neural population activity in M1 during a delayed reaching task. In the experiment, monkeys are trained to perform a delayed interception reach task, in which the goal is to intercept a potentially moving target.

      This paradigm allows the authors to investigate how, given a fixed reach end point (which is assumed to correspond to a fixed motor output), the sensory information regarding the target motion is encoded in neural activity.

      At the level of single neurons, the authors find that target motion modulates the activity in three main ways: gain modulation (scaling of the neural activity depending on the target direction), shift (shift of the preferred direction of neurons tuned to reach direction), or addition (offset to the neural activity).
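
      The three modulation types can be illustrated with a cosine tuning curve, the classical model (Georgopoulos et al., 1982) that the authors cite in their response as the basis for these categories. The sketch below is only illustrative: the function name `cosine_tuning` and all parameter values are hypothetical and are not taken from the paper.

```python
import numpy as np


def cosine_tuning(theta, baseline, gain, preferred_direction):
    """Firing rate under cosine tuning: baseline + gain * cos(theta - PD)."""
    return baseline + gain * np.cos(theta - preferred_direction)


# Eight reach directions, evenly spaced around the circle (radians).
reach_directions = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)

# Hypothetical reference neuron (static target); all numbers are illustrative.
static = cosine_tuning(reach_directions, baseline=10.0, gain=5.0, preferred_direction=0.0)

# The same neuron under a moving target, one modulation type at a time.
gain_modulated = cosine_tuning(reach_directions, 10.0, 8.0, 0.0)    # larger tuning depth
pd_shifted = cosine_tuning(reach_directions, 10.0, 5.0, np.pi / 4)  # rotated preferred direction
additive = cosine_tuning(reach_directions, 13.0, 5.0, 0.0)          # constant offset
```

      A neuron with mixed selectivity would change more than one of the three parameters at once as target speed varies.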

      At the level of the neural population, target motion information was largely encoded along the 3rd PC of the neural activity, leading to a tilt of the manifold along which reach direction was encoded that was proportional to target speed. The tilt of the neural manifold was found to be largely driven by the variation of activity of the population of gain modulated neurons.

      Finally, the authors study the behaviour of an RNN trained to generate the correct hand velocity given the sensory input and reach direction. The RNN units are found to similarly exhibit mixed selectivity to the sensory information, and the geometry of the "neural population" resembles that observed in the monkeys.

      Overall, the experiment is well set up to address the question of how sensory information that is directly relevant to the behaviour but does not lead to a direct change in behavioural output modulates motor cortical activity.

      The finding that sensory information modulates the neural activity in M1 during motor preparation and execution is non trivial, given that this modulation of the activity must occur in the nullspace of the movement.

      The authors provide analyses at both the single neuron and the population level, leading to a relatively complete characterization of the effect of the target motion on neural activity.

      Additionally, they start exploring the link between the population geometry and the mixed selectivity of the single neurons in their RNN model. While they could be extended in future work, the analyses of the RNN provide a good starting point to address how exactly the task setup and constraints on the network shape the single neuron selectivity and the population geometry.

    5. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This study addresses the question of how task-relevant sensory information affects activity in the motor cortex. The authors use various approaches to address this question, looking at single units and population activity. They find that there are three subtypes of modulation by sensory information at the single unit level. Population analyses reveal that sensory information affects the neural activity orthogonally to motor output. The authors then compare both single unit and population activity to computational models to investigate how encoding of sensory information at the single unit level is coordinated in a network. They find that an RNN that displays similar orbital dynamics and sensory modulation to the motor cortex also contains nodes that are modulated similarly to the three subtypes identified by the single unit analysis.

      Strengths:

      The strengths of this study lie in the population analyses and the approach of comparing single-unit encoding to population dynamics. In particular, the analysis in Figure 3 is very elegant and informative about the effect of sensory information on motor cortical activity.

      The task is also well designed to suit the questions being asked and well controlled.

      We appreciate these kind comments.

      It is commendable that the authors compare single units to population modulation. The addition of the RNN model and perturbations strengthen the conclusion that the subtypes of individual units all contribute to the population dynamics. However, the subtypes (PD shift, gain, and addition) are not sufficiently justified. The authors also do not address that single units exhibit mixed modulation, but RNN units are not treated as such.

      We’re sorry that we didn’t provide sufficient grounds to introduce the subtypes. We have updated this in the revised manuscript, in Lines 102-104 as:

      “We determined these modulations on the basis of the classical cosine tuning model (Georgopoulos et al., 1982) and several previous studies (Bremner and Andersen, 2012; Pesaran et al., 2010; Sergio et al., 2005).”

      In our study, we applied the subtype analysis as a criterion to identify the modulation in neuron populations, rather than sorting neurons into exclusively different cell types.
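
      One way to read the three subtypes is as three places where a target-motion term can enter a cosine tuning function: the preferred direction (PD shift), the modulation depth (gain), or the baseline (addition). The sketch below illustrates that reading only. The linear dependence on target velocity, the function name, and all parameter names and default values are assumptions made for illustration, not the parametrization used in the manuscript.

```python
import math


def tuned_rate(reach_dir, target_vel, baseline=10.0, gain=5.0, pd=0.0,
               k_shift=0.0, k_gain=0.0, k_add=0.0):
    """Cosine-tuned firing rate with three hypothetical target-motion terms.

    reach_dir and pd are angles in radians; target_vel is a signed target
    velocity in arbitrary units. The k_* coefficients are illustrative.
    """
    shifted_pd = pd + k_shift * target_vel            # PD shift: rotates the tuning curve
    scaled_gain = gain * (1.0 + k_gain * target_vel)  # gain: rescales the modulation depth
    offset = baseline + k_add * target_vel            # addition: moves the baseline
    return offset + scaled_gain * math.cos(reach_dir - shifted_pd)
```

      With a static target (target_vel = 0) all three terms vanish and the function reduces to plain cosine tuning. A unit with more than one non-zero coefficient shows mixed modulation, which is why the subtypes work better as descriptive criteria than as exclusive cell classes.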

      Weaknesses:

      The main weaknesses of the study lie in the categorization of the single units into PD shift, gain, and addition types. The single units exhibit clear mixed selectivity, as the authors highlight. Therefore, the subsequent analyses looking only at the individual classes in the RNN are a little limited. Another weakness of the paper is that the choice of windows for analyses is not properly justified and the dependence of the results on the time windows chosen for single-unit analyses is not assessed. This is particularly pertinent because tuning curves are known to rotate during movements (Sergio et al. 2005 Journal of Neurophysiology).

      In our study, mixed selectivity, specifically the target-motion modulation of reach-direction tuning, is a significant feature of the single neurons. We categorized the neurons into three subclasses not to claim absolute cell types, but to distinguish target-motion modulation patterns. To further characterize these three patterns, we also investigated their interaction by perturbing connection weights in the RNN.

      Yes, it’s important to consider the role of rotating tuning curves in neural dynamics during interception. In our case, we examined the population neural state with sliding windows and focused on the period around movement onset (MO) because of the unexpected ring-like structure and because transferred decoders reached their highest accuracy there (Figure S7C). The single-unit analyses were then carried out.

      This paper shows sensory information can affect motor cortical activity whilst not affecting motor output. However, it is not the first to do so and fails to cite other papers that have investigated sensory modulation of the motor cortex (Stavisky et al. 2017 Neuron, Pruszynski et al. 2011 Nature, Omrani et al. 2016 eLife). These studies should be mentioned in the Introduction to capture better the context around the present study. It would also be beneficial to add a discussion of how the results compare to the findings from these other works.

      Thanks for the reminder. We’ve introduced this relevant research in the updated manuscript in Lines 422-426 as:

      “To further clarify, the discussing target-motion effect is different from the sensory modulation in action selection (Cisek and Kalaska, 2005), motor planning (Pesaran et al., 2006), visual replay and somatosensory feedback (Pruszynski et al., 2011; Stavisky et al., 2017; Suway and Schwartz, 2019; Tkach et al., 2007), because it occurred around movement onset and in predictive control trial-by-trial.”

      This study also uses insights from single-unit analysis to inform mechanistic models of these population dynamics, which is a powerful approach, but is dependent on the validity of the single-cell analysis, which I have expanded on below.

      I have clarified some of the areas that would benefit from further analysis below:

      (1) Task:

      The task is well designed, although it would have benefited from perhaps one more target speed (for each direction). One monkey appears to have experienced one more target speed than the others (seen in Figure 3C). It would have been nice to have this data for all monkeys.

      A great suggestion; however, it is hardly feasible as the Utah arrays have already been removed.

      (2) Single unit analyses:

      In some analyses, the effects of target speed look more driven by target movement direction (e.g. Figures 1D and E). To confirm target speed is the main modulator, it would be good to compare how much more variance is explained by models including speed rather than just direction. More target speeds may have been helpful here too.

      A nice suggestion. The goodness of fit of the simple model (only movement direction) is much worse than that of the complex models (including target speed). We’ve updated the results in the revised manuscript in Lines 119-122, as “We found that the adjusted R² of a full model (0.55 ± 0.24, mean ± sd.) can be higher than that of the PD shift (0.47 ± 0.24), gain (0.46 ± 0.22), additive (0.41 ± 0.26), and simple models (only reach direction, 0.34 ± 0.25) for three monkeys (1162 neurons, ranksum test, one-tailed, p<0.01, Figure S5).”

      The choice of the three categories (PD shift, gain, addition) is not completely justified in a satisfactory way. It would be nice to see whether these three main categories are confirmed by unsupervised methods.
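
      The comparison above uses adjusted R², which discounts the fit for each extra predictor, so the full model is not favoured merely for having more terms. The sketch below shows the standard definition. The function name is hypothetical, and how the manuscript counts parameters for each model is not stated here, so `n_params` is left to the caller.

```python
def adjusted_r2(y, y_hat, n_params):
    """Adjusted R^2 for a fit with `n_params` predictors (intercept excluded).

    y and y_hat are equal-length sequences of observed and fitted values.
    """
    n = len(y)
    if n - n_params - 1 <= 0:
        raise ValueError("need more observations than parameters")
    mean_y = sum(y) / n
    ss_res = sum((obs - fit) ** 2 for obs, fit in zip(y, y_hat))  # residual sum of squares
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)                # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    # Penalize R^2 by the ratio of total to residual degrees of freedom.
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)
```

      For the same raw fit, a model with more predictors gets a lower adjusted value, so a higher adjusted R² for the full model suggests that the target-speed terms explain variance beyond what their added flexibility alone would buy.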

      A good point. It is a pity that we haven’t found an appropriate unsupervised method.

      The decoder analyses in Figure 2 provide evidence that target speed modulation may change over the trial. Therefore, it is important to see how the window considered for the firing rate in Figure 1 (currently 100ms pre - 100ms post movement onset) affects the results.

      Thanks for the suggestion and close reading. Because the movement onset (MO) is the key time point of this study, we colored this time period in Figure 1 to highlight the peri-movement neuronal activity.

      (3) Decoder:

      One feature of the task is that the reach endpoints tile the entire perimeter of the target circle (Figure 1B). However, this feature is not exploited for much of the single-unit analyses. This is most notable in Figure 2, where the use of a SVM limits the decoding to discrete values (the endpoints are divided into 8 categories). Using continuous decoding of hand kinematics would be more appropriate for this task.

      This is a very reasonable suggestion. In the revised manuscript, we’ve updated the continuous decoding results with support vector regression (SVR) in Figure S7A and in Lines 170-173 as:

      “These results were stable on the data of the other two monkeys and the pseudopopulation of all three monkeys (Figure S6) and reconfirmed by the continuous decoding results with support vector regressions (Figure S7A), suggesting that target motion information existed in M1 throughout almost the entire trial.”

      (4) RNN:

      Mixed selectivity is not analysed in the RNN, which would help to compare the model to the real data where mixed selectivity is common. Furthermore, it would be informative to compare the neural data to the RNN activity using canonical correlation or Procrustes analyses. These would help validate the claim of similarity between RNN and neural dynamics, rather than allowing comparisons to be dominated by geometric similarities that may be features of the task. There is also an absence of alternate models to compare the perturbation model results to.

      Thank you for these helpful suggestions. We have performed decoding analysis on RNN units and updated in Figure S12A and Lines 333-334 as: “First, from the decoding result, target motion information existed in nodes’ population dynamics shortly after TO (Figure S12A).”

      We also have included the results of canonical correlation analysis and Procrustes analysis in Table S2 and Lines 340-342 as: “We then performed canonical component analysis (CCA) and Procrustes analysis (Table S2; see Methods), the results also indicated the similarity between network dynamics and neural dynamics.”

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Zhang et al. examine neural activity in the motor cortex as monkeys make reaches in a novel target interception task. Zhang et al. begin by examining the single neuron tuning properties across different moving target conditions, finding several classes of neurons: those that shift their preferred direction, those that change their modulation gain, and those that shift their baseline firing rates. The authors go on to find an interesting, tilted ring structure of the neural population activity, depending on the target speed, and find that (1) the reach direction has consistent positioning around the ring, and (2) the tilt of the ring is highly predictive of the target movement speed. The authors then model the neural activity with a single neuron representational model and a recurrent neural network model, concluding that this population structure requires a mixture of the three types of single neurons described at the beginning of the manuscript.

      Strengths:

      I find the task the authors present here to be novel and exciting. It slots nicely into an overall trend to break away from a simple reach-to-static-target task to better characterize the breadth of how the motor cortex generates movements. I also appreciate the movement from single neuron characterization to population activity exploration, which generally serves to anchor the results and make them concrete. Further, the orbital ring structure of population activity is fascinating, and the modeling work at the end serves as a useful baseline control to see how it might arise.

      Thank you for your recognition of our work.

      Weaknesses:

      While I find the behavioral task presented here to be excitingly novel, I find the presented analyses and results to be far less interesting than they could be. Key to this, I think, is that the authors are examining this task and related neural activity primarily with a single-neuron representational lens. This would be fine as an initial analysis since the population activity is of course composed of individual neurons, but the field seems to have largely moved towards a more abstract "computation through dynamics" framework that has, in the last several years, provided much more understanding of motor control than the representational framework has. As the manuscript stands now, I'm not entirely sure what interpretation to take away from the representational conclusions the authors made (i.e. the fact that the orbital population geometry arises from a mixture of different tuning types). As such, by the end of the manuscript, I'm not sure I understand any better how the motor cortex or its neural geometry might be contributing to the execution of this novel task.

      This paper shows the sensory modulation of motor tuning in single units and in the neural population during the motor execution period. It’s a pity that the findings were constrained to certain time windows. We are still working on this task and hope to address these questions in follow-up work.

      Main Comments:

      My main suggestions to the authors revolve around bringing in the computation-through-dynamics framework to strengthen their population results. The authors cite the Vyas et al. review paper on the subject, so I believe they are aware of this framework. I have three suggestions for improving or adding to the population results:

      (1) Examination of delay period activity: one of the most interesting aspects of the task was the fact that the monkey had a random-length delay period before he could move to intercept the target. Presumably, the monkey had to prepare to intercept at any time between 400 and 800 ms, which means that there may be some interesting preparatory activity dynamics during this period. For example, after 400ms, does the preparatory activity rotate with the target such that once the go cue happens, the correct interception can be executed? There is some analysis of the delay period population activity in the supplement, but it doesn't quite get at the question of how the interception movement is prepared. This is perhaps the most interesting question that can be asked with this experiment, and it's one that I think may be quite novel for the field--it is a shame that it isn't discussed.

      It’s a great idea! We are on the way, and it seems promising.

      (2) Supervised examination of population structure via potent and null spaces: simply examining the first three principal components revealed an orbital structure, with a seemingly conserved motor output space and a dimension orthogonal to it that relates to the visual input. However, the authors don't push this insight any further. One way to do that would be to find the "potent space" of motor cortical activity by regression to the arm movement and examine how the tilted rings look in that space (this is actually fairly easy to see in the reach direction components of the dPCA plot in the supplement--the rings will be highly aligned in this space). Presumably, then, the null space should contain information about the target movement. dPCA shows that there's not a single dimension that clearly delineates target speed, but the ring tilt is likely evident if the authors look at the highest variance neural dimension orthogonal to the potent space (the "null space")--this is akin to PC3 in the current figures, but it would be nice to see what comes out when you look in the data for it.

      Thank you for this nice suggestion. While it was feasible to identify potent subspaces encoding reach direction and null spaces for target-velocity modulation, as suggested by the reviewer, the challenge remained that unsupervised methods were insufficient to isolate a pure target-velocity subspace from numerous possible candidates due to the small variance of target-velocity information. Although dPCA components can be used to construct orthogonal subspaces for individual task variables, we found that the target-velocity information remained highly entangled with reach-direction representation. More details can be found in Figure S8C and its caption as below:

      “We used dPCA components with different features to construct three subspaces (same data in A, reach-direction space #3, #4, #5; target-velocity space #10, #15, #17; interaction space #6, #11, #12), and we projected trial-averaged data into these orthogonal subspaces using different colormaps. This approach allowed us to obtain a “potent subspace” coding reach direction and a “null space” for target velocity. The results showed that the reach-direction subspace effectively represented the reach direction. However, while the target-velocity subspace encoded the target velocity information, it still contained reach-direction clusters within each target-velocity condition, corroborating the results of the addition model in the main text (Figure 4). The interaction subspace revealed that multiple reach-direction rings were nested within each other, similar to the findings from the gain model (Figure 3 & 4). The interaction subspace also captured more variance than target-velocity subspace, consistent with our PCA results, suggesting the target-velocity modulation primarily coexists with reach-direction coding. Furthermore, we explored alternative methods to verify whether orthogonal subspaces could effectively separate the reach direction and target velocity. We could easily identify the reach-direction subspace, but its orthogonal subspace was relatively large, and the target-velocity information exhibited only small variance, making it difficult to isolate a subspace that purely encodes target velocity.”
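
      The potent/null decomposition discussed in this exchange can be illustrated in its simplest form: with a single linear readout axis, any neural state splits into a component along that axis, which changes the output, and an orthogonal remainder, which does not. The sketch below is a deliberate simplification. The readout vector is a hypothetical stand-in for weights that would be obtained by regressing hand kinematics on neural activity, and it is not the dPCA procedure described in the caption above.

```python
def split_potent_null(state, readout):
    """Split a neural state into parts along and orthogonal to one readout axis.

    state and readout are equal-length sequences of floats; readout is a
    hypothetical weight vector mapping population activity to motor output.
    """
    norm_sq = sum(w * w for w in readout)
    if norm_sq == 0:
        raise ValueError("readout must be non-zero")
    # Scalar projection of the state onto the readout axis.
    scale = sum(x * w for x, w in zip(state, readout)) / norm_sq
    potent = [scale * w for w in readout]          # output-potent component
    null = [x - p for x, p in zip(state, potent)]  # output-null component
    return potent, null
```

      By construction the null component has zero dot product with the readout, so activity confined to it leaves the output unchanged. With a multi-dimensional readout the null space is correspondingly larger, which is consistent with the difficulty the authors describe in isolating a low-variance target-velocity signal within it.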

      (3) RNN perturbations: as it's currently written, the RNN modeling has promise, but the perturbations performed don't provide me with much insight. I think this is because the authors are trying to use the RNN to interpret the single neuron tuning, but it's unclear to me what was learned from perturbing the connectivity between what seems to me almost arbitrary groups of neurons (especially considering that 43% of nodes were unclassifiable). It seems to me that a better perturbation might be to move the neural state before the movement onset to see how it changes the output. For example, the authors could move the neural state from one tilted ring to another to see if the virtual hand then reaches a completely different (yet predictable) target. Moreover, if the authors can more clearly characterize the preparatory movement, perhaps perturbations in the delay period would provide even more insight into how the interception might be prepared.

      We are sorry that we did not clarify the definition of the “none” type, which can be misleading. The 43% of unclassifiable nodes include inactive ones; when only active (task-related) nodes are included, the ratio of unclassifiable nodes would be much lower. We recomputed the ratios with only active units and have updated Table 1. By perturbing the connectivity, we intended to explore the interaction between different modulations.

      Thank you for the great advice. We considered moving neural states from one ring to another without changing the directional cluster. However, we found that this perturbation design might not be fully developed: since the top two PCs are highly correlated with movement direction, such a move—similar to exchanging two states within the same cluster but under different target-motion conditions—would presumably not affect the behavior.

      Reviewer #3 (Public Review):

      Summary:

      This experimental study investigates the influence of sensory information on neural population activity in M1 during a delayed reaching task. In the experiment, monkeys are trained to perform a delayed interception reach task, in which the goal is to intercept a potentially moving target.

      This paradigm allows the authors to investigate how, given a fixed reach endpoint (which is assumed to correspond to a fixed motor output), the sensory information regarding the target motion is encoded in neural activity.

      At the level of single neurons, the authors found that target motion modulates the activity in three main ways: gain modulation (scaling of the neural activity depending on the target direction), shift (shift of the preferred direction of neurons tuned to reach direction), or addition (offset to the neural activity).

      At the level of the neural population, target motion information was largely encoded along the 3rd PC of the neural activity, leading to a tilt of the manifold along which reach direction was encoded that was proportional to the target speed. The tilt of the neural manifold was found to be largely driven by the variation of activity of the population of gain-modulated neurons.

      Finally, the authors studied the behaviour of an RNN trained to generate the correct hand velocity given the sensory input and reach direction. The RNN units were found to similarly exhibit mixed selectivity to the sensory information, and the geometry of the “neural population” resembled that observed in the monkeys.

      Strengths:

      - The experiment is well set up to address the question of how sensory information that is directly relevant to the behaviour but does not lead to a direct change in behavioural output modulates motor cortical activity.

      - The finding that sensory information modulates the neural activity in M1 during motor preparation and execution is non-trivial, given that this modulation of the activity must occur in the nullspace of the movement.

      - The paper gives a complete picture of the effect of the target motion on neural activity, by including analyses at the single neuron level as well as at the population level. Additionally, the authors link those two levels of representation by highlighting how gain modulation contributes to shaping the population representation.

      Thank you for your recognition.

      Weaknesses:

      - One of the main premises of the paper is the fact that the motor output for a given reach point is preserved across different target motions. However, as the authors briefly mention in the conclusion, they did not record muscle activity during the task, but only hand velocity, making it impossible to directly verify how preserved muscle patterns were across movements. While the authors highlight that they did not see any difference in their results when resampling the data to control for similar hand velocities across conditions, this seems like an important potential caveat of the paper whose implications should be discussed further or highlighted earlier in the paper.

      Thanks for the suggestion. We’ve highlighted the resampling results as an important control in the revised manuscript in Figure S11 and Lines 257-260 as:

      “To eliminate hand-speed effect, we resampled trials to construct a new dataset with similar distributions of hand speed in each target-motion condition and found similar orbital neural geometry. Moreover, the target-motion gain model provided a better explanation compared to the hand-speed gain model (Figure S11).”
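
      A control of this kind can be sketched as histogram matching: bin trials by hand speed and keep the same number of trials per bin from every target-motion condition, so that the subsampled conditions share one speed distribution. The manuscript's actual resampling procedure is not described in this response, so the function below, its name, its bin-based rule, and the `(condition, hand_speed)` trial format are all assumptions.

```python
import random
from collections import defaultdict


def match_speed_distributions(trials, bin_width, seed=0):
    """Subsample trials so every condition contributes equally to each speed bin.

    trials is a list of (condition, hand_speed) tuples; bin_width sets the
    width of the hand-speed bins. Returns the matched list of trials.
    """
    rng = random.Random(seed)
    conditions = {condition for condition, _ in trials}
    binned = defaultdict(lambda: defaultdict(list))  # bin index -> condition -> trials
    for condition, speed in trials:
        binned[int(speed // bin_width)][condition].append((condition, speed))
    matched = []
    for by_condition in binned.values():
        # The smallest per-condition count in this bin; zero if any condition is absent.
        n_keep = min(len(by_condition.get(c, [])) for c in conditions)
        if n_keep == 0:
            continue
        for c in conditions:
            matched.extend(rng.sample(by_condition[c], n_keep))
    return matched
```

      Speed bins that are missing from any condition are dropped entirely, so the matched dataset can be much smaller than the original. Repeating the subsampling with different seeds would show whether the orbital geometry depends on which trials happen to be kept.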

      - The main takeaway of the RNN analysis is not fully clear. The authors find that an RNN trained given a sensory input representing a moving target displays modulation to target motion that resembles what is seen in real data. This is interesting, but the authors do not dissect why this representation arises, and how robust it is to various task design choices. For instance, it appears that the network should be able to solve the task using only the motion intention input, which contains the reach endpoint information. If the target motion input is not used for the task, it is not obvious why the RNN units would be modulated by this input (especially as this modulation must lie in the nullspace of the movement hand velocity if the velocity depends only on the reach endpoint). It would thus be important to see alternative models compared to true neural activity, in addition to the model currently included in the paper. Besides, for the model in the paper, it would therefore be interesting to study further how the details of the network setup (eg initial spectral radius of the connectivity, weight regularization, or using only the target position input) affect the modulation by the motion input, as well as the trained population geometry and the relative ratios of modulated cells after training.

      Great suggestions. In the revised manuscript, we’ve added the results of three alternative models in Table S4 and Lines 355-365 as below:

      “We also tested three alternative network models: (1) only receives motor intention and a GO-signal; (2) only receives target location and a GO-signal; (3) initialized with sparse connection (sparsity=0.1); the unmentioned settings and training strategies were as the same as those for original models (Table S4; see Methods). The results showed that the three modulations could emerge in these models as well, but with obviously distinctive distributions. In (1), the ring-like structure became overlapped rings parallel to the PC1-PC2 plane or barrel-like structure instead; in (2), the target-motion related tilting tendency of the neural states remained, but the projection of the neural states on the PC1-PC2 plane was distorted and the reach-direction clusters dispersed. These implies that both motor intention and target location seem to be needed for the proposed ring-like structure. The initialization of connection weights of the hidden layer can influence the network’s performance and neural state structure, even so, the ring-like structure”

      - Additionally, it is unclear what insights are gained from the perturbations to the network connectivity the authors perform, as it is generally expected that modulating the connectivity will degrade task performance and the geometry of the responses. If the authors wish to make claims about the role of the subpopulations, it could be interesting to test whether similar connectivity patterns develop in networks that are not initialized with an all-to-all random connectivity or to use ablation experiments to investigate whether the presence of multiple types of modulations confers any sort of robustness to the network.

      Thank you for these great suggestions. By perturbations, we intended to explore the contribution of interaction between certain subpopulations. We’ve included the ablation experiments in the updated manuscript in Table S3 and Lines 344-346 as below: “The ablation experiments showed that losing any kind of modulation nodes would largely deteriorate the performance, and those nodes merely with PD-shift modulation could mostly impact the neural state structure (Table S3).”

      - The results suggest that the observed changes in motor cortical activity with target velocity result from M1 activity receiving an input that encodes the velocity information. This also appears to be the assumption in the RNN model. However, even though the input shown to the animal during preparation is indeed a continuously moving target, it appears that the only relevant quantity to the actual movement is the final endpoint of the reach. While this would have to be a function of the target velocity, one could imagine that the computation of where the monkeys should reach might be performed upstream of the motor cortex, in which case the actual target velocity would become irrelevant to the final motor output. This makes the results of the paper very interesting, but it would be nice if the authors could discuss further when one might expect to see modulation by sensory information that does not directly affect motor output in M1, and where those inputs may come from. It may also be interesting to discuss how the findings relate to previous work that has found behaviourally irrelevant information is being filtered out from M1 (for instance, Russo et al, Neuron 2020 found that in monkeys performing a cycling task, context can be decoded from SMA but not from M1, and Wang et al, Nature Communications 2019 found that perceptual information could not be decoded from PMd)?

      How and where sensory information modulates M1 are very interesting and open questions. In the revised manuscript, we discuss these in Lines 435-446, as below: “It would be interesting to explore whether other motor areas also allow sensory modulation during flexible interception. The functional differences between M1 and other areas lead to uncertain speculations. Although M1 has pre-movement activity, it is more related to task variables and motor outputs. Recently, a cycling task sets a good example that the supplementary motor area (SMA) encodes context information and the entire movement (Russo et al., 2020), while M1 preferably relates to cycling velocity (Saxena et al., 2022). The dorsal premotor area (PMd) has been reported to capture potential action selection and task probability, while M1 not (Cisek and Kalaska, 2005; Glaser et al., 2018; Wang et al., 2019). If the neural dynamics of other frontal motor areas are revealed, we might be able to tell whether the orbital neural geometry of mixed selectivity is unique in M1, or it is just inherited from upstream areas like PMd. Either outcome would provide us some insights into understanding the interaction between M1 and other frontal motor areas in motor planning.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      At times the writing was a little hard to parse. It could benefit from being fleshed out a bit to link sentences together better.

      There are a few grammatical errors, such as:

      "These results support strong and similar roles of gain and additive nodes, but what is even more important is that the three modulations interact each other, so the PD-shift nodes should not be neglected."

      should be

      "These results support strong and similar roles of gain and additive nodes, but what is even more important is that the three modulations interact WITH each other, so the PD-shift nodes should not be neglected."

      The discussion could also be more extensive to benefit non-experts in the field.

      Thank you. We have proofread and polished the updated manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Other comments:

      - The authors mention mixed selectivity a few times, but Table 1 doesn't have a column for mixed selective neurons--this seems like an important oversight. Likewise, it would be good to see an example of a "mixed" neuron.

      - The structure of the writing in the results section often talked about the supplementary results before the main results - this seems backwards. If the supplementary results are important enough to come before the main figures, then they should not be supplementary. Otherwise, if the results are truly supplementary, they should come after the main results are discussed.

      - Line 305: Authors say "most" RNN units could be classified, and this is technically true, but only barely, according to Table 1. It might be good to put the actual percentage here in the text.

      - Figure 5a: typo ("Motion intention" rather than "Motor")

      - I couldn't find any mention of code or data availability in the manuscript.

      - There were a number of lines that didn't make much sense to me and should probably be rewritten or expanded on:

      - Lines 167-168: "These results qualitatively imply the interaction as that target speeds..."

      - Lines 178-179: "However, these neural trajectories were not yet the ideal description, because they were shaped mostly by time."

      - Lines 187-188: "...suggesting that target motion affects M1 neural dynamics via a topologically invariant transformation."

      - Lines 224-226: "Note that here we performed an linear transformation on all resulting neural state points to make the ellipse of the static condition orthogonal to the z-axis for better visualization." Does this mean that the z-axis is not PC 3 anymore?

      - Lines 272-274: "These simulations suggest that the existence of PD-shift and additive modulation would not disrupt the neural geometry that is primarily driven by gain modulation; rather it is possible that these three modulations support each other in a mixed population."

      Thank you for these detailed suggestions. By “mixed selectivity”, we mean joint tuning to both target motion and movement. In this sense, the target-motion-modulated neurons (regardless of the modulation type) have mixed selectivity. The term “motor intention” follows Mazzoni et al., 1996, Journal of Neurophysiology. We also revised the manuscript for better readability.

      We have updated the data and code availability in Data availability as below:

      “The example experimental datasets and relevant analysis code have been deposited in Mendeley Data at https://data.mendeley.com/datasets/8gngr6tphf. The RNN relevant code and example model datasets are available at https://github.com/yunchenyc/RNN_ringlike_structure.”

      Reviewer #3 (Recommendations For The Authors):

      Minor typos:

      Line 153: “there were”

      Line 301: “network was trained to generate”

      Line 318: “interact with each other”

      Suggested reformulations :

      Line 310: “tilting angles followed a pattern similar to that seen in the data”

      Line 187: the claim of a “topologically invariant transformation” seems strong as the analysis is quite qualitative.

      Suggested changes to the paper (aside from those mentioned in the main review): It could be nice to show behaviour in a main figure panel early on in the paper. This could help with the task description (as it would directly show how the trials are separated based on endpoint) and could allow for discussing the potential caveats of the assumption that behaviour is preserved.

      Thank you. We have corrected these typos and writing problems. Because a similar task design has been reported previously, we decided not to provide extra figures or videos. Still, we thank the reviewer for this nice suggestion.

    1. eLife Assessment

      This valuable manuscript describes the immunogenicity of a bead-on-a-string immunogen that allows the inclusion of multiple HA subtypes. The evidence to support the claims is convincing, and more importantly, this approach could be adapted to other vaccine platforms.

    2. Reviewer #2 (Public review):

      Summary:

      The authors describe a "beads-on-a-string" (BOAS) immunogen, where they link, using a non-flexible glycine linker, up to eight distinct hemagglutinin (HA) head domains from circulating and non-circulating influenzas and assess their immunogenicity. They also display some of their immunogens on ferritin NP and compare the immunogenicity. They conclude that this new platform can be useful to elicit robust immune responses to multiple influenza subtypes using one immunogen and that it can also be used for other viral proteins.

      Strengths:

      The paper is clearly written. While the use of flexible linkers has been used many times, this particular approach (linking different HA subtypes in the same construct resembling adding beads on a string, as the authors describe their display platform) is novel and could be of interest.

      Comments on revisions:

      The authors have addressed most comments. Some mistakes/issues remain:

      TI should be defined earlier on line 61 not on line 196

      No legend for Figure 3E - it looks like this is where the authors did the first immunization with the "mix" to compare to the BOAs but strangely they do not mention this in the response to reviewers letter and only mention fig 6G and 7.

      Maybe add "mix" to the title of Figure 3?

      In Figure 6G they do show the response to the mix but do not mention it in the immunizations for that figure. Also weird because obviously the mix is not a NP while this figure addresses NP format.

      Line 796 - pseudo viruses

      The authors should add some clarification in the paper as they did in response to reviewers.

    3. Reviewer #3 (Public review):

      This work describes the tandem linkage of influenza hemagglutinin (HA) receptor binding domains of diverse subtypes to create 'beads on a string' (BOAS) immunogens. They show that these immunogens elicit ELISA binding titers against full-length HA trimers in mice, as well as varying degrees of vaccine mismatched responses and neutralization titers. They also compare these to BOAS conjugated on ferritin nanoparticles and find that this did not largely improve immune responses. This work offers a new type of vaccine platform for influenza vaccines, and this could be useful for further studies on the effects of conformation and immunodominance on the resulting immune response. 

      Overall, the central claims of immunogenicity in a murine model of the BOAS immunogens described here are supported by the data. 

      Strengths included the adaptability of the approach to include several, diverse subtypes of HAs. The determination of an optimal composition of strains in the 5-BOAS that overall yielded the best immune responses was an interesting finding and one that could also be adapted to other vaccine platforms. Lastly, as the authors discuss, the ease of translation to an mRNA vaccine is indeed a strength of this platform. 

      One interesting and counter-intuitive result is the high levels of neutralization titers seen to vaccine-mismatched, group 2 H7 in the 5-BOAS group that differs from the 4-BOAS with the addition of a group 1 H5 RBD. At the same time, no H5 neutralization titers were observed for any of the BOAS immunogens, yet they were seen for the BOAS-NP. Uncovering where these immune responses are being directed and why these discrepancies are being observed would be informative future work. 

      There are a few caveats in the data that should be noted: 

      (1) 20 ug is a pretty high dose for a mouse and the majority of the serology presented is after 3 doses at 20 ug. By comparison, 0.5-5 ug is a more typical range (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6380945/, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9980174/). Also, the authors state that 20 ug per immunogen was used, including for the BOAS-NP group, which would mean that the BOAS-NP group was given a lower gram dose of HA RBD relative to the BOAS groups.

      (2) Serum was pooled from all animals per group for neutralization assays, instead of testing individual animals. This could mean that a single animal with higher immune responses than the rest in the group could dominate the signal and potentially skew the interpretation of this data. 

      (3) In Figure S2, it looks like an apparent increase in MW by changing the order of strains here, which may be due to differences in glycosylation. Further analysis would be needed to determine if there are discrepancies in glycosylation amongst the BOAS immunogens and how those differ from native HAs. 

      Comments on revisions:

      The authors have addressed all concerns upon revision.

    4. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript by Thornlow Lamson et al., the authors develop a "beads-on-a-string" or BOAS strategy to link diverse hemagglutinin head domains, to elicit broadly protective antibody responses. The authors are able to generate varying formulations and lengths of the BOAS, and immunization of mice shows induction of antibodies against a broad range of influenza subtypes. However, several major concerns are raised, including the stability of the BOAS, that only 3 mice were used for most immunization experiments, and that important controls and analyses are missing regarding how the BOAS alone, and not the inclusion of diverse heads, impacts humoral immunity.

      Strengths:

      Vaccine strategy is new and exciting.

      Analyses were performed to support conclusions and improve paper quality.

      Weaknesses:

      Controls for how different hemagglutinin heads impact immunity versus the multivalency of the BOAS.

      Only 3 mice were used for most experiments.

      There were limited details on size exclusion data.

      We appreciate the reviewer’s comments and have made the following changes to the manuscript.

      (1) We recognize that deconvoluting the effect of including a diverse set of HA heads and multivalency in the BOAS immunogens is necessary to understand the impact on antigenicity. Therefore, we now include a cocktail of the identical eight HA heads used in the 8-mer and BOAS nanoparticle (NP) as an additional control group. While we observed similar HA binding titers relative to the 8-mer and BOAS NP groups, the sera elicited in the cocktail group were unable to neutralize any of the viruses tested; multivalency thus appears to be important for eliciting neutralizing responses.

      (2) We increased the sample size by repeated immunizations with n=5 mice, for a total of n=8 mice across two independent experiments.

      (3) We expanded the details on size exclusion data to include:

      a) extended chromatograms from Figure 2C as Supplemental Figure 3.

      b) additional details in the materials and methods section (lines 370-372):

      “Recovered proteins were then purified on a Superdex 200 (S200) Increase 10/300 GL (for trimeric HAs) or Superose 6 Increase 10/300 GL (for BOAS) size-exclusion column in Dulbecco’s Phosphate Buffered Saline (DPBS) within 48 hours of cobalt resin elution.”

      Reviewer #2 (Public Review):

      Summary:

      The authors describe a "beads-on-a-string" (BOAS) immunogen, where they link, using a non-flexible glycine linker, up to eight distinct hemagglutinin (HA) head domains from circulating and non-circulating influenzas and assess their immunogenicity. They also display some of their immunogens on ferritin NP and compare the immunogenicity. They conclude that this new platform can be useful to elicit robust immune responses to multiple influenza subtypes using one immunogen and that it can also be used for other viral proteins.

      Strengths:

      The paper is clearly written. While the use of flexible linkers has been used many times, this particular approach (linking different HA subtypes in the same construct resembling adding beads on a string, as the authors describe their display platform) is novel and could be of interest.

      Weaknesses:

      The authors did not compare to individual HAs immunized as cocktails and did not compare to other mosaic NPs published earlier. It is thus difficult to assess how their BOAS compare.

      Other weaknesses include the rationale as to why these subtypes were chosen and also an explanation of why there are different sizes of the HA1 construct (apart from expression). Have the authors tried other lengths? Have they expressed all of them as FL HA1?

      We appreciate the reviewer’s comments. We responded to the concerns below and modified the manuscript accordingly.

      (1) We recognize that including a “cocktail” control is important to understand how the multivalency present in a single immunogen affects the immune response. We now include an additional control group comprised of a mixture of the same eight HA heads used in the 8-mer and the BOAS nanoparticle (NP). While this cocktail elicited similar HA binding titers relative to the 8-mer and BOAS NP immunogens (Fig. 6G), there was no detectable neutralization of any of the viruses tested (Fig. 7).

      (2) In the introduction we reference other multivalent display platforms but acknowledge that distinct differences in their immunogen design platforms make direct comparisons to ours difficult, which is ultimately why we did not use them as comparators for our in vivo studies. Perhaps most directly relevant to our BOAS platform is the mosaic HA NP from Kanekiyo et al. (PMID 30742080). Here, HA heads, with similar boundaries to ours, were selected from historical H1N1 strains. These NPs, however, were significantly less antigenically diverse relative to our BOAS NPs as they did not include any group 2 (e.g., H7, H9) or B influenza HAs; restricting their multivalent display to group 1 H1N1s was likely an important factor in how they were able to achieve broad, neutralizing H1N1 responses. Additionally, Cohen et al. (PMID 33661993) used similarly antigenically distinct HAs in their mosaic NP, though these included full-length HAs with the conserved stem region, which likely has a significant impact on the elicited cross-reactive responses observed. Lastly, we reference Hills et al. (PMID 38710880), where the authors designed similar NPs with four tandemly-linked betacoronavirus receptor binding domains (RBDs) to make “quartets”. In contrast to our observations, the authors observed increased binding and neutralization titers following conjugation to protein-based NPs. We acknowledge potential differences between the studies, such as the antigen and larger VLP NP, that could lead to the different observed outcomes.

      (3) We intended to highlight the “plug-and-play” nature of the BOAS platform; theoretically any HA subtype could be interchanged into the BOAS. To that end, our rationale for selecting the HA subtypes in our proof-of-principle immunogen was to include an antigenically diverse set of circulating and non-circulating HAs that we could ultimately characterize with previously published subtype-specific antibodies that were also conformation-specific. In doing so, these diagnostic antibodies could confirm the presence and conformational integrity of each component. We intentionally did not include HA subtypes for which we did not have a conformation-specific antibody.

      The different sizes of the HA head domains were determined exclusively by expression of the recombinant protein. We have not attempted expression of full-length HA1 domains. Furthermore, we have not attempted to express the full-length HA (inclusive of HA1 and HA2) in our BOAS platform. The primary reason was to avoid including the conserved stem region of HA2, which may distract from the HA1 epitopes (e.g., receptor binding site, lateral patch) that can be engaged by broadly neutralizing antibodies. Additionally, the full-length HA is inherently trimeric and may not be as amenable to our BOAS platform as the monomeric HA1 head domain.

      Reviewer #3 (Public Review):

      This work describes the tandem linkage of influenza hemagglutinin (HA) receptor binding domains of diverse subtypes to create 'beads on a string' (BOAS) immunogens. They show that these immunogens elicit ELISA binding titers against full-length HA trimers in mice, as well as varying degrees of vaccine mismatched responses and neutralization titers. They also compare these to BOAS conjugated on ferritin nanoparticles and find that this did not largely improve immune responses. This work offers a new type of vaccine platform for influenza vaccines, and this could be useful for further studies on the effects of conformation and immunodominance on the resulting immune response.

      Overall, the central claims of immunogenicity in a murine model of the BOAS immunogens described here are supported by the data.

      Strengths included the adaptability of the approach to include several, diverse subtypes of HAs. The determination of the optimal composition of strains in the 5-BOAS that overall yielded the best immune responses was an interesting finding and one that could also be adapted to other vaccine platforms. Lastly, as the authors discuss, the ease of translation to an mRNA vaccine is indeed a strength of this platform.

      One interesting and counter-intuitive result is the high levels of neutralization titers seen in vaccine-mismatched, group 2 H7 in the 5-BOAS group that differs from the 4-BOAS with the addition of a group 1 H5 RBD. At the same time, no H5 neutralization titers were observed for any of the BOAS immunogens, yet they were seen for the BOAS-NP. Uncovering where these immune responses are being directed and why these discrepancies are being observed would constitute informative future work.

      There are a few caveats in the data that should be noted:

      (1) 20 ug is a pretty high dose for a mouse and the majority of the serology presented is after 3 doses at 20 ug. By comparison, 0.5-5 ug is a more typical range (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6380945/, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9980174/). Also, the authors state that 20 ug per immunogen was used, including for the BOAS-NP group, which would mean that the BOAS-NP group was given a lower gram dose of HA RBD relative to the BOAS groups.

      We agree that this is on the “upper end” of recombinant protein dose. While we did not do a dose-response, we now include serum analyses after a single prime. The overall trends and reactivity to matched and mis-matched BOAS components remained similar across d28 and d42. However, the differences between the BOAS and BOAS NP groups and the mixture group were more pronounced at d28, which reinforces our observation that the multivalency of the HA heads is necessary for eliciting robust serum responses to each component. These data are included in Supplemental Figure 5, and we’ve modified the text (lines 185-187) to include:

      “Similar binding trends were also observed with d28 serum, though the difference between the 8mer and mix groups was more pronounced at d28 (Supplemental Figure 5).”

      Additionally, we acknowledge that there is a size discrepancy between the BOAS NP and the largest BOAS, leading to an approximately 15-fold difference on a per-mole basis of the BOAS immunogen. The smallest and largest BOAS also differ by ~2.5-fold on a per-mole basis; this could favor the overall amount of the smaller immunogens. However, because vaccine doses are typically calculated on a mg per kg basis, we did not calculate doses on a molar basis for this study. Any promising immunogens will be evaluated in a dose-response study to optimize elicited responses.
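
      To illustrate the per-mole argument above, the following is a minimal sketch of how a fixed mass dose translates into different molar doses for constructs of different size. The molecular weights are hypothetical placeholders chosen only for illustration; they are not values reported in the manuscript or in this response.

```python
def dose_pmol(mass_ug: float, mw_kda: float) -> float:
    """Convert a mass dose to picomoles.

    micrograms / kilodaltons gives nanomoles; multiply by 1000 for picomoles.
    """
    return mass_ug / mw_kda * 1000.0


MASS_UG = 20.0  # fixed mass dose per immunization, as stated in the review

# HYPOTHETICAL molecular weights (kDa), for illustration only.
hypothetical_mw_kda = {
    "smaller construct": 80.0,
    "larger construct": 200.0,
}

doses = {name: dose_pmol(MASS_UG, mw) for name, mw in hypothetical_mw_kda.items()}

# At a fixed mass dose, the molar fold difference equals the ratio of the
# molecular weights (larger / smaller).
fold = doses["smaller construct"] / doses["larger construct"]
```

      Under these assumed weights, the smaller construct is delivered at a proportionally higher molar amount, which is the effect the response describes.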

      (2) Serum was pooled from all animals per group for neutralization assays, instead of testing individual animals. This could mean that a single animal with higher immune responses than the rest in the group could dominate the signal and potentially skew the interpretation of this data.

      We repeated the neutralization assays with data points for individual mice. There does appear to be variability in the immune response between mice. This is most noticeable for responses to the H5 component. We are currently assessing what properties of our BOAS immunogen might contribute to the variability across individual mice.

      (3) In Figure S2, it looks like an apparent increase in MW by changing the order of strains here, which may be due to differences in glycosylation. Further analysis would be needed to determine if there are discrepancies in glycosylation amongst the BOAS immunogens and how those differ from native HAs.

      There does appear to be a relatively small difference in MW between the two BOAS configurations shown in Figure S2. This could be due to differences in glycosylation, as the reviewer points out, and in future studies, we intend to assess the influence of native glycosylation on antibody responses elicited by our BOAS immunogens.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major Concerns

      (1) From Figure 2D-E, it looks like BOAS are forming clusters, rather than a straight line. Do these form aggregates over time? Both at 4 degrees over a few days or after freeze-thaw cycle(s)? It is unclear from the SEC methods how long after purification this was performed and stability should be considered.

      Due to the inherent flexibility of the Gly-Ser linker between each component, we do not anticipate that any rigidity would be imposed resulting in a “straight line”. Nevertheless, we appreciate the reviewer’s concern about the long-term stability of the BOAS immunogens. To address this, we include 1) the extended chromatograms from Figure 2C as Supplemental Figure 3 to show any aggregates present, 2) traces from up to 48 hours post-IMAC, and 3) chromatograms following a freeze-thaw cycle. Post-IMAC purification, there is a minor peak (<10% of total peak height) at ~9 mL corresponding to aggregation. Note that we excluded this aggregate fraction from immunizations. Following a freeze-thaw cycle, the BOAS maintain a homogeneous peak immediately (<24 hrs) after thawing, with no significant (<10%) aggregation or degradation peak. However, after ~1 week at 4C post-freeze-thaw, additional peaks in the chromatogram correspond to degradation of the BOAS.

      We modified the materials and methods section to state (lines 370-372)

      “Recovered proteins were then purified on a Superdex 200 (S200) Increase 10/300 GL (for trimeric HAs) or Superose 6 Increase 10/300 GL (for BOAS) size-exclusion column in Dulbecco’s Phosphate Buffered Saline (DPBS) within 48 hours of cobalt resin elution.”

      We commented on BOAS stability in the results section (lines 142-148)

      “Following SEC, affinity tags were removed with HRV-3C protease; cleaved tags, uncleaved BOAS, and His-tagged enzyme were removed using cobalt affinity resin and snap frozen in liquid nitrogen before immunizations. BOAS maintained monodispersity upon thawing, though over time, degradation was observed following longer term (>1 week) storage at 4C (Supplemental Figure 3). This degradation became more significant as BOAS increased in length (Supplemental Figure 3).”

      We also included in the discussion (lines 277-279):

      “Notably, for longer BOAS we observed degradation following longer term storage at 4C, which may reflect their overall stability.”

      (2) Figures 3-4 and 6-7, to make conclusions off of 3 mice per group is inappropriate. A sample size calculation should have been conducted and the appropriate number of mice tested. In addition, two independent mouse experiments should always be performed. Moreover, the reliability of the statistical tests performed seems unlikely, given the very small sample size.

      We agree that additional mice are necessary to make assessments regarding immunogenicity and cross-reactivity differences between the immunogens. To address this, we repeated the immunization with 5 additional mice, for a total of n=8 mice over two independent experiments. We incorporated these data into Figure 3B-D, as well as an additional Figure 3E (see below). We also now report the log-transformed endpoint titer (EPT) values rather than reciprocal EC50 values and added clarity to statistical analyses used. We have added the following lines to the methods section

      lines 427-431:

      “Serum endpoint titer (EPT) were determined using a non-linear regression (sigmoidal, four-parameter logistic (4PL) equation, where x is concentration) to determine the dilution at which dilution the blank-subtracted 450nm absorbance value intersect a 0.1 threshold. Serum titers for individual mice against respective antigens are reported as log transformed values of the EPT dilution.”
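
      As a rough illustration of the endpoint-titer definition quoted above, the sketch below inverts a four-parameter logistic (4PL) curve analytically to find the dilution at which the fitted absorbance crosses the 0.1 threshold. It assumes the four parameters have already been obtained from a fit (the authors report using Prism); the parameter values, function names, and the particular 4PL parameterization are assumptions made here for illustration and are not taken from the manuscript.

```python
import math


def four_pl(x, bottom, top, ec50, hill):
    """4PL curve: absorbance as a function of serum concentration x (1/dilution)."""
    return top + (bottom - top) / (1.0 + (x / ec50) ** hill)


def endpoint_titer(threshold, bottom, top, ec50, hill):
    """Reciprocal dilution at which the fitted curve equals `threshold`."""
    if not (bottom < threshold < top):
        raise ValueError("threshold must lie strictly between bottom and top")
    # Solve threshold = top + (bottom - top) / (1 + (x / ec50) ** hill) for x.
    ratio = (bottom - top) / (threshold - top) - 1.0
    x = ec50 * ratio ** (1.0 / hill)
    return 1.0 / x


# HYPOTHETICAL fitted parameters for one serum sample (illustration only).
params = dict(bottom=0.02, top=2.0, ec50=1e-3, hill=1.0)

ept = endpoint_titer(0.1, **params)  # reciprocal dilution at the 0.1 threshold
log_ept = math.log10(ept)            # log-transformed value, as reported per mouse
```

      If the fitted bottom plateau sits at or above the threshold, the curve never crosses it and no endpoint titer is defined; the sketch raises an error in that case rather than extrapolating.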

      lines 406-408:

      “C57BL/6 mice (Jackson Laboratory) (n=8 per group for 3-, 4-, 5-, 6-, 7-, and 8mer cohorts; n=5 for BOAS NP, NP, and mix cohorts) were immunized with 20µg of BOAS immunogens of varying length and adjuvanted with 50% Sigma Adjuvant for a total of 100µL of inoculum.”

      lines 482-490:

      “Statistical Analysis

      Significance for ELISAs and microneutralization assays were determined using Prism (GraphPad Prism v10.2.3). ELISAs comparing serum reactivity and microneutralization and comparing >2 samples were analyzed using a Kruskal-Wallis test with Dunn’s post-hoc test to correct for multiple comparisons. Multiple comparisons were made between each possible combination or relative to a control group, where indicated. ELISAs comparing two samples were analyzed using a Mann-Whitney test. Significance was assigned with the following: * = p<0.05, ** = p<0.01, *** = p<0.001, and **** = p<0.0001. Where conditions are compared and no significance is reported, the difference was non-significant.”
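
      For readers unfamiliar with the rank-based test named above, here is a minimal pure-Python sketch of the Kruskal-Wallis H statistic with the standard tie correction. It is not the authors' analysis (they report using GraphPad Prism), it omits the p-value and Dunn's post-hoc comparisons, and the example data are hypothetical.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..N, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of groups."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)

    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)

    # Tie correction; this is zero (and H undefined) if every value is identical.
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1.0 - ties / (n_total ** 3 - n_total)
    return h / correction


# HYPOTHETICAL log-transformed titers for three cohorts (illustration only).
h_stat = kruskal_wallis_h([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
```

      H is compared against a chi-squared distribution with (number of groups - 1) degrees of freedom to obtain a p-value; with groups as small as n=5, that approximation is coarse.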

      (3) One critical control that is missing is a homogenous BOAS, for example, just linking one H1 on a BOAS. Does oligomerization and increasing avidity alone improve humoral immunity?

      We agree that this is an interesting point. However, to address the impact of oligomerization and avidity on humoral immunity, we now include an additional control with a cocktail of HA heads used in the 8mer. We have incorporated this into Figure 3A, 3D and 3E, Figure 6G, and Figure 7.

      Additionally, we have added the following lines in the manuscript:

      lines 38-40:

      “Finally, vaccination with a mixture of the same HA head domains is not sufficient to elicit the same neutralization profile as the BOAS immunogens or nanoparticles.”

      lines 105-106:

      “Additionally, we showed that a mixture of the same HA head components was not sufficient to recapitulate the neutralizing responses elicited by the BOAS or BOAS NP.”

      lines 169-172:

      “To determine immunogenicity of each BOAS immunogen, we performed a prime-boost-boost vaccination regimen in C57BL/6 mice at two-week intervals with 20µg of immunogen and adjuvanted with Sigma Adjuvant (Figure 3A). We compared these BOAS to a control group immunized with a mixture of the eight HA heads present in the 8mer.”

      lines 265-267:

      “There were qualitatively immunodominant HAs, notably H4 and H9, and these were relatively consistent across BOAS in which they were a component. This effect was reduced in the mix cohort.”

      (4) While some cross-reactivity is likely (Figure 6G), there is considerable loss of binding when there is a mismatch. Of the antibodies induced, how much of this is strain-specific? For example, how well do serum antibodies bind to a pre-2009 H1?

      We agree with the reviewer that there is a considerable loss of binding when there is a mismatched HA component. To better understand this and incorporate a mismatched strain into our analysis of the 8mer and BOAS NP, we looked at serum binding titers to a pre-2009 H1, H1/Solomon Islands/2006, and an antigenically distinct H3, H3/Hong Kong/1968. We have incorporated this data into Figures 3D, 3E, 6F and 6G. We observed relatively high titers against both a mismatched H1 and H3, indicating that the BOAS maintain high titers against subtype-specific strains that are conserved over considerable antigenic distance. However, this was similar in the mixture group, indicating that this may not be specific to oligomerization of BOAS immunogens.

      We added the following to the methods section:

      lines 357-361

      “Head subdomains from these HAs were used in the BOAS immunogens, and full-length soluble ectodomain (FLsE) trimers were used in ELISAs. Additional H1 (H1/A/Solomon Islands/3/2006) and H3 (H3/A/Hong Kong/1/1968) FLsEs were used in ELISAs as mismatched, antigenically distinct HAs for all BOAS.”

      Minor Concerns

      (1) Line 44-46, the deaths per year are almost exclusively due to seasonal influenza outbreaks caused by antigenically drifted viruses in humans, not those spilling over from avian sp. and swine. For accuracy, please adjust this sentence.

      We have adjusted lines 45-48 to say “This is largely a consequence of viral evolution and antigenic drift as it circulates seasonally within humans and ultimately impacts vaccine effectiveness. Additionally, the chance for spillover events from animal reservoirs (e.g., avian, swine) is increasing as population and connectivity also increase.”

      (2) Figure 4D-E, provide a legend for what the symbols indicate, or simply just put the symbol next to either the homology score and % serum competition labels on the y-axis.

      We have included a legend in Figures 4D and 4E to distinguish between homology score and % serum competition.

      (3) I am a bit confused by the data presented in Figure 7. The figure legend says the two symbols represent technical replicates. How? Is one technical replicate of all the mice in a group averaged and that's what's graphed? If so, this is not standard practice. I would encourage the authors to show the average technical replicates of each animal, which is standard.

      We thank the reviewer for their suggestion, and we have revised Figure 7 such that each symbol represents a single animal for n=5 animals. We have also adjusted the figure caption to the following:

      “Figure 7: Microneutralization titers to matched and mis-matched virus- Microneutralization of matched and mis-matched pseudoviruses: H1N1 (green, top left), H3N2 (orange, top right), H5N1 (yellow, bottom left), and H7N9 viruses (pink, bottom right) with d42 serum. Solid bars below each plot indicate a matched sub-type, and striped bars indicate a mis-matched subtype (i.e. not present in the BOAS). NP negative controls were used to determine threshold for neutralization. Upper and lower dashed lines represent the first dilution (1:32) (for H1N1, H3N2, and H5N1) or neutralization average with negative control NP serum (H7N9), and the last serum dilution (1:32,768), respectively, and points at the dashed lines indicate IC50s at or outside the limit of detection. Individual points indicate IC50 values from individual mice from each cohort (n=5). The mean is denoted by a bar and error bars are +/- 1 s.d., * = p<0.05 as determined by a Kruskal-Wallis test with Dunn’s multiple comparison post hoc test relative to the mix group.”

      (4) Paragraphs 298-313, multiple studies are referred to but not referenced.

      We have added the following references to this section:

      (38) Kanekiyo, M. et al. Self-assembling influenza nanoparticle vaccines elicit broadly neutralizing H1N1 antibodies. Nature 498, 102–106 (2013).

      (48) Hills, R. A. et al. Proactive vaccination using multiviral Quartet Nanocages to elicit broad anti-coronavirus responses. Nat. Nanotechnol. 1–8 (2024) doi:10.1038/s41565-024-01655-9.

      (65) Jardine, J. et al. Rational HIV immunogen design to target specific germline B cell receptors. Science 340, 711–716 (2013).

      (66) Tokatlian, T. et al. Innate immune recognition of glycans targets HIV nanoparticle immunogens to germinal centers. Science 363, 649–654 (2019).

      (67) Kato, Y. et al. Multifaceted Effects of Antigen Valency on B Cell Response Composition and Differentiation In Vivo. Immunity 53, 548-563.e8 (2020).

      (68) Marcandalli, J. et al. Induction of Potent Neutralizing Antibody Responses by a Designed Protein Nanoparticle Vaccine for Respiratory Syncytial Virus. Cell 176, 1420-1431.e17 (2019).

      (69) Bruun, T. U. J., Andersson, A.-M. C., Draper, S. J. & Howarth, M. Engineering a Rugged Nanoscaffold To Enhance Plug-and-Display Vaccination. ACS Nano 12, 8855–8866 (2018).

      (70) Kraft, J. C. et al. Antigen- and scaffold-specific antibody responses to protein nanoparticle immunogens. Cell Reports Medicine 100780 (2022) doi:10.1016/j.xcrm.2022.100780.

      Reviewer #2 (Recommendations For The Authors):

      Can the authors define "detectable titers"?

      Maybe add a threshold value of reciprocal EC on the figure for each plot.

      We recognize the reviewer’s concern with reporting serum titers in this way, and we now report titers as endpoint titers (EPT), with a dotted line marking the first detectable dilution (1:50). We have also adjusted the methods section to reflect this change:

      (lines 427-431)

      “Serum endpoint titer (EPT) were determined using a non-linear regression (sigmoidal, four-parameter logistic (4PL) equation, where x is concentration) to determine the dilution at which dilution the blank-subtracted 450nm absorbance value intersect a 0.1 threshold. Serum titers for individual mice against respective antigens are reported as log transformed values of the EPT dilution.”

      It also appears that not all X-mer elicits an immune response against matched HA, e.g. for the 7 and 8 -mer. Not sure why the authors do not mention this. It could be due to too many HAs, not sure.

      We apologize for the confusion, and agree that our original method of reporting EC50 values does not reflect weak but present binding titers. Upon further analysis with additional mice, and after adjusting our method of reporting titers, it is easier to see in Figure 3D that all X-mer BOAS do indeed elicit detectable binding titers to matched HA components.

      It will be nice to add a conclusion to the cross-reactivity - again it appears that past 6-mer there has been a loss in cross-reactivity even though there are more subtypes on the BOAS.

      Also, the TI seemed to be the more conserved epitope targeted here.

      (Of note these two are mentioned in the discussion)

      We have updated the results section to include the following:

      (lines 281-294)

      “Based on the immunogenicity of the various BOAS and their ability to elicit neutralizing responses, it may not be necessary to maximize the number of HA heads into a single immunogen. Indeed, it qualitatively appears that the intermediate 4-, 5-, and 6mer BOAS were the most immunogenic and this length may be sufficient to effectively engage and crosslink BCR for potent stimulation. These BOAS also had similar or improved binding cross-reactivity to mis-matched HAs as compared to longer 7- or 8mer BOAS. Notably, the 3mer BOAS elicited detectable cross-reactive binding titers to H4 and H5 mismatched HAs in all mice. This observed cross-reactivity could be due to sequence conservation between the HAs, as H3 and H4 share ~51% sequence identity, and H1 and H2 share ~46% and ~62% overall sequence identity with H5, respectively (Supplemental Figure 6). Additionally, the degree of surface conservation decreased considerably beyond the 5mer as more antigenically distinct HAs were added to the BOAS. These data suggest that both antigenic distance between HA components and BOAS length play a key role in eliciting cross-reactive antibody responses, and further studies are necessary to optimize BOAS valency and antigenic distance for a desired response.”

      Figure 5E, the authors could indicate which subtype each mab is specific to for those who are not HA experts. (They have them color-coded but it is hard to see because very small).

      The authors also do not explain why 3E5 does not bind well to H1, H2, H3, H4 4-mer BOA, etc...

      We apologize for the lack of clarity in this figure. We updated Figure 5E to indicate the subtype each antibody is specific for, and we now list the antibodies with their subtype and targeted epitope in the figure caption.

      Minor

      Figure 1B zoom looks like the line is hidden behind the structure - should come in front

      We adjusted the figure accordingly.

      Line 127 - whether the order

      Corrected

      What is the rationale for thinking that a different order will lead to a different expression and antigenic results?

      We thank the reviewer for this question. We did not necessarily anticipate a difference in protein expression based on BOAS order. We did, however, want to verify that our platform was indeed a “plug-and-play” platform in which we could readily exchange components and order. We do hypothesize that a different order may in fact lead to different antigenic results. We think that the conformation of the BOAS as well as the physical and antigenic distance of HA components may influence cross-linking efficiency of BCRs and lead to different antigenic results with different levels of cross-reactivity. For example, a BOAS design with a cluster of group 1 HAs followed by a cluster of group 2 HAs, rather than our roughly alternating pattern, could impact which HAs are in proximity to each other or could be potentially shielded in certain conformations, and thus could affect antigenic results. We expand on this rationale in the discussion in lines 310-314:

      “Further studies with different combinations of HAs could aid in understanding how length and composition influence epitope focusing. For example, a BOAS design with a cluster of group 1 HAs followed by a cluster of group 2 HAs, rather than our roughly alternating pattern, could impact which HAs are in close proximity to one another or could be potentially shielded in certain conformations, and thus could affect antigenic results.”

      Maybe list HA#1 HA#2 HA#3 instead of HA1, HA2, HA3 to make sure it is not confounded with HA1 and HA2

      We agree that this may be confusing for readers, and have adjusted Figure 1C to show HA#1, HA#2, etc.

      For nsEM, do the authors have 2D classes and even 3D reconstructions? Line 148-149: maybe or just because there are more HAs.

      We did not obtain 2D classes or 3D reconstructions of these BOAS. However, we do agree with the reviewer that the collapsed/rosette structure of the 8mer BOAS may be a consequence of the additional HA heads as well as the flexible Gly-Ser linkers between the components. We have added clarity to our statement in the discussion, which now reads:

      lines 154-156:

      “This is likely a consequence of the flexible GSS linker separating the individual HA head components as well as the addition of significantly more HA head components to the construct.”

      Line 153 " interface-directed" - what does this mean?

      We apologize for any confusion. We intend “interface-directed” to refer to antibodies that engage the trimer interface (TI) epitope between HA protomers. We have adjusted the manuscript to use the same terminology throughout, i.e. trimer interface or its abbreviation, TI.

      For Figure 2 F - do you have a negative control? Usually one does not determine an ELISA KD, it is not very accurate but shows binding in terms of OD value.

      We did include a negative control, MEDI8852, a stem-directed antibody, though it was not shown in the figure because we observed no binding, as expected. This negative control antibody was also used in Figure 5E for characterizing the BOAS NPs, and also shows no binding. We recognize that in an ELISA the KD is an equilibrium measurement and that we do not report kinetic measurements as determined by a method such as bio-layer interferometry (BLI), and have thus adjusted the figure caption to denote the values as “apparent KD values”.

      Line 169 - reads strangely, "BOAS-elicited serum, regardless of its length, reacted". The length is that of the immunogen, not the serum.

      We agree that this statement is unclear, and we have modified the sentence to read:

      lines 177-178:

      “Each of the BOAS, regardless of its length, elicited binding titers to all matched full-length HAs representing individual components (Figure 3D).”

      What is the adjuvant used (add in results)?

      We used Sigma adjuvant for all immunizations, and have included this information in the results section:

      lines 169-171:

      “To determine immunogenicity of each BOAS, we performed a prime-boost-boost vaccination regimen in C57BL/6 mice at two-week intervals with 20µg of immunogen and adjuvanted with Sigma Adjuvant (Figure 3A).”

      This information is also included in the methods section in lines 406-412.

      Line 178 - remove " across"

      We have removed the word “across” in this sentence and replaced it with “on” (line 194)

      Trimer interface and interface epitopes are used interchangeably - maybe keep it as trimer interface to be more precise

      As stated above, we have adjusted the manuscript to use the same term throughout, i.e., trimer interface or its abbreviation, TI.

      Line 221 - no figure 6H (6G?)

      We apologize for this typo and have corrected to Figure 6G (line 231)

      Reviewer #3 (Recommendations For The Authors):

      (1) Since 20 ug x3 doses is quite a high amount of vaccine, differences between immunogens may become blurred. Thus, it may be informative to compare post-prime serology for all immunogens or select immunogens to compare to the post-3rd dose data.

      We agree with the reviewer that this is on the upper end of vaccine dose and thus we explored the serum responses after a single boost. The overall trends and reactivity to matched and mis-matched BOAS components remained similar across days d28 and d42. However, the differences between the BOAS and BOAS NP groups and the mixture group were more pronounced at d28, which bolsters our claim that the presentation of the HA heads is important for eliciting strong serum responses to all components. We have included this data in Supplemental Figure 5, and have acknowledged this in the text:

      lines 185-187:

      “Similar binding trends were also observed with d28 serum, though the difference between the 8mer and mix groups was more pronounced at d28 (Supplemental Figure 5).”

      (2) Significance statistics for all immunogenicity data should be added and discussed; it is particularly absent in Figures 3D and 7.

      We have added statistical analyses to Figure 3 and Figure 7 to reflect changes in immunogenicity. We have also added the following to the methods section:

      lines 482-490:

      “Statistical Analysis

      Significance for ELISAs and microneutralization assays was determined using either a Mann-Whitney test or a Kruskal-Wallis test with Dunn’s post-hoc test in Prism (GraphPad Prism v10.2.3) to correct for multiple comparisons. Multiple comparisons were made between each possible combination or relative to a control group, where indicated. Significance was assigned with the following: * = p<0.05, ** = p<0.01, *** = p<0.001, and **** = p<0.0001. Where conditions are compared and no significance is reported, the difference was non-significant.”
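
      As a hedged illustration of the two-group case and the star notation in this methods text, the sketch below computes a Mann-Whitney U statistic with an exact two-sided permutation p-value and maps the p-value to the asterisk labels quoted above. It is not a reproduction of how Prism computes its p-values, it omits the Kruskal-Wallis and Dunn's steps entirely, and the function names, the "ns" label and the example titers are all hypothetical.

```python
from itertools import combinations


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_exact(x, y):
    """Return (U for x, exact two-sided p) by enumerating all group relabelings."""
    ranks = average_ranks(list(x) + list(y))
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2

    def u_of(indices):
        return sum(ranks[i] for i in indices) - n1 * (n1 + 1) / 2

    u_obs = u_of(range(n1))
    deviation = abs(u_obs - mean_u)
    total = extreme = 0
    for indices in combinations(range(n1 + n2), n1):
        total += 1
        if abs(u_of(indices) - mean_u) >= deviation - 1e-12:
            extreme += 1
    return u_obs, extreme / total


def significance_stars(p):
    """Map a p-value to the asterisk notation; 'ns' is an assumed label."""
    for cutoff, label in ((0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*")):
        if p < cutoff:
            return label
    return "ns"


# Hypothetical log-transformed titers for two groups of five mice.
group_a = [2.1, 2.3, 2.4, 2.6, 2.8]
group_b = [3.0, 3.2, 3.5, 3.6, 3.9]
u_stat, p_value = mann_whitney_exact(group_a, group_b)
print(u_stat, round(p_value, 4), significance_stars(p_value))
```

      Enumerating every relabeling is only practical for small groups such as the five-mouse example here; larger groups would normally use a normal approximation or a statistics library.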

      (3) Figure 2F: the figure has K03.12 listed for the H3-specific mAb and in the main text, but the caption says 3E5 - is the 3E5 in the caption a typo? 3E5 is listed for the competition ELISAs as an RBS mAb, but its binding site is distal to the RBS at residues 165-170 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9787348/), H7.167 binds in the RBS periphery and not directly within the RBS, and the epitope for P2-D9 is undetermined/not presented. This could mean that there is actually a higher proportion of RBS-directed antibodies than what is determined from this serum competition data. Also, reference to these as 'RBS-directed' in the serum competition methods section should be revised for accuracy.

      We sincerely apologize for this error and the resulting confusion. 3E5 in the caption is incorrect and should be K03.12 (https://www.rcsb.org/structure/5W08) and does engage the receptor binding site. We also apologize for the oversight that H7.167 is in the RBS periphery and not directly in the RBS. The additional P2-D9 in the panel of RBS-directed antibodies was also in error, as we do not believe it is RBS-directed, but is indeed H4 specific. We also included a reference to the paper and immunogen that elicited this antibody. We agree that this indicates that there could be a higher proportion of RBS-directed antibodies in the serum and have modified the text in the results and methods sections to read:

      lines 300-306:

      “Notably, this proportion is approximate, as at the time of reporting, antibodies that bind the receptor binding site of all components were not available. RBS-directed antibodies to the H4 and H9 component were not available, and the RBS-directed antibodies used targeting the other HA components have different footprints around the periphery of the RBS. Additionally, there are currently no reported influenza B TI-directed antibodies in the literature. Therefore, this may be an underestimate of the serum proportion focused to the conserved RBS and TI epitopes.”

      lines 435-439:

      “Following blocking with BSA in PBS-T, blocking solution was discarded and 40µL of either DPBS (no competition control), a cocktail of humanized antibodies targeting the RBS and periphery (5J8, 2G1, K03.12, H5.3, H7.167, H1209), a cocktail of humanized TI-directed antibodies (S5V2-29, D1 H1-17/H3-14, D2 H1-1/H3-1), or a negative control antibody (MEDI8852) were added at a concentration of 100µg/mL per antibody.”

      (4) Only nsEM data is shown for the 3-BOAS and 8-BOAS, where differences in morphology were seen between these longer and shorter proteins. Including nsEM images for all BOAS immunogens may show trends in morphology or organization that could correlate with immune responses, e.g. if the 5-BOAS also forms a higher proportion of rosette-like structures, while the 4-BOAS is still a mix between extended and rosette-like, this could be a factor in the better immune responses seen for 5-BOAS.

      We appreciate the reviewer’s suggestion for further analysis of morphology between the intermediate BOAS sizes. We agree that the relationship between BOAS length and morphology should be explored more in depth, and we intend to do so in future studies and to also vary linker length and rigidity.

    1. eLife Assessment

      This important study introduces a novel split-belt treadmill learning task to reveal distinct and parallel learning sub-components of gait adaptation: slow and gradual error-based perceptual realignment, and a more deliberate and flexible "stimulus-response" style learning process. The behavioural results convincingly support the presence of a non-error-based learning process during continuous movements, and the computational modelling provides comprehensive further evidence for establishing this learning process. These results will be of interest for the broader motor learning community.

    2. Reviewer #1 (Public review):

      Summary:

      Rossi et al. asked whether gait adaptation is solely a matter of slow perceptual realignment or if it also involves fast/flexible stimulus-response mapping mechanisms. To test this, they conducted a series of split-belt treadmill experiments with ramped perturbations, revealing behavior indicative of a flexible, automatic stimulus-response mapping mechanism.

      Strengths:

      (1) The study includes a perceptual test of leg speed, which correlates with the perceptual realignment component of motor aftereffects. This indicates that changes in motor performance are not fully accounted for by perceptual realignment.

      (2) The study evaluates the possible contributions of explicit strategy using a framework (Tsay et al., 2024) and provides evidence for minimal strategy involvement in split-belt adaptation through subjective reports.

      (3) The study incorporates qualitatively distinct, hypothesis-driven models of adaptation and proposes a new framework that integrates these mechanisms. Relatedly, the study considers a range of alternative models, demonstrating that perceptual recalibration and remapping uniquely explain the patterns of behavior and aftereffects, ruling out models that focus solely on a single process (e.g., PReMo, PEA, memory of errors, optimal feedback control) and others that do not incorporate remapping (dual rate state space models).

    3. Reviewer #2 (Public review):

      Recent findings in the field of motor learning have pointed to the combined action of multiple mechanisms that potentially contribute to changes in motor output during adaptation. A nearly ubiquitous motor learning process occurs via the trial-by-trial compensation of motor errors, often attributed to cerebellar-dependent updating. This error-based learning process is slow and largely unconscious. Additional learning processes that are rapid (e.g., explicit strategy-based compensation) have been described in discrete movements like goal-directed reaching adaptation. However, the role of rapid motor updating during continuous movements such as walking has been either underexplored or inconsistent with those found during adaptation of discrete movements. Indeed, previous results have largely discounted the role of explicit strategy-based mechanisms for locomotor learning. In the current manuscript, Rossi et al. provide convincing evidence for a previously unknown rapid updating mechanism for locomotor adaptation. Unlike the now well-studied explicit strategies employed during reaching movements, the authors demonstrate that this stimulus-response mapping process is largely unconscious. The authors show that in approximately half of subjects, the mapping process appears to be memory based while the remainder of subjects appear to perform structural learning of the task design. The participants that learned using a structural approach had the capability to rapidly generalize to previously unexplored regions of the perturbation space.

      One result that will likely be particularly important to the field of motor learning is the authors' quite convincing correlation between the magnitude of proprioceptive recalibration and the magnitude of error-based updating. This result beautifully parallels results in other motor learning tasks and appears to provide a robust marker for the magnitude of the mapping process (by means of subtracting off the contribution of error-based motor learning). This is a fascinating result with implications for the motor learning field well beyond the current study.

      A major strength of this manuscript is the large sample size across experiments and the extent of replication performed by the authors in multiple control experiments.

      Finally, I commend the authors on extending their original observations via Experiment 2. While it seems that participants use a range of mapping mechanisms (or indeed a combination of multiple mapping mechanisms), future experiments may be able to tease apart why some subjects use memory versus structural mapping. A future ability to push subjects to learn structurally-based mapping rules has the potential to inform rehabilitation strategies.

      Overall, the manuscript is well written, the results are clear, and the data and analyses are convincing.

      Strengths:

      (1) Convincing behavioral data supporting the existence of multiple learning processes during split-belt adaptation. Further convincing correlations tying the extent of forward-model based adaptation with proprioceptive recalibration.

      (2) The authors test a veritable "zoo" of prior motor learning models to show that these models do not account for their behavioral results.

      (3) The authors develop a convincing alternative model (PM-ReMap) that appears to account for their behavioral results by explicitly modeling forward-model based adaptation in parallel with goal remapping.

    4. Reviewer #3 (Public review):

      Summary:

      In this work, Rossi et al. use a novel split-belt treadmill learning task to reveal distinct sub-components of gait adaptation. The task involved following a standard adaptation phase with a "ramp-down" phase that helped them dissociate implicit recalibration and more deliberate SR map learning. Combined with modeling and re-analysis of previous studies, the authors show multiple lines of evidence that both processes run simultaneously, with implicit learning saturating based on intrinsic learning constraints and SR learning showing sensitivity to a "perceptual" error. These results offer a parallel with work in reaching adaptation showing both explicit and implicit processes contributing to behavior; however, in the case of gait adaptation the deliberate learning component does not appear to be strategic but is instead a more implicit SR learning process.

      The authors have done a commendable job responding to my comments and critiques. I have updated the S/W below to reflect that.

      Strengths:

      - The task design is very clever and the "ramp down" phase offers a novel way to attempt to dissociate competing models of multiple processes in gait adaptation

      - The analyses are thorough, as is the re-analysis of multiple previous data sets; the expanded modeling analyses are strong

      - The querying of perception of the different relative belt speeds is a very nice addition, allowing the authors to connect different learning components with error perception

      - The conceptual framework is compelling, highlighting parallels with work in reaching but also emphasizing differences, especially w/r/t SR learning versus strategic behaviors. Thus the discovery of an SR learning process in gait adaptation would be both novel and also help conjoin different siloed subfields of motor learning research.

      Weaknesses:

      - The expanded modeling analyses are useful although the SR process still seems somewhat mysterious (is it explicit/implicit? how exactly is it interacting with re-calibration?); however, understanding this system more could be a fruitful topic for future work

      - The sample size for the individual difference analysis is somewhat modest

    1. eLife Assessment

      This study presents a useful reassessment of the potential role of dendritic cell-derived IL-27 p28 cytokine in the functional maturation of CD4+CD8- thymocytes, and CD4+ recent thymic emigrants. The evidence supporting the claims of the authors is solid and serves to reaffirm what has been previously described, with the overall advance in understanding the mechanism(s) responsible for the intrathymic functional programming of CD4+ T cells being limited.

    2. Reviewer #1 (Public Review):

      Summary:

      Zhang et al. demonstrate that CD4+ single positive (SP) thymocytes, CD4+ recent thymic emigrants (RTE), and CD4+ T naive (Tn) cells from Cd11c-p28-flox mice, which lack IL-27p28 selectively in Cd11c+ cells, exhibit a hyper-Th1 phenotype instead of the expected hyper Th2 phenotype. Using IL-27R-deficient mice, the authors confirm that this hyper-Th1 phenotype is due to IL-27 signaling via IL-27R, rather than the effects of monomeric IL-27p28. They also crossed Cd11c-p28-flox mice with autoimmune-prone Aire-deficient mice and showed that both T cell responses and tissue pathology are enhanced, suggesting that SP, RTE, and Tn cells from Cd11c-p28-flox mice are poised to become Th1 cells in response to self-antigens. Regarding mechanism, the authors demonstrate that SP, RTE, and Tn cells from Cd11c-p28-flox mice have reduced DNA methylation at the IFN-g and Tbx21 loci, indicating 'de-repression', along with enhanced histone tri-methylation at H3K4, indicating a 'permissive' transcriptional state. They also find evidence for enhanced STAT1 activity, which is relevant given the well-established role of STAT1 in promoting Th1 responses, and surprising given IL-27 is a potent STAT1 activator. This latter finding suggests that the Th1-inhibiting property of thymic IL-27 may not be due to direct effects on the T cells themselves.

      Strengths:

      Overall the data presented are high quality and the manuscript is well-reasoned and composed. The basic finding - that thymic IL-27 production limits the Th1 potential of SP, RTE, and Tn cells - is both unexpected and well described.

      Weaknesses from the original round of review:

      A credible mechanistic explanation, cellular or molecular, is lacking. The authors convincingly affirm the hyper-Th1 phenotype at epigenetic level but it remains unclear whether the observed changes reflect the capacity of IL-27 to directly elicit epigenetic remodeling in developing thymocytes or knock-on effects from other cell types which, in turn, elicit the epigenetic changes (presumably via cytokines). The authors propose that increased STAT1 activity is a driving force for the epigenetic changes and resultant hyper-Th1 phenotype. That conclusion is logical given the data at hand but the alternative hypothesis - that the hyper-STAT1 response is just a downstream consequence of the hyper-Th1 phenotype - remains equally likely. Thus, while the discovery of a new anti-inflammatory function for IL-27 within the thymus is compelling, further mechanistic studies are needed to advance the finding beyond phenomenology.

    3. Reviewer #2 (Public Review):

      Summary:

      Naïve CD4 T cells in CD11c-Cre p28-floxed mice express highly elevated levels of proinflammatory IFNg and the transcription factor T-bet. This phenotype turned out to be imposed by thymic dendritic cells (DCs) during CD4SP T cell development in the thymus [PMID: 23175475]. The current study affirms these observations, first, by developmentally mapping the IFNg dysregulation to newly generated thymic CD4SP cells [PMID: 23175475], second, by demonstrating increased STAT1 activation being associated with increased T-bet expression in CD11c-Cre p28-floxed CD4 T cells [PMID: 36109504], and lastly, by confirming IL-27 as the key cytokine in this process [PMID: 27469302]. The authors further demonstrate that such dysregulated cytokine expression is specific to the Th1 cytokine IFNg, without affecting the expression of the Th2 cytokine IL-4, thus proposing a role for thymic DC-derived p28 in shaping the cytokine response of newly generated CD4 helper T cells. Mechanistically, CD4SP cells of CD11c-Cre p28-floxed mice were found to display epigenetic changes in the Ifng and Tbx21 gene loci that were consistent with increased transcriptional activities of IFNg and T-bet mRNA expression. Moreover, in autoimmune Aire-deficiency settings, CD11c-Cre p28-floxed CD4 T cells still expressed significantly increased amounts of IFNg, exacerbating the autoimmune response and disease severity. Based on these results, the investigators propose a model where thymic DC-derived IL-27 is necessary to suppress IFNg expression by CD4SP cells and thus would impose a Th2-skewed predisposition of newly generated CD4 T cells in the thymus, potentially relevant in autoimmunity.

      Strengths:

      Experiments are well-designed and executed. The conclusions are convincing and supported by the experimental results.

      Weaknesses from the original round of review:

      The premise of the current study is confusing as it tries to use the CD11c-p28 floxed mouse model to explain the Th2-prone immune profile of newly generated CD4SP thymocytes. Instead, it would be more helpful to (1) give full credit to the original study which already described the proinflammatory IFNg+ phenotype of CD4 T cells in CD11c-p28 floxed mice to be mediated by thymic dendritic cells [PMID: 23175475], and then, (2) build on that to explain that this study is aimed to understand the molecular basis of the original finding.

      In its essence, this study mostly rediscovers and reaffirms previously reported findings, but with different tools. While the mapping of epigenetic changes in the IFNg and T-bet gene loci and the STAT1 gene signature in CD4SP cells are interesting, these are expected results, and they only reaffirm what would be assumed from the literature. Thus, there is only incremental gain in new insights and information on the role of DC-derived IL-27 in driving the Th1 phenotype of CD4SP cells in CD11c-p28 floxed mice.

      Altogether, the major issues of this study remain unresolved:

      (1) It is still unclear why the p28-deficiency in thymic dendritic cells would result in increased STAT1 activation in CD4SP cells. Based on their in vitro experiments with blocking anti-IFNg antibodies, the authors conclude that it is unlikely that the constitutive activation of STAT1 would be a secondary effect due to autocrine IFNg production by CD4SP cells. However, this possibility should be further tested with in vivo models, such as Ifng-deficient CD11c-p28 floxed mice. Alternatively, is this an indirect effect by other IFNg producers in the thymus, such as iNKT cells? It is necessary to explain what drives the STAT1 activation in CD11c-p28 floxed CD4SP cells in the first place.

      (2) It is also unclear whether CD4SP cells are the direct targets of IL-27 p28. The cell-intrinsic effects of IL-27 p28 signaling in CD4SP cells should be assessed and demonstrated, ideally by CD4SP-specific deletion of IL-27Ra, or by establishing bone marrow chimeras of IL-27Ra germline KO mice.

      [Editors' note: The resubmitted paper was minimally revised, and many of the initial concerns remain unresolved.]

    1. eLife Assessment

      This paper presents an important theoretical exploration of how a flexible protein domain with multiple DNA binding sites may simultaneously provide stability to the DNA-bound state and enable exploration of the DNA strand. The authors present compelling evidence that their findings have implications for the way intrinsically disordered regions (IDRs) of transcription factor (TF) proteins can enhance their ability to efficiently find their binding site on the DNA from which they exert control over the transcription of their target gene. The paper concludes with a comparison of model predictions with experimental data which gives further support to the proposed model.

    2. Reviewer #1 (Public review):

      Summary:

      The authors define, from first principles, the design principles that should guide the optimisation of transcription factors with intrinsically disordered regions (IDRs). The first part of the study defines the following principles to optimise the binding affinity to the genome in the receiving region, called the "antenna": (i) reduce the target to IDR-binding distance on the genome; (ii) optimise the distance between the DNA binding domain and the binding sites on the IDR to be as close as possible to the distance between their binding sites on the genome; (iii) keep the same number of binding sites and their targets and modulate this number with binding strength, reducing them with increased strength; (iv) modulate the binding strength to be above a threshold that depends on the proportion of IDR binding sites in the antenna. The second part defines the scaling of the search time as a function of key parameters such as the volume of the nucleus and the size of the antenna, derived as a combination of a 3D search for the antenna and 1D "octopusing" on the antenna. The third part focuses on validation, where the current results are compared to binding probability data from a single experiment, and new experiments are proposed to further validate the model and to test designed transcription factors.

      Strengths:

      The strength of this work is that it provides simple, interpretable and testable theoretical conclusions. This will allow the derived design principles to be understood, evaluated and improved in the future. The theoretical derivations are rigorous. The authors provide a comparison to experiments and also propose new experiments to be performed in the future; this is of great value, since it will set the stage for and inspire new experimental techniques. Further, the field needs inspiration and motivation to develop these techniques, since they are required to benchmark the transcription factors designed with the methods presented in this paper, as well as to develop novel data-based or in vivo methods that would greatly benefit the field. As such, this paper is a fundamental contribution to the field.

      Weaknesses:

      The model assumption that the interaction between the transcription factor and the DNA outside of the antenna region is negligible is probably too strong for many/most transcription factors, particularly in organisms with longer genomes than yeast. The model presents many first principles to drive the design of transcription factors, but arguably other principles and mechanisms might also play a role by being beneficial to the search and binding process. Specifically: (i) a role of the IDR in complex formation and cooperativity between multiple transcription factors, (ii) ability of the IDR to do parallel searching based on multiple DNA binding sites spaced by disordered regions, (iii) affinity of the IDR for specific nuclear compartments, reducing the search time, etc. The paper would be improved by a discussion of alternative mechanisms.

    3. Reviewer #2 (Public review):

      Summary:

      This is an interesting theoretical exploration of how a flexible protein domain, which has multiple DNA-binding sites along it, affects the stability of the protein-DNA complex. It proposes a mechanism ("octopusing") for protein doing a random walk while bound to DNA which simultaneously enables exploration of the DNA strand and stability of the bound state.

      Strengths:

      Stability of the protein-DNA bound state and the ability of the protein to perform 1D diffusion along the DNA are two properties of a transcription factor that are usually seen as being in opposition to each other. The octopusing mechanism is an elegant resolution of the puzzle of how both could be accommodated. This mechanism has interesting biological implications for the functional role of intrinsically disordered domains in transcription factor (TF) proteins. The authors show theoretically how these domains, if flexible and able to make multiple weak contacts with the DNA, can enhance the ability of the TF to efficiently find its binding site on the DNA, from which it exerts control over the transcription of its target gene. The paper concludes with a comparison of model predictions with experimental data which gives further support to the proposed model. Overall, this is an interesting and well executed theoretical paper that proposes an interesting idea about the functional role for IDR domains in TFs.

      Weaknesses:

      IDR domains are assumed flexible which I believe is not always the case. Also, I'm not sure how ubiquitous are the assumed binding sites on the DNA for multiple subdomains along the IDR. These assumptions though seem like interesting points of departure for further experiments.

    1. eLife Assessment

      This manuscript applies state-of-the-art techniques to define the cellular composition of the dorsal vagal complex in two rodent species (mice and rats). The result is an important resource that substantially advances our understanding of the dorsal vagal complex's role in the regulation of feeding and metabolism while also highlighting key differences between species. While most of the analyses in the manuscript provide convincing insight into the cellular architecture of the dorsal vagal complex, other aspects are incomplete and could be bolstered by additional evidence.

    2. Reviewer #1 (Public review):

      Summary:

      This paper uses state-of-the-art techniques to define the cellular composition and its complexity in two rodent species (mice and rats). The study is built on available datasets but extends those in a way that future research will be facilitated. The study will be of high impact for the study of metabolic control.

      Strengths:

      (1) The study is based on experiments that are combined with two exceptional data sets to provide compelling evidence for the cellular composition of the DVC.

      (2) The use of two rodent species is very useful.

      Weaknesses:

      There is no conceptual weakness, the performance of experiments is state-of-the-art, and the discussion of results is appropriate. One minor point that would further strengthen the data is a more distinct analysis of receptors that are characteristic of the different populations of neuronal and non-neuronal cells; this part could be improved. Currently, it is only briefly mentioned, e.g., line 585ff. See also lines 603ff; it is true that the previous studies lack some information about the neurotransmitter profile of cells, but combining all data sets should result in an analysis of the receptors as well, e.g. in the form of an easy-to-read table.

    3. Reviewer #2 (Public review):

      In this manuscript, Hes et al. present a comprehensive multi-species atlas of the dorsal vagal complex (DVC) using single-nucleus RNA sequencing, identifying over 180,000 cells and 123 cell types across five levels of granularity in mice and rats. Intriguingly, the analysis uncovered previously uncharacterized cell populations, including Kcnj3-expressing astrocytes, neurons co-expressing Th and Cck, and a population of leptin receptor-expressing neurons in the rat area postrema, which also express the progenitor marker Pdgfra. These findings suggest species-specific differences in appetite regulation. This study provides a valuable resource for investigating the intricate cellular landscape of the DVC and its role in metabolic control, with potential implications for refining obesity treatments targeting this hindbrain region.

      In line with previous work published by the PI, the topic is of clear scientific relevance, and the data presented in this manuscript are both novel and compelling. Additionally, the manuscript is well-structured, and the conclusions are robust and supported by the data. Overall, this study significantly enhances our understanding of the DVC and sheds light on key differences between rats and mice.

      I applaud the authors for the depth of their analysis. However, I have a few major concerns, comments, and suggestions that should be addressed.

      (1) If I understand the methodology correctly, mice were fasted overnight and then re-fed for 2 hours before being sacrificed (lines 91-92), which occurred 4 hours after the onset of the light phase (line 111). This means that the re-fed animals had access to, and consequently consumed, food when they typically would not. While I completely recognize that every timepoint has its limitations, the strong influence of the circadian rhythm on DVC gene expression (highlighted by the work published by Lukasz Chrobok), and the fact that the timing of food/eating is a potent Zeitgeber, might have an impact on the analysis and should be mentioned as a potential limitation in the discussion (along with citing Dr Chrobok's work). Could this (i.e., eating during a time when the animals are not "primed by their own circadian clock to eat") potentially explain why the meal-related changes in gene expression were relatively small?

      (2) In the Materials and Methods section, LiCl is mentioned as one of the treatment conditions; however, very little corresponding data are presented or discussed. Please include these results and elaborate on the rationale for selecting LiCl over other anorectic compounds.

      (3) The number of animals used differs significantly between species, which the authors acknowledge as a limitation in the discussion. Since the authors took advantage of previously published mouse data sets (Ludwig and Dowsett data sets), I wonder if the authors could also compare/integrate any currently available rat data set to partially address the sample size disparity.

      (4) Dividing cells in AP vs NTS vs DMX clusters and analyzing potential species differences would significantly enhance the quality of the manuscript, given the partially diverse functions of these regions. This could be done by leveraging existing published datasets that employed spatial transcriptomics or more classical methodologies (e.g., PMID: 39171288, PMID: 39629676, PMID: 38092916). I would be interested to hear the authors' perspective on the feasibility of such an analysis.

      (5) Given the manuscript's focus on feeding and metabolism, I believe a more detailed description and comparison of the transcription profile of known receptors, neurotransmitters, and neuropeptides involved in food intake and energy homeostasis between mice and rats would add value. Adding a curated list of key genes related to feeding regulation would be particularly informative.

    4. Reviewer #3 (Public review):

      Summary:

      This manuscript from Hes et al. provides a compelling resource for single-nucleus RNA sequencing data, with an emphasis on facilitating the integration of future data sets across mouse and rat data sets.

      Strengths:

      There are also several interesting findings that are highlighted, even though without a functional assay the importance remains unclear. However, the manuscript properly addresses where conclusions are speculative.

      As with other snRNA seq datasets the manuscript demonstrates convincingly an increased level of complexity, while other neuronal populations like Cck and Th neurons were reproduced. Several recent findings from other groups are well addressed and put into a new context, e.g., DMV expression of AgRP (and Hcrt) was found to result from non-coding sequences, co-localization of Cck/Th was identified in a small subset. These statements are informative.

      The integration of rat data into the mouse data sets is excellent, and the comparison of cellular groups is very detailed, with interesting differences between mouse and rat data.

      All data sets are easily accessible and usable on open platforms; this will be an impactful resource for other researchers.

      Weaknesses:

      The data analysis seems incomplete. The title indicates the integration of mouse and rat data into a unified rodent dataset. But the discrepancy of animal numbers (30 mice vs. 2 rats) does not fit well with that focus.

      On the other hand, the mouse group is further separated into different treatments to study genetic changes that are associated with distinct energy states of fed/fasting/refeeding responses. Yet, this aspect is not addressed in depth.

      While the authors find transcriptional changes in all neuronal and non-neuronal cell types, which is interesting, the verification of known transcriptional changes (e.g., cFos) is unaddressed. cFos is a common gene upregulated with refeeding that was surprisingly not investigated, even though this should be a strong marker of proper meal-induced neuronal activation in the DMV. This is a missed opportunity either to verify the data set or to highlight important limitations if that had been attempted without success.

      Additional considerations:

      (1) The focus on transmitter classification is highlighted, but surprisingly, the well-accepted distinction of GABAergic neurons by Slc32a1 was not used; instead, Gad1 and Gad2 were used as GABAergic markers. While this may be proper for the DMV, given numerous findings that Gad1/2 are not proper markers for GABAergic neurons and are often co-expressed in glutamatergic populations, this confound should have been addressed to make a case for if and why they would be proper markers in the DMV.

      (2) Figure S3 for anatomical localization of clusters is excellent, but several of the cluster gene names do not have a good signal in the DMV. Specifically, the mixed neurons that do not seem to have clear marker genes. What top markers (top 10?) were used to identify these anatomical locations? At least some examples should be shown for anatomical areas to support Figure S3.

      (3) Page 15, lines 410-411: "...could not find clusters sharing all markers with our neuronal classes...". Are the authors trying to say that the DMV has more diverse neurons than other brain sites? It seems not too unusual that the hypothalamus is different from the brainstem. Maybe this could be stated more clearly, and the importance of this could be clarified.

      (4) The finding of GIRK1 astrocytes is interesting, but the emphasis that this means these astrocytes are highly/more excitable is confusing. This was not experimentally addressed and should be put into context that astrocyte activation is very different from neuronal activation. This should be better clarified in the results and discussion.

      (5) The Pdgfra IHC as verification is great, but images are not very convincing in distinguishing the 2 (mouse) or 3 (rat) classes of cells. Why not compare Pdgfra and HuC/D co-localization by IHC and snRNAseq data (using the genes for HuC/D) in the mouse and in the rat? That would also clarify how specific HuC/D is for DMV neurons, or if it may also be expressed in non-neuronal populations.

    1. eLife Assessment

      This useful study presents computational analyses of over 5,000 predicted extant and ancestral nitrogenase structures. While the data and some analyses are solid, the study remains incomplete in demonstrating that the metrics used for comparing nitrogenase structures are statistically rigorous. The data generated in this study provide a vast resource that can serve as a starting point for functional studies of reconstructed and extant nitrogenases.

    2. Reviewer #1 (Public review):

      This was a clearly written manuscript that did an excellent job summarizing complex data. In this manuscript, Cuevas-Zuviría et al. use protein modeling to generate over 5,000 predicted structures of nitrogenase components, encompassing both extant and ancestral forms across different clades. The study highlights that key insertions define the various Nif groups. The authors also examined the structures of three ancestral nitrogenase variants that had been previously identified and experimentally tested. These ancestral forms were shown in earlier studies to exhibit reduced activity in Azotobacter vinelandii, a model diazotroph.

      This work provides a useful resource for studying nitrogenase evolution. However, its impact is somewhat limited due to a lack of evidence linking the observed structural differences to functional changes. For example, in the ancestral nitrogenase structures, only a small set of residues (lines 421-431) were identified as potentially affecting interactions between nitrogenase components. Why didn't the authors test whether reverting these residues to their extant counterparts could improve nitrogenase activity of the ancestral variants?

      Additionally, the paper feels somewhat disconnected. The predicted nitrogenase structures discussed in the first half of the manuscript were not well integrated with the findings from the ancestral structures. For instance, do the ancestral nitrogenase structures align with the predicted models? This comparison was never explicitly made and could have strengthened the study's conclusions.

    3. Reviewer #2 (Public review):

      This work aims to study the evolution of nitrogenases, understanding how their structure and function adapted to changes in the environment, including oxygen levels and changes in metal availability.

      The study predicts > 5000 structures of nitrogenases, corresponding to extant, ancestral, and alternative ancestral sequences. It is observed that structural variations in the nitrogenases correlate with phylogenetic relationships. The amount of data generated in this study represents a massive undertaking that is certain to be a resource for the community. The study also provides strong insight into how structural evolution correlates with environmental and biological phenotypes.

      The challenge with this study is that all (or nearly all) of the quantitative analyses presented are based on RMSD calculations, many of which are under 2 angstroms. For all intents and purposes, two structures with RMSD < 2 angstroms could be considered 'structurally identical'. A lot of insight generated is based on minuscule differences in RMSD, for which it is not clear that they are significantly different. The suggestion would be to find a way to evaluate the RMSD metric and determine whether these values, as obtained for structures being compared, are reliable. Some options are provided in earlier studies: PMID: 11514933, PMID: 17218333, PMID: 11420449, PMID: 8289285 (and others).

      It could also be valuable to focus more on site-specific RMSDs rather than Global RMSDs. The high conservation in the nitrogenases likely ensures that the global RMSDs will remain low across the family. Focusing on specific regions might reveal interesting differences between clades that are more informative regarding the evolution of structure in tandem with environment/time.

    4. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      This was a clearly written manuscript that did an excellent job summarizing complex data.

      In this manuscript, Cuevas-Zuviría et al. use protein modeling to generate over 5,000 predicted structures of nitrogenase components, encompassing both extant and ancestral forms across different clades. The study highlights that key insertions define the various Nif groups. The authors also examined the structures of three ancestral nitrogenase variants that had been previously identified and experimentally tested. These ancestral forms were shown in earlier studies to exhibit reduced activity in Azotobacter vinelandii, a model diazotroph. This work provides a useful resource for studying nitrogenase evolution.

      However, its impact is somewhat limited due to a lack of evidence linking the observed structural differences to functional changes. For example, in the ancestral nitrogenase structures, only a small set of residues (lines 421-431) were identified as potentially affecting interactions between nitrogenase components. Why didn't the authors test whether reverting these residues to their extant counterparts could improve nitrogenase activity of the ancestral variants?

      We thank the reviewer for their thoughtful comments. We acknowledge that our current study is primarily focused on a computational exploration of the structural differences in both extant and ancestral nitrogenase variants, which allowed us to generate a comprehensive structural dataset. Although we did not carry out experimental reversion tests in this study, we agree that directly assessing the functional consequences of reverting the specific residues (lines 420 to 429) to their extant counterparts is an important next step to elucidate their functional role. Indeed, these findings provide a valuable foundation for our future work, which is designed to include experimental characterization of these variants and further elucidate the role of critical residues in nitrogenase activity and evolution. We believe that these experiments will offer the direct functional validation that the reviewer has rightly pointed out, and we look forward to reporting on these results in a future study.

      Additionally, the paper feels somewhat disconnected. The predicted nitrogenase structures discussed in the first half of the manuscript were not well integrated with the findings from the ancestral structures. For instance, do the ancestral nitrogenase structures align with the predicted models? This comparison was never explicitly made and could have strengthened the study's conclusions.

      We thank the reviewer for this suggestion. Our original analysis (previously shown in Figure S9, now Figure S10) included insights into structural alignment comparisons. In response, we have reorganized the results section (lines 351-355) to explicitly address this comparison.

      Reviewer #2 (Public review):

      This work aims to study the evolution of nitrogenases, understanding how their structure and function adapted to changes in the environment, including oxygen levels and changes in metal availability. The study predicts > 5000 structures of nitrogenases, corresponding to extant, ancestral, and alternative ancestral sequences. It is observed that structural variations in the nitrogenases correlate with phylogenetic relationships. The amount of data generated in this study represents a massive undertaking that is certain to be a resource for the community. The study also provides strong insight into how structural evolution correlates with environmental and biological phenotypes.

      The challenge with this study is that all (or nearly all) of the quantitative analyses presented are based on RMSD calculations, many of which are under 2 angstroms. For all intents and purposes, two structures with RMSD < 2 angstroms could be considered 'structurally identical'. A lot of insight generated is based on minuscule differences in RMSD, for which it is not clear that they are significantly different. The suggestion would be to find a way to evaluate the RMSD metric and determine whether these values, as obtained for structures being compared, are reliable. Some options are provided in earlier studies: PMID: 11514933, PMID: 17218333, PMID: 11420449, PMID: 8289285 (and others). It could also be valuable to focus more on site-specific RMSDs rather than Global RMSDs. The high conservation in the nitrogenases likely ensures that the global RMSDs will remain low across the family. Focusing on specific regions might reveal interesting differences between clades that are more informative regarding the evolution of structure in tandem with environment/time.

      We thank the reviewer for their suggestions. We agree that while global RMSD values below 2Å typically indicate high structural similarity, relying solely on these measures can mask subtle yet potentially functionally meaningful differences. Our aim was not to test for overall structural identity but rather to quantify fine-scale variations between highly conserved nitrogenase structures, including extant and ancestral variants. Nevertheless, in light of the reviewer’s suggestions, we have implemented an additional metric (rmsd_100) for a more nuanced comparison. The results of our additional analyses (Figure S3) align closely with our original results (Figure 2), supporting our decision to retain the un-normalized results in the main text. As an additional measure, we also computed site-specific RMSDs for the active site’s environments (Figure S6) to further delineate subtle structural variations.
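
      For readers unfamiliar with length-normalized RMSD, the following is a minimal sketch of how such a metric can be computed. It assumes the additional metric follows the commonly used RMSD100 normalization, RMSD / (1 + ln(sqrt(N / 100))), where N is the number of aligned residues; the response does not state the exact formula, so this is an assumption. The function names `rmsd` and `rmsd100` are hypothetical, and the coordinates are assumed to be already superposed.

```python
import math


def rmsd(coords_a, coords_b):
    """Plain RMSD between two equally sized, already superposed coordinate sets.

    Each coordinate set is a sequence of (x, y, z) tuples, one per aligned atom.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for xa, xb in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def rmsd100(rmsd_value, n_residues):
    """Length-normalized RMSD (assumed form): RMSD / (1 + ln(sqrt(N / 100))).

    For N = 100 the value equals the plain RMSD. The normalization is only
    meaningful when the denominator is positive (roughly N > 14).
    """
    denominator = 1.0 + math.log(math.sqrt(n_residues / 100.0))
    if denominator <= 0:
        raise ValueError("alignment too short for this normalization")
    return rmsd_value / denominator


# Usage: two toy structures offset by 1 along z, treated as a 400-residue alignment.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
raw = rmsd(a, b)
normalized = rmsd100(raw, 400)
```

      Because the denominator grows with alignment length, the same raw RMSD counts as a smaller deviation for a longer alignment, which is the intent of comparing structures of different sizes on a common scale.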

    1. eLife Assessment

      Examination of (a)periodic brain activity has gained particular interest in the last few years in the neuroscience fields relating to cognition, disorders, and brain states. Using large EEG/MEG datasets from younger and older adults, the current study provides compelling evidence that age-related differences in aperiodic EEG/MEG signals can be driven by cardiac rather than brain activity. Their findings have important implications for all future research that aims to assess aperiodic neural activity, suggesting control for the influence of cardiac signals is essential.

    2. Reviewer #1 (Public review):

      Summary:

      The present study addresses whether physiological signals influence aperiodic brain activity with a focus on age-related changes. The authors report age effects on aperiodic cardiac activity derived from ECG in low and high-frequency ranges in roughly 2300 participants from four different sites. Slopes of the ECGs were associated with common heart variability measures, which, according to the authors, shows that ECG, even at higher frequencies, conveys meaningful information. Using temporal response functions on concurrent ECG and M/EEG time series, the authors demonstrate that cardiac activity is instantaneously reflected in neural recordings, even after applying ICA analysis to remove cardiac activity. This was more strongly the case for EEG than MEG data. Finally, spectral parameterization was done in large-scale resting-state MEG and ECG data in individuals between 18 and 88 years, and age effects were tested. A steepening of spectral slopes with age was observed, particularly for ECG and, to a lesser extent, in cleaned MEG data in most frequency ranges and sensors investigated. The authors conclude that commonly observed age effects on neural aperiodic activity can mainly be explained by cardiac activity.
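
      As an illustration of what a "spectral slope" over a fitting range means, the sketch below estimates an aperiodic exponent by ordinary least squares in log-log space. This is a deliberately simplified stand-in and not the parameterization algorithms used in the study; the function name `fit_aperiodic`, the synthetic spectrum, and the frequency range are all assumptions made for illustration.

```python
import math


def fit_aperiodic(freqs, power, f_min, f_max):
    """Fit log10(power) = offset - exponent * log10(freq) within [f_min, f_max].

    Returns (offset, exponent). A larger exponent means a steeper spectrum.
    """
    points = [
        (math.log10(f), math.log10(p))
        for f, p in zip(freqs, power)
        if f_min <= f <= f_max and f > 0 and p > 0
    ]
    if len(points) < 2:
        raise ValueError("need at least two usable frequency bins in the fitting range")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    offset = mean_y - slope * mean_x
    return offset, -slope


# Usage: a synthetic 1/f^1.5 spectrum sampled at 1..100 Hz.
freqs = [float(f) for f in range(1, 101)]
power = [10.0 / f ** 1.5 for f in freqs]
offset, exponent = fit_aperiodic(freqs, power, 1.0, 100.0)
```

      Changing `f_min` and `f_max` changes which part of the spectrum drives the estimate, which is why the choice of fitting range matters when comparing slopes across recordings.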

      Strengths:

      Compared to previous investigations, the authors demonstrate effects of aging on the spectral slope in the currently largest MEG dataset with equal age distribution available. Their efforts of replicating observed effects in another large MEG dataset and considering potential confounding by ocular activity, head movements, or preprocessing methods are commendable and highly valuable to the community. This study also employs a wide range of fitting ranges and two commonly used algorithms for spectral parameterization of neural and cardiac activity, hence providing a comprehensive overview of the impact of methodological choices. The authors discuss their findings in-depth and give recommendations for the separation of physiological and neural sources of aperiodic activity.

      Weaknesses:

      While the study's aim is well-motivated and analyses rigorously conducted, it remains vague what is reflected in the ECG at higher frequency ranges that contributed to the confounding of the age effects in the neural data. However, the authors address this issue in their discussion.

    3. Reviewer #2 (Public review):

      As is obvious from my previous reviews, I still consider this to be an important paper, and one that is final and publishable in its current state.

      In that previous review, I revealed my identity to help reassure the authors that I was doing my best to remain unbiased because I work in this area and some of the authors' results directly impact my prior research. I was genuinely excited to see the earlier preprint version of this paper when it first appeared. I get a lot of joy out of trying to - collectively, as a field - really understand the nature of our data, and I continue to commend the authors here for pushing at the sources of aperiodic activity!

      In their manuscript, Schmidt and colleagues provide a very compelling, convincing, thorough, and measured set of analyses. Previously I recommended that they push even further, and they added the current Figure 5 analysis of event-related changes in the ECG during working memory. In my opinion this result practically warrants a separate paper of its own!

      The literature analysis is very clever, and expanded upon from any other prior version I've seen.

      In my previous review, the broadest, most high-level comment I wanted to make was that the authors are correct. We (in my lab) have tried to be measured in our approach to talking about aperiodic analyses - including measuring ECG when possible now - because there are so many sources of aperiodic activity: neural, ECG, respiration, skin conductance, muscle activity, electrode impedances, room noise, electronics noise, etc. The authors discuss this all very clearly, and I commend them on that. We, as a field, should move more toward a model where we can account for all of those sources of noise together. (This was less of an action item, and more of an inclusion of a comment for the record.)

      I also very much appreciate the authors' excellent commentary regarding the physiological effects that pharmacological challenges such as propofol and ketamine also have on non-neural (autonomic) functions such as ECG. Previously I also asked them to discuss the possibility that, while their manuscript focuses on aperiodic activity, it is possible that the wealth of literature regarding age-related changes in "oscillatory" activity might be driven partly by age-related changes in neural (or non-neural, ECG-related) changes in aperiodic activity. They have included a nice discussion on this, and I'm excited about the possibilities for cognitive neuroscience as we move more in this direction.

      Finally, I previously asked for recommendations on how to proceed. The authors convinced me that we should care about how the ECG might impact our field potential measures, but how do I, as a relative novice, proceed? They now include three strong recommendations at the end of their manuscript that I find to be very helpful.

      As was obvious from my previous review, I consider this to be an important and impactful cautionary report that is incredibly well supported by multiple thorough analyses. The authors have done an excellent job responding to all my previous comments and concerns and, in my estimation, those of the previous reviewers as well.

    4. Reviewer #3 (Public review):

      Summary:

      Schmidt et al. aimed to provide an extremely comprehensive demonstration of the influence cardiac electromagnetic fields have on the relationship between age and the aperiodic slope measured from electroencephalographic (EEG) and magnetoencephalographic (MEG) data.

      Strengths:

      Schmidt et al. used a multiverse approach to show that the cardiac influence on this relationship is considerable, by testing a wide range of different analysis parameters (including extensive testing of different frequency ranges assessed to determine the aperiodic fit), algorithms (including different artifact reduction approaches and different aperiodic fitting algorithms), and multiple large datasets to provide conclusions that are robust to the vast majority of potential experimental variations.

      The study showed that across these different analytical variations, the cardiac contribution to aperiodic activity measured using EEG and MEG is considerable, and likely influences the relationship between aperiodic activity and age to a greater extent than the influence of neural activity.

      Their findings have significant implications for all future research that aims to assess aperiodic neural activity, suggesting control for the influence of cardiac fields is essential.

      Weaknesses:

      The authors have addressed the weaknesses of their study in their manuscript. Most alternative explanations for their results have been explored to ensure their conclusions are robust and are not explained by unexplored confounds. Minor potential weaknesses are:

      (1) The number of electrodes used in the EEG analyses was on the lower side, and as such, the results do not confirm that the influence of ECG on the 1/f activity in the EEG is high even for higher density EEG montages where ICA may provide better performance at removing cardiac components (as noted by the authors). Having noted this potential weakness, I doubt the effects of cardiac activity can be completely mitigated with current methods, even in higher-density EEG montages.

      (2) Head movements were used as a proxy for muscle activity. However, this may imperfectly address the potential influence of muscle activity on the slope in the EEG activity. As such, remaining muscle artifacts may have affected some of the results, particularly those that included high frequency ranges in the aperiodic estimate. Perhaps if muscle activity were left in the EEG data, it could have disrupted the ability to detect a relationship between age and 1/f slope in a way that didn't disrupt the same relationship in the cardiac data. However, I doubt this would reverse the overall conclusions given the number of converging results, including in lower frequency bands. The authors also note this potential weakness and suggest how future research might address it.

    5. Author response:

      The following is the authors’ response to the original reviews

      eLife Assessment

      Examination of (a)periodic brain activity has gained particular interest in the last few years in the neuroscience fields relating to cognition, disorders, and brain states. Using large EEG/MEG datasets from younger and older adults, the current study provides compelling evidence that age-related differences in aperiodic EEG/MEG signals can be driven by cardiac rather than brain activity. Their findings have important implications for all future research that aims to assess aperiodic neural activity, suggesting control for the influence of cardiac signals is essential.

      We want to thank the editors for their assessment of our work and highlighting its importance for the understanding of aperiodic neural activity. Additionally, we want to thank the three present and four former reviewers (at a different journal) whose comments and ideas were critical in shaping this manuscript to its current form. We hope that this paper opens up many more questions that will guide us - as a field - to an improved understanding of how “cortical” and “cardiac” changes in aperiodic activity are linked and want to invite readers to engage with our work through eLife’s comment function.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The present study addresses whether physiological signals influence aperiodic brain activity with a focus on age-related changes. The authors report age effects on aperiodic cardiac activity derived from ECG in low and high-frequency ranges in roughly 2300 participants from four different sites. Slopes of the ECGs were associated with common heart variability measures, which, according to the authors, shows that ECG, even at higher frequencies, conveys meaningful information. Using temporal response functions on concurrent ECG and M/EEG time series, the authors demonstrate that cardiac activity is instantaneously reflected in neural recordings, even after applying ICA analysis to remove cardiac activity. This was more strongly the case for EEG than MEG data. Finally, spectral parameterization was done in large-scale resting-state MEG and ECG data in individuals between 18 and 88 years, and age effects were tested. A steepening of spectral slopes with age was observed particularly for ECG and, to a lesser extent, in cleaned MEG data in most frequency ranges and sensors investigated. The authors conclude that commonly observed age effects on neural aperiodic activity can mainly be explained by cardiac activity.

      Strengths:

      Compared to previous investigations, the authors demonstrate the effects of aging on the spectral slope in the currently largest MEG dataset with equal age distribution available. Their efforts of replicating observed effects in another large MEG dataset and considering potential confounding by ocular activity, head movements, or preprocessing methods are commendable and valuable to the community. This study also employs a wide range of fitting ranges and two commonly used algorithms for spectral parameterization of neural and cardiac activity, hence providing a comprehensive overview of the impact of methodological choices. Based on their findings, the authors give recommendations for the separation of physiological and neural sources of aperiodic activity.

      Weaknesses:

      While the aim of the study is well-motivated and analyses rigorously conducted, the overall structure of the manuscript, as it stands now, is partially misleading. Some of the described results are not well-embedded and lack discussion.

      We want to thank the reviewer for their comments focussed on improving the overall structure of the manuscript. We agree with their suggestions that some results could be more clearly contextualized and restructured the manuscript accordingly.

      Reviewer #2 (Public review):

      I previously reviewed this important and timely manuscript at a previous journal where, after two rounds of review, I recommended publication. Because eLife practices an open reviewing format, I will recapitulate some of my previous comments here, for the scientific record.

      In that previous review, I revealed my identity to help reassure the authors that I was doing my best to remain unbiased because I work in this area and some of the authors' results directly impact my prior research. I was genuinely excited to see the earlier preprint version of this paper when it first appeared. I get a lot of joy out of trying to - collectively, as a field - really understand the nature of our data, and I continue to commend the authors here for pushing at the sources of aperiodic activity!

      In their manuscript, Schmidt and colleagues provide a very compelling, convincing, thorough, and measured set of analyses. Previously I recommended that they push even further, and they added the current Figure 5 analysis of event-related changes in the ECG during working memory. In my opinion this result practically warrants a separate paper of its own!

      The literature analysis is very clever, and expanded upon from any other prior version I've seen.

      In my previous review, the broadest, most high-level comment I wanted to make was that the authors are correct. We (in my lab) have tried to be measured in our approach to talking about aperiodic analyses - including now measuring ECG whenever possible - because there are so many sources of aperiodic activity: neural, ECG, respiration, skin conductance, muscle activity, electrode impedances, room noise, electronics noise, etc. The authors discuss this all very clearly, and I commend them on that. We, as a field, should move more toward a model where we can account for all of those sources of noise together. (This was less of an action item, and more of an inclusion of a comment for the record.)

      I also very much appreciate the authors' excellent commentary regarding the physiological effects that pharmacological challenges such as propofol and ketamine also have on non-neural (autonomic) functions such as ECG. Previously I also asked them to discuss the possibility that, while their manuscript focuses on aperiodic activity, it is possible that the wealth of literature regarding age-related changes in "oscillatory" activity might be driven partly by age-related changes in neural (or non-neural, ECG-related) changes in aperiodic activity. They have included a nice discussion on this, and I'm excited about the possibilities for cognitive neuroscience as we move more in this direction.

      Finally, I previously asked for recommendations on how to proceed. The authors convinced me that we should care about how the ECG might impact our field potential measures, but how do I, as a relative novice, proceed? They now include three strong recommendations at the end of their manuscript that I find to be very helpful.

      As was obvious from my previous review, I consider this to be an important and impactful cautionary report that is incredibly well supported by multiple thorough analyses. The authors have done an excellent job responding to all my previous comments and concerns and, in my estimation, those of the previous reviewers as well.

      We want to thank the reviewer for agreeing to review our manuscript again and for recapitulating their previous comments and the progress the manuscript has made over the course of the last ~2 years. The reviewer's comments have been essential in shaping the manuscript into its current form. Their feedback has made the review process truly feel like a collaborative effort, focused on strengthening the manuscript and refining its conclusions and resulting recommendations.

      Reviewer #3 (Public review):

      Summary:

      Schmidt et al. aimed to provide an extremely comprehensive demonstration of the influence cardiac electromagnetic fields have on the relationship between age and the aperiodic slope measured from electroencephalographic (EEG) and magnetoencephalographic (MEG) data.

      Strengths:

      Schmidt et al. used a multiverse approach to show that the cardiac influence on this relationship is considerable, by testing a wide range of different analysis parameters (including extensive testing of different frequency ranges assessed to determine the aperiodic fit), algorithms (including different artifact reduction approaches and different aperiodic fitting algorithms), and multiple large datasets to provide conclusions that are robust to the vast majority of potential experimental variations.

      The study showed that across these different analytical variations, the cardiac contribution to aperiodic activity measured using EEG and MEG is considerable, and likely influences the relationship between aperiodic activity and age to a greater extent than the influence of neural activity.

      Their findings have significant implications for all future research that aims to assess aperiodic neural activity, suggesting control for the influence of cardiac fields is essential.

      We want to thank the reviewer for their thorough engagement with our work and for the substantial number of great ideas raised in both the Weaknesses and the Recommendations for the authors sections below. Their suggestions have sparked many ideas on how to better separate peripheral from neurophysiological signals, and these are likely to greatly influence our future attempts to better extract both cardiac and muscle activity from M/EEG recordings. So we want to thank them for their input, time and effort!

      Weaknesses:

      Figure 4I: The regressions explained here seem to contain a very large number of potential predictors. Based on the way it is currently written, I'm assuming it includes all sensors for both the ECG component and ECG rejected conditions?

      I'm not sure about the logic of taking a complete signal, decomposing it with ICA to separate out the ECG and non-ECG signals, then including these latent contributions to the full signal back into the same regression model. It seems that there could be some circularity or redundancy in doing so. Can the authors provide a justification for why this is a valid approach?

      After observing significant effects in both the MEG<sub>ECG component</sub> and MEG<sub>ECG rejected</sub> conditions in similar frequency bands, we wanted to understand whether or not these age-related changes are statistically independent. To test this, we added both variables as predictors in a regression model (thereby accounting for the influence of the other in relation to age). The regression models we performed were therefore actually not very complex. They were built using only two predictors, namely the data (in a specific frequency range) averaged over the channels on which we noticed significant effects in the ECG rejected and ECG component data, respectively (Wilkinson notation: age ~ 1 + ECG rejected + ECG components). This was also described in the results section, stating that: “To see if MEG<sub>ECG rejected</sub> and MEG<sub>ECG component</sub> explain unique variance in aging at frequency ranges where we noticed shared effects, we averaged the spectral slope across significant channels and calculated a multiple regression model with MEG<sub>ECG component</sub> and MEG<sub>ECG rejected</sub> as predictors for age (to statistically control for the effect of MEG<sub>ECG component</sub> and MEG<sub>ECG rejected</sub> on age). This analysis was performed to understand whether the observed shared age-related effects (MEG<sub>ECG rejected</sub> and MEG<sub>ECG component</sub>) are (in)dependent.”
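
      As a rough illustration of the two-predictor model described above (age ~ 1 + ECG rejected + ECG components), the following sketch fits an ordinary least-squares regression on simulated data. Everything here is an assumption made for illustration: the variable names, the simulated slopes and the effect sizes are hypothetical, and plain least squares stands in for the Bayesian robust regression used in the study.

```python
import random

def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    n, p = len(X), len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[r][p] / A[r][r] for r in range(p)]

# Simulated, channel-averaged spectral slopes (hypothetical values).
random.seed(0)
n = 500
slope_rejected = [random.gauss(0, 1) for _ in range(n)]
# The two predictors are deliberately correlated, as shared effects would be.
slope_component = [0.5 * s + random.gauss(0, 1) for s in slope_rejected]
age = [50 + 3 * r + 5 * c + random.gauss(0, 1)
       for r, c in zip(slope_rejected, slope_component)]

# Design matrix: intercept column plus the two predictors.
X = [[1.0, r, c] for r, c in zip(slope_rejected, slope_component)]
intercept, b_rejected, b_component = ols(X, age)
print(intercept, b_rejected, b_component)
```

      Because both predictors enter the same model, each coefficient reflects the association with age that remains after accounting for the other predictor, which is the sense in which the two effects can be called (in)dependent.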

      We hope this explanation solves the previous misunderstanding.

      I'm not sure whether there is good evidence or rationale to support the statement in the discussion that the presence of the ECG signal in reference electrodes makes it more difficult to isolate independent ECG components. The ICA algorithm will still function to detect common voltage shifts from the ECG as statistically independent from other voltage shifts, even if they're spread across all electrodes due to the referencing montage. I would suggest there are other reasons why the ICA might lead to imperfect separation of the ECG component (assumption of the same number of source components as sensors, non-Gaussian assumption, assumption of independence of source activities).

      The inclusion of only 32 channels in the EEG data might also have reduced the performance of ICA, increasing the chances of imperfect component separation and the mixing of cardiac artifacts into the neural components, whereas the higher number of sensors in the MEG data would enable better component separation. This could explain the difference between EEG and MEG in the ability to clean the ECG artifact (and perhaps higher-density EEG recordings would not show the same issue).

      The reviewer makes a good argument that our initial assumption, namely that the presence of cardiac activity on the reference electrode influences the performance of the ICA, may be wrong. After rereading and reconsidering the matter, we think that the reviewer is correct and that their explanations for why the ECG signal was not so easily separable from our EEG recordings are more plausible and better grounded in the literature than our initial suggestion. We therefore now highlight their view as a main reason why ECG rejection was more challenging in EEG data. However, we also note that understanding the exact reason is probably an empirical question that demands further research, and now state that:

      “Difficulties in removing ECG related components from EEG signals via ICA might be attributable to various reasons such as the number of available sensors or assumptions related to the non-gaussianity of the underlying sources. Further understanding of this matter is highly important given that ICA is the most widely used procedure to separate neural from peripheral physiological sources.”

      In addition to the inability to effectively clean the ECG artifact from EEG data, ICA and other component subtraction methods have also all been shown to distort neural activity in periods that aren't affected by the artifact due to the ubiquitous issue of imperfect component separation (https://doi.org/10.1101/2024.06.06.597688). As such, component subtraction-based (as well as regression-based) removal of the cardiac artifact might also distort the neural contributions to the aperiodic signal, so even methods to adequately address the cardiac artifact might not solve the problem explained in the study. This poses an additional potential confound to the "M/EEG without ECG" conditions.

      The reviewer is correct in stating that, if an “artifactual” signal is not always present but appears and disappears (e.g. eye-blinks), neural activity may be distorted in periods where the “artifactual” signal is absent. However, while this plausibly presents a problem for ocular activity, there is no obvious reason to believe that this applies to cardiac activity. While the ECG signal is non-stationary in nature, it is remarkably more stable than eye movements in the healthy populations we analyzed (especially at rest). Accordingly, the cardiac “artifact” was consistently present throughout the MEG recordings we visually inspected.

      Literature Analysis, Page 23: was there a method applied to address studies that report reducing artifacts in general, but are not specific to a single type of artifact? For example, there are automated methods for cleaning EEG data that use ICLabel (a machine learning algorithm) to delete "artifact" components. Within these studies, the cardiac artifact will not be mentioned specifically, but is included under "artifacts".

      The literature analysis was largely performed automatically and focussed solely on ECG-related activity, as described in the methods section under Literature Analysis: if no ECG-related terms were used in the context of artifact rejection, a study was flagged as not having removed cardiac activity. We could indeed have highlighted this better and apologize for the oversight. We now additionally link to these details, stating that:

      “However, an analysis of openly accessible M/EEG articles (N<sub>Articles</sub>=279; see Methods - Literature Analysis for further details) that investigate aperiodic activity revealed that only 17.1% of EEG studies explicitly mention that cardiac activity was removed and only 16.5% measure ECG (45.9% of MEG studies removed cardiac activity and 31.1% of MEG studies mention that ECG was measured; see Figure 1EF).”

      The reviewer makes a fair point that there is some uncertainty here, and our results probably present a lower bound of ECG handling in M/EEG research because, when I manually rechecked the studies that were not initially flagged, it was often mentioned only that “artifacts” were rejected. This information seemed too ambiguous to assume that cardiac activity was in fact accounted for. Again, this could have been stated more clearly and we apologize for this oversight. It is now included in the methods section Literature Analysis, stating that:

      “All valid word contexts were then manually inspected by scanning the respective word context to ensure that the removal of “artifacts” was related specifically to cardiac and not e.g. ocular activity or the rejection of artifacts in general (without specifying which “artifactual” source was rejected in which case the manuscript was marked as invalid). This means that the results of our literature analysis likely present a lower bound for the rejection of cardiac activity in the M/EEG literature investigating aperiodic activity.”
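
      To make the flagging logic concrete, here is a minimal sketch of a keyword-in-context search of the kind described above. The term list, the window size and the function name are hypothetical and chosen only for illustration; they are not the search terms or code actually used in the literature analysis, and the manual inspection step is not reproduced.

```python
import re

# Hypothetical list of cardiac-related terms; the study's actual terms may differ.
CARDIAC_TERMS = re.compile(
    r"\b(?:ecg|ekg|cardiac|heartbeat|electrocardiogra\w*)\b", re.IGNORECASE
)

def cardiac_contexts(text, window=6):
    """Return the words surrounding each cardiac-related term in `text`."""
    words = text.split()
    hits = []
    for i, word in enumerate(words):
        if CARDIAC_TERMS.search(word):
            # Keep `window` words on either side for later manual inspection.
            hits.append(" ".join(words[max(0, i - window): i + window + 1]))
    return hits

generic = "Artifacts were removed using ICA."
specific = "Components reflecting cardiac activity were removed using ICA."

generic_hits = cardiac_contexts(generic)    # no cardiac term, so not flagged
specific_hits = cardiac_contexts(specific)  # one context to inspect manually
print(generic_hits, specific_hits)
```

      A study that only reports rejecting “artifacts” in general yields no hits and would be counted as not having removed cardiac activity, which is why the reported percentages are best read as a lower bound.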

      Statistical inferences, page 23: as far as I can tell, no methods to control for multiple comparisons were implemented. Many of the statistical comparisons were not independent (or even overlapped with similar analyses in the full analysis space to a large extent), so I wouldn't expect strong multiple comparison controls. But addressing this point to some extent would be useful (or clarifying how it has already been addressed if I've missed something).

      In the present study we tried to minimize the risk of type 1 errors by several means, such as A) weakly informative priors, B) robust regression models and C) specifying a region of practical equivalence (ROPE, see Methods - Statistical Inference for further information) to define meaningful effects.

      Weakly informative priors can lower the risk of type 1 errors arising from multiple testing by shrinking parameter estimates towards zero (see e.g. Lemoine, 2019). Robust regression models use a Student's t-distribution to describe the distribution of the data. This distribution features heavier tails, meaning it allocates more probability to extreme values, which in turn minimizes the influence of outliers. The ROPE criterion ensures that only effects exceeding a negligible size are considered meaningful, representing a strict and conservative approach to interpreting our findings (see Kruschke, 2018; Cohen, 1988).
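
      The ROPE criterion can be sketched as a simple decision rule on posterior samples. The sketch below assumes a standardized effect, a ROPE of ±0.1 and a 95% highest density interval (HDI); these limits, the function names and the simulated posteriors are illustrative assumptions, not the settings reported in the manuscript.

```python
import random

def hdi(samples, mass=0.95):
    """Narrowest interval spanning roughly `mass` of the samples."""
    s = sorted(samples)
    k = int(mass * len(s))
    # Slide a window over the sorted samples and keep the narrowest one.
    start = min(range(len(s) - k), key=lambda i: s[i + k] - s[i])
    return s[start], s[start + k]

def rope_decision(samples, rope=(-0.1, 0.1), mass=0.95):
    """Classify an effect by comparing its HDI with the ROPE."""
    lo, hi = hdi(samples, mass)
    if lo > rope[1] or hi < rope[0]:
        return "meaningful"   # HDI lies entirely outside the ROPE
    if lo >= rope[0] and hi <= rope[1]:
        return "negligible"   # HDI lies entirely inside the ROPE
    return "undecided"        # HDI overlaps a ROPE boundary

# Simulated posterior samples (hypothetical).
random.seed(1)
clear_effect = [random.gauss(0.5, 0.05) for _ in range(4000)]
tiny_effect = [random.gauss(0.0, 0.02) for _ in range(4000)]
borderline = [random.gauss(0.1, 0.05) for _ in range(4000)]

print(rope_decision(clear_effect), rope_decision(tiny_effect),
      rope_decision(borderline))
```

      Under this rule an effect is only called meaningful when the whole interval of credible values clears the negligible zone, which is what makes the criterion conservative.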

      Furthermore, and more generally, we do not selectively report “significant” effects in situations in which multiple analyses were conducted on the same family of data (e.g. Figures 2 & 4). Instead, we provide joint inference across several plausible analysis options (akin to a specification curve analysis; Simonsohn, Simmons & Nelson, 2020) to give other researchers an overview of how different analysis choices impact the association between cardiac and neural aperiodic activity.

      Lemoine, N. P. (2019). Moving beyond noninformative priors: why and how to choose weakly informative priors in Bayesian analyses. Oikos, 128(7), 912-928.

      Simonsohn, U., Simmons, J. P., & Nelson, L. D. (2020). Specification curve analysis. Nature Human Behaviour, 4(11), 1208-1214.

      Methods:

      Applying ICA components from 1Hz high pass filtered data back to the 0.1Hz filtered data leads to worse artifact cleaning performance, as the contribution of the artifact in the 0.1Hz to 1Hz frequency band is not addressed (see Bailey, N. W., Hill, A. T., Biabani, M., Murphy, O. W., Rogasch, N. C., McQueen, B., ... & Fitzgerald, P. B. (2023). RELAX part 2: A fully automated EEG data cleaning algorithm that is applicable to Event-Related-Potentials. Clinical Neurophysiology, result reported in the supplementary materials). This might explain some of the lower frequency slope results (which include a lower frequency limit <1Hz) in the EEG data - the EEG cleaning method is just not addressing the cardiac artifact in that frequency range (although it certainly wouldn't explain all of the results).

      We want to thank the reviewer for suggesting this interesting paper, showing that lower high-pass filters may be preferable to the more commonly used >1Hz high-pass filters for detection of ICA components that largely contain peripheral physiological activity. However, the results presented by Bailey et al. contradict the more commonly reported findings by other researchers that a >1Hz high-pass filter is actually preferable (e.g. Winkler et al., 2015; Dimigen, 2020 or Klug & Gramann, 2021) and recommendations in widely used packages for M/EEG analysis (e.g. https://mne.tools/1.8/generated/mne.preprocessing.ICA.html). The fact that there seems to be a discrepancy suggests that further research is needed to better understand which type of high-pass filtering is preferable in which situation. Furthermore, it is notable that all the findings for high-pass filtering in ICA component detection and removal that we are aware of relate to ocular activity. Given that ocular and cardiac activity have very different temporal and spectral patterns, it is probably worth further investigating whether the classic 1Hz high-pass filter is really also the best option for the detection and removal of cardiac activity. However, in our opinion this requires a dedicated investigation of its own.

      We therefore highlight this now in our manuscript stating that:

      “Additionally, it is worth noting that the effectiveness of an ICA crucially depends on the quality of the extracted components(63,64) and even widely suggested settings e.g. high-pass filtering at 1Hz before fitting an ICA may not be universally applicable (see supplementary material of (64)).”

      Winkler, I., Debener, S., Müller, K.-R., & Tangermann, M. (2015). On the influence of high-pass filtering on ICA-based artifact reduction in EEG-ERP. 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Milan, Italy, pp. 4101-4105. doi: 10.1109/EMBC.2015.7319296.

      Dimigen, O. (2020). Optimizing the ICA-based removal of ocular EEG artifacts from free viewing experiments. NeuroImage, 207, 116117.

      Klug, M., & Gramann, K. (2021). Identifying key factors for improving ICA‐based decomposition of EEG data in mobile and stationary experiments. European Journal of Neuroscience, 54(12), 8406-8420.

      It looks like no methods were implemented to address muscle artifacts. These can affect the slope of EEG activity at higher frequencies. Perhaps the Riemannian Potato addressed these artifacts, but I suspect it wouldn't eliminate all muscle activity. As such, I would be concerned that remaining muscle artifacts affected some of the results, particularly those that included high frequency ranges in the aperiodic estimate. Perhaps if muscle activity were left in the EEG data, it could have disrupted the ability to detect a relationship between age and 1/f slope in a way that didn't disrupt the same relationship in the cardiac data (although I suspect it wouldn't reverse the overall conclusions given the number of converging results including in lower frequency bands). Is there a quick validity analysis the authors can implement to confirm muscle artifacts haven't negatively affected their results?

      I note that an analysis of head movement in the MEG is provided on page 32, but it would be more robust to show that removing ICA components reflecting muscle doesn't change the results. The results/conclusions of the following study might be useful for objectively detecting probable muscle artifact components: Fitzgibbon, S. P., DeLosAngeles, D., Lewis, T. W., Powers, D. M. W., Grummett, T. S., Whitham, E. M., ... & Pope, K. J. (2016). Automatic determination of EMG-contaminated components and validation of independent component analysis using EEG during pharmacologic paralysis. Clinical neurophysiology, 127(3), 1781-1793.

      We thank the reviewer for their suggestion. Muscle activity can indeed be a potential concern for the estimation of the spectral slope. This is precisely why we used head movements (as also noted by the reviewer) as a proxy for muscle activity. We also agree with the reviewer that this is not a perfect estimate. Additionally, the Riemannian Potato would probably only capture epochs that contain transient, but not persistent, patterns of muscle activity.

      The paper recommended by the reviewer contains a clever approach of using the steepness of the spectral slope (or lack thereof) as an indicator of whether or not an independent component (IC) is driven by muscle activity. To determine an optimal threshold, Fitzgibbon et al. compared paralyzed to temporarily non-paralyzed subjects. They determined an expected “EMG-free” threshold for the spectral slope in paralyzed subjects and used this as a benchmark to detect ICs that were contaminated by muscle activity in non-paralyzed subjects.
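
      The idea of using slope steepness as a muscle indicator can be sketched as follows. This is only an illustration under stated assumptions: the threshold of -0.5 is a hypothetical placeholder rather than the value validated by Fitzgibbon et al., the function names are invented here, and a plain log-log line fit stands in for the spectral parameterization algorithms used in the study.

```python
import math

def loglog_slope(freqs, power):
    """Least-squares slope of log10(power) against log10(frequency)."""
    x = [math.log10(f) for f in freqs]
    y = [math.log10(p) for p in power]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def looks_like_muscle(freqs, power, threshold=-0.5):
    """Flag a component whose spectrum is flatter than `threshold`.

    The threshold is a hypothetical placeholder; a usable value would
    have to be validated for the specific recording device.
    """
    return loglog_slope(freqs, power) > threshold

# Synthetic component spectra (hypothetical): steep versus nearly flat.
freqs = list(range(10, 100))
steep_power = [f ** -2.0 for f in freqs]
flat_power = [f ** -0.1 for f in freqs]

print(looks_like_muscle(freqs, steep_power), looks_like_muscle(freqs, flat_power))
```

      The sketch also makes the transfer problem visible: the decision depends entirely on the threshold, so a value calibrated on one recording modality or device cannot simply be reused on another.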

      This is a great idea, but it unfortunately goes well beyond what we can sensibly estimate with our data, for the following reasons. The authors estimated their optimal threshold on paralyzed subjects for EEG data and show that this is a feasible threshold to be applied across different recordings. So for EEG data it might be feasible, at least as a first attempt, to use their threshold on our data. However, we are measuring MEG and, as alluded to in our discussion section under “Differences in aperiodic activity between magnetic and electric field recordings”, the spectral slope differs greatly between MEG and EEG recordings for non-trivial reasons. Furthermore, the spectral slope even seems to differ across different MEG devices. We noticed this when we initially tried to pool the data recorded in Salzburg with the Cambridge dataset. This means we would need to do a complete validation of this procedure for the MEG data recorded in Cambridge and in Salzburg, which is not feasible considering that we A) do not have direct access to one of the recording sites and B) would, even if we had access, face substantial hurdles in obtaining ethical approval for the experiment performed by Fitzgibbon et al.

      However, we think the approach brought forward by Fitzgibbon and colleagues is a clever way to remove muscle activity from EEG recordings, whenever EMG was not directly recorded. We therefore suggested in the Discussion section that ideally also EMG should be recorded stating that:

      “It is worth noting that, apart from cardiac activity, muscle activity can also be captured in (non-)invasive recordings and may drastically influence measures of the spectral slope(72). To ensure that persistent muscle activity does not bias our results we used changes in head movement velocity as a control analysis (see Supplementary Figure S9). However, it should be noted that this is only a proxy for the presence of persistent muscle activity. Ideally, studies investigating aperiodic activity should also be complemented by measurements of EMG. Whenever such measurements are not available creative approaches that use the steepness of the spectral slope (or the lack thereof) as an indicator to detect whether or not e.g. an independent component is driven by muscle activity are promising(72,73). However, these approaches may require further validation to determine how well myographic aperiodic thresholds are transferable across the wide variety of different M/EEG devices.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) As outlined above, I recommend rephrasing the last section of the introduction to briefly summarize/introduce all main analysis steps undertaken in the study and why these were done (for example, it is only mentioned that the Cam-CAN dataset was used to study the impact of cardiac on MEG activity although the author used a variety of different datasets). Similarly, I am missing an overview of all main findings in the context of the study goals in the discussion. I believe clarifying the structure of the paper would not only provide a red thread to the reader but also highlight the efforts/strength of the study as described above.

      This is a good call! As suggested by the reviewer, we now try to give a clearer overview of what was investigated and why. We do that both at the end of the introduction, stating that: “Using the publicly available Cam-CAN dataset(28,29), we find that the aperiodic signal measured using M/EEG originates from multiple physiological sources. In particular, significant portions of age-related changes in aperiodic activity –normally attributed to neural processes– can be better explained by cardiac activity. This observation holds across a wide range of processing options and control analyses (see Supplementary S1), and was replicable on a separate MEG dataset. However, the extent to which cardiac activity accounts for age-related changes in aperiodic activity varies with the investigated frequency range and recording site. Importantly, in some frequency ranges and sensor locations, age-related changes in neural aperiodic activity still prevail. But does the influence of cardiac activity on the aperiodic spectrum extend beyond age? In a preliminary analysis, we demonstrate that working memory load modulates the aperiodic spectrum of “pure” ECG recordings. The direction of this working memory effect mirrors previous findings on EEG data(5) suggesting that the impact of cardiac activity goes well beyond aging. In sum, our results highlight the complexity of aperiodic activity while cautioning against interpreting it as solely “neural” without considering physiological influences.”

      and at the beginning of the discussion section:

      “Difficulties in removing ECG related components from EEG signals via ICA might be attributable to various reasons such as the number of available sensors or assumptions related to the non-gaussianity of the underlying sources. Further understanding of this matter is highly important given that ICA is the most widely used procedure to separate neural from peripheral physiological sources (see Figure 1EF). Additionally, it is worth noting that the effectiveness of an ICA crucially depends on the quality of the extracted components(63,64) and even widely suggested settings e.g. high-pass filtering at 1Hz before fitting an ICA may not be universally applicable (see supplementary material of (64)).”

      (2) I found it interesting that the spectral slopes of ECG activity at higher frequency ranges (> 10 Hz) seem mostly related to HRV measures such as fractal and time domain indices and less so with frequency-domain indices. Do the authors have an explanation for why this is the case? Also, the analysis of the HRV measures and their association with aperiodic ECG activity is not explained in any of the method sections.

      We apologize for the oversight in not mentioning the HRV analysis in more detail in our methods section. We added a subsection to the Methods section entitled ECG Processing - Heart rate variability analysis to further describe the HRV analyses.

      “ECG Processing - Heart rate variability analysis

      Heart rate variability (HRV) was computed using the NeuroKit2 toolbox, a high-level tool for the analysis of physiological signals. First, the raw electrocardiogram (ECG) data were preprocessed by high-pass filtering the signal at 0.5Hz using an infinite impulse response (IIR) Butterworth filter (order=5) and by smoothing the signal with a moving average kernel with the width of one period of 50Hz to remove the powerline noise (default settings of neurokit.ecg.ecg_clean). Afterwards, QRS complexes were detected based on the steepness of the absolute gradient of the ECG signal. Subsequently, R-peaks were detected as local maxima in the QRS complexes (default settings of neurokit.ecg.ecg_peaks; see (98) for a validation of the algorithm). From the cleaned R-R intervals, 90 HRV indices were derived, encompassing time-domain, frequency-domain, and non-linear measures. Time-domain indices included standard metrics such as the mean and standard deviation of the normalized R-R intervals, the root mean square of successive differences, and other statistical descriptors of interbeat interval variability. Frequency-domain analyses were performed using power spectral density estimation, yielding for instance low frequency (0.04-0.15Hz) and high frequency (0.15-0.4Hz) power components. Additionally, non-linear dynamics were characterized through measures such as sample entropy, detrended fluctuation analysis and various Poincaré plot descriptors. All these measures were then related to the slopes of the low frequency (0.25 – 20 Hz) and high frequency (10 – 145 Hz) aperiodic spectrum of the raw ECG.”
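
      For readers unfamiliar with the time-domain indices mentioned above, the following sketch computes three of them by hand from a list of R-peak times. It is a hand-rolled illustration using only the Python standard library, not the NeuroKit2 pipeline used in the study; the function name, the dictionary keys and the example R-peak times are hypothetical.

```python
import math
import statistics

def time_domain_hrv(r_peak_times_s):
    """Compute basic time-domain HRV indices from R-peak times in seconds."""
    # R-R intervals in milliseconds between consecutive R-peaks.
    rr_ms = [1000 * (b - a) for a, b in zip(r_peak_times_s, r_peak_times_s[1:])]
    # Successive differences between neighbouring R-R intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return {
        "mean_rr_ms": statistics.fmean(rr_ms),
        # Standard deviation of the R-R intervals (often called SDNN).
        "sdnn_ms": statistics.stdev(rr_ms),
        # Root mean square of successive differences (RMSSD).
        "rmssd_ms": math.sqrt(statistics.fmean([d * d for d in diffs])),
    }

# Hypothetical R-peak times alternating between 800 ms and 900 ms intervals.
r_peaks = [0.0, 0.8, 1.7, 2.5, 3.4]
indices = time_domain_hrv(r_peaks)
print(indices)
```

      Indices such as these are computed directly on the beat-to-beat interval series, whereas the frequency-domain indices summarize power below 0.4Hz.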

      With regard to the association between the ECG’s spectral slopes at high frequencies and frequency-domain indices of heart rate variability: common frequency-domain indices of heart rate variability fall in the range of 0.01-0.4Hz, which probably explains why we didn’t notice any association at higher frequency ranges (>10Hz).

      This is also stated in the related part of the results section:

      “In the higher frequency ranges (10 - 145 Hz) spectral slopes were most consistently related to fractal and time domain indices of heart rate variability, but not so much to frequency-domain indices assessing spectral power in frequency ranges < 0.4 Hz.”

      (3) Related to the previous point - what is being reflected in the ECG at higher frequency ranges, with regard to biological mechanisms? Results are being mentioned, but not further discussed. However, this point seems crucial because the age effects across the four datasets differ between low and high-frequency slope limits (Figure 2C).

      This is a great question that definitely also requires further attention and investigation in general (see also Tereshchenko & Josephson, 2015). We investigated the change of the slope across frequency ranges that are typically captured in common ECG setups for adults (0.05 - 150Hz, Tereshchenko & Josephson, 2015; Kusayama, Wong, Liu et al. 2020). While most of the physiologically significant spectral information of an ECG recording rests between 1-50Hz (Clifford & Azuaje, 2006), meaningful information can be extracted at much higher frequencies. For instance, ventricular late potentials have a broader frequency band (~40-250Hz) that falls squarely within our spectral analysis window. Further meaningful information can be extracted at even higher frequencies (>100Hz), although the exact physiological mechanisms underlying the so-called high-frequency QRS remain unclear (HF-QRS; see Tereshchenko & Josephson, 2015; Qiu et al. 2024 for a review discussing possible mechanisms). At the same time, the HF-QRS seems to be highly informative for the early detection of myocardial ischemia and other cardiac abnormalities that may not yet be evident in the standard frequency range (Schlegel et al. 2004; Qiu et al. 2024). All optimism aside, it is also worth noting that ECG recordings at higher frequencies can capture skeletal muscle activity with an overlapping frequency range up to 400Hz (Kusayama, Wong, Liu et al. 2020). We now highlight all of this as an outstanding research question when introducing this analysis in the results section, stating that:

      “However, substantially less is known about aperiodic activity above 0.4Hz in the ECG. Yet, common ECG setups for adults capture activity at a broad bandwidth of 0.05 - 150Hz(33,34).

      Importantly, a lot of the physiological meaningful spectral information rests between 1-50Hz(35), similarly to M/EEG recordings. Furthermore, meaningful information can be extracted at much higher frequencies. For instance, ventricular late potentials have a broader frequency band (~40-250Hz(35)). However, that’s not all, as further meaningful information can be extracted at even higher frequencies (>100Hz). For instance, the so-called high-frequency QRS seems to be highly informative for the early detection of myocardial ischemia and other cardiac abnormalities that may not yet be evident in the standard frequency range(36,37). Yet, the exact physiological mechanisms underlying the high-frequency QRS remain unclear (see (37) for a review discussing possible mechanisms). ”

      Tereshchenko, L. G., & Josephson, M. E. (2015). Frequency content and characteristics of ventricular conduction. Journal of electrocardiology, 48(6), 933-937.

      Kusayama, T., Wong, J., Liu, X. et al. Simultaneous noninvasive recording of electrocardiogram and skin sympathetic nerve activity (neuECG). Nat Protoc 15, 1853–1877 (2020). https://doi.org/10.1038/s41596-020-0316-6

      Clifford, G. D., & Azuaje, F. (2006). Advanced methods and tools for ECG data analysis (Vol. 10). P. McSharry (Ed.). Boston: Artech house.

      Qiu, S., Liu, T., Zhan, Z., Li, X., Liu, X., Xin, X., ... & Xiu, J. (2024). Revisiting the diagnostic and prognostic significance of high-frequency QRS analysis in cardiovascular diseases: a comprehensive review. Postgraduate Medical Journal, qgae064.

      Schlegel, T. T., Kulecz, W. B., DePalma, J. L., Feiveson, A. H., Wilson, J. S., Rahman, M. A., & Bungo, M. W. (2004, March). Real-time 12-lead high-frequency QRS electrocardiography for enhanced detection of myocardial ischemia and coronary artery disease. In Mayo Clinic Proceedings (Vol. 79, No. 3, pp. 339-350). Elsevier.

      (4) Page 10: At first glance, it is not quite clear what is meant by "processing option" in the text. Please clarify.

      Thank you for catching this! Upon re-reading, this is indeed a bit unclear. We have now replaced “processing options” with “slope fits” to make it clearer that we are referring to the percentage of effects based on the different slope fits.

      (5) The authors mention previous findings on age effects on neural 1/f activity (References Nr 5,8,27,39) that seem contrary to their own findings such as e.g., the mostly steepening of the slopes with age. Also, the authors discuss thoroughly why spectral slopes derived from MEG signals may differ from EEG signals. I encourage the authors to have a closer look at these studies and elaborate a bit more on why these studies differ in their conclusions on the age effects. For example, Tröndle et al. (2022, Ref. 39) investigated neural activity in children and young adults, hence, focused on brain maturation, whereas the CamCAN set only considers the adult lifespan. In a similar vein, others report age effects on 1/f activity in much smaller samples as reported here (e.g., Voytek et al., 2015).

      I believe taking these points into account by briefly discussing them, would strengthen the authors' claims and provide a more fine-grained perspective on aging effects on 1/f.

      The reviewer is making a very important point, as age-related differences in (neuro-)physiological activity are not necessarily strictly comparable or entirely linear across different age cohorts (e.g. age-related changes in alpha center frequency). We therefore added the suggested discussion points to the discussion section.

      “Differences in electric and magnetic field recordings aside, aperiodic activity may not change strictly linearly as we are ageing and studies looking at younger age groups (e.g. <22 years(44)) may capture different aspects of aging (e.g. brain maturation) than those looking at older subjects (>18 years; our sample). A recent report even shows some first evidence of an interesting putatively non-linear relationship with age in the sensorimotor cortex for resting recordings(59)”

      (6) The analysis of the working memory paradigm as described in the outlook-section of the discussion comes as a bit of a surprise as it has not been introduced before. If the authors want to convey with this study that, in general, aperiodic neural activity could be influenced by aperiodic cardiac activity, I recommend introducing this analysis and the results earlier in the manuscript than only in the discussion to strengthen their message.

      The reviewer is correct. This analysis really comes a bit out of the blue. However, this was also exactly the intention for placing this analysis in the discussion. As the reviewer correctly noted, the aim was to suggest “that, in general, aperiodic neural activity could be influenced by aperiodic cardiac activity”. We placed this outlook directly after the discussion of “(neuro-)physiological origins of aperiodic activity”, where we highlight the potential challenges of interpreting drug induced changes to M/EEG recordings. So the aim was to get the reader to think about whether age is the only feature affected by cardiac activity and then directly present some evidence that this might go beyond age.

      However, we have rethought this approach based on the reviewer's comments and moved that paragraph to the end of the results section. We now also introduce it at the end of the introduction, stating that:

      “But does the influence of cardiac activity on the aperiodic spectrum extend beyond age? In a preliminary analysis, we demonstrate that working memory load modulates the aperiodic spectrum of “pure” ECG recordings. The direction of this working memory effect mirrors previous findings on EEG data(5) suggesting that the impact of cardiac activity goes well beyond aging.”

      (7) The font in Figure 2 is a bit hard to read (especially in D). I recommend increasing the font sizes where necessary for better readability.

      We agree with the Reviewer and increased the font sizes accordingly.

      (8) Text in the discussion: Figure 3B on page 10 => shouldn't it be Figure 4?

      Thank you for catching this oversight. We have now corrected this mistake.

      (9) In the third section on page 10, the Figure labels seem to be confused. For example, Figure 4 E is supposed to show "steepening effects", which should be Figure 4B I believe.

      Please check the figure labels in this section to avoid confusion.

      Thank you for catching this oversight. We have now corrected this mistake.

      (10) Figure Legend 4 I), please check the figure labels in the text

      Thank you for catching this oversight. We have now corrected this mistake.

      Reviewer #3 (Recommendations for the authors):

      I have a number of suggestions for improving the manuscript, which I have divided by section in the following:

      ABSTRACT:

      I would suggest re-writing the first sentences to make it easier to read for non-expert readers: "The power of electrophysiologically measured cortical activity decays with an approximately 1/fX function. The slope of this decay (i.e. the spectral exponent, X) is modulated..."

      Thank you for the suggestion. We adjusted the sentence as suggested to make it easier for less technical readers to understand that “X” refers to the exponent.

      Including the age range that was studied in the abstract could be informative.

      Done as suggested.

      As an optional recommendation, I think it would increase the impact of the article if the authors note in the abstract that the current most commonly applied cardiac artifact reduction approaches don't resolve the issue for EEG data, likely due to an imperfect ability to separate the cardiac artifact from the neural activity with independent component analysis. This would highlight to the reader that they can't just expect to address these concerns by cleaning their data with typical cleaning methods.

      I think it would also be useful to convey in the abstract just how comprehensive the included analyses were (in terms of artifact reduction methods tested, different aperiodic algorithms and frequency ranges, and both MEG and EEG). Doing so would let the reader know just how robust the conclusions are likely to be.

      This is a brilliant idea! As suggested we added a sentence highlighting that simply performing an ICA may not be sufficient to separate cardiac contributions to M/EEG recordings and refer to the comprehensiveness of the performed analyses.

      INTRODUCTION:

      I would suggest re-writing the following sentence for readability: "In the past, aperiodic neural activity, other than periodic neural activity (local peaks that rise above the "power-law" distribution), was often treated as noise and simply removed from the signal"

      To something like: "In the past, aperiodic neural activity was often treated as noise and simply removed from the signal e.g. via pre-whitening, so that analyses could focus on periodic neural activity (local peaks that rise above the "power-law" distribution, which are typically thought to reflect neural oscillations).

      We are happy to follow that suggestion.

      Page 3: please provide the number of articles that were included in the examination of the percentage that remove cardiac activity, and note whether the included articles could be considered a comprehensive or nearly comprehensive list, or just a representative sample.

      We stated the exact number of articles in the methods section under Literature Analysis. However, we added it to the Introduction on page 3 as suggested by the reviewer. The selection of articles was done automatically, dependent on a list of pre-specified terms and exclusively focussed on articles that had terms related to aperiodic activity in their title (see Literature Analysis). Therefore, I would personally be hesitant in calling it a comprehensive or nearly comprehensive list of the general M/EEG literature as the analysis of aperiodic activity is still relatively niche compared to the more commonly investigated evoked potentials or oscillations. I think whether or not a reader perceives our analysis as comprehensive should be up to them to decide and does not reflect something I want to impose on them. This is exacerbated by the fact that the analysis of neural aperiodic activity has rapidly gained traction over the last years (see Figure 1D orange) and the literature analysis was performed almost 2 years ago and therefore, in my eyes, only represents a glimpse in the rapidly evolving field related to the analysis of aperiodic activity.

      Figure 1E-F: It's not completely clear that the "Cleaning Methods" part of the figure indicates just methods to clean the cardiac artifact (rather than any artifact). It also seems that ~40% of EEG studies do not apply any cleaning methods even from within the studies that do clean the cardiac artifact (if I've read the details correctly). This seems unlikely. Perhaps there should be a bar for "other methods", or "unspecified"? Having said that, I'm quite familiar with the EEG artifact reduction literature, and I would be very surprised if ~40% of studies cleaned the cardiac artifact using a different method to the methods listed in the bar graph, so I'm wondering if I've misunderstood the figure, or whether the data capture is incomplete / inaccurate (even though the conclusion that ICA is the most common method is almost certainly accurate).

      The cleaning methods indeed refer to cardiac activity specifically. This was, however, also mentioned in the caption of Figure 1: “We were further interested in determining which artifact rejection approaches were most commonly used to remove cardiac activity, such as independent component analysis (ICA(22)), singular value decomposition (SVD(23)), signal space separation (SSS(24)), signal space projections (SSP(25)) and denoising source separation (DSS(26)).” and in the methods section under Literature Analysis. Nevertheless, we adjusted Figure 1E-F to make it more obvious that the described cleaning methods relate only to the ECG. Aside from using blind source separation techniques such as ICA, a good number of studies mentioned that they cleaned their data based on visual inspection (which was not further considered). Furthermore, it has to be noted that studies were only marked as having separated cardiac from neural activity when this was mentioned explicitly.

      RESULTS:

      Page 6: I would delete the "from a neurophysiological perspective" clause, which makes the sentence more difficult to read and isn't so accurate (frequencies 13-25Hz would probably more commonly be considered mid-range rather than low or high). Additionally, both frequency ranges include 15Hz, but the next sentence states that the ranges were selected to avoid the knee at 15Hz, which seems to be a contradiction. Could the authors explain in more detail how the split addresses the 15Hz knee?

      We removed the “from a neurophysiological perspective” clause as suggested. With regard to the “knee” at ~15Hz, I would like to refer the reviewer to Supplementary Figure S1. The knee frequency varies substantially across subjects, so splitting the data at a single exact frequency did not seem appropriate. Additionally, we found only spurious significant age-related variations in knee frequency (i.e. in only one of the 4 datasets; not shown).

      Furthermore, we wanted to better connect our findings to our MEG results in Figure 4 and also give the readers a holistic overview of how different frequency ranges in the aperiodic ECG would be affected by age. So to fulfill all of these objectives we decided to fit slopes with respective upper/lower bounds around a range of 5Hz above and below the average 15Hz Knee Frequency across datasets.

      The later parts of this same paragraph refer to a vast amount of different frequency ranges, but only the "low" and "high" frequency ranges were previously mentioned. Perhaps the explanation could be expanded to note that multiple lower and upper bounds were tested within each of these low and high frequency windows?

      This is a good catch; we adjusted the sentence as suggested. We now write: “... slopes were fitted individually to each subject's power spectrum in several lower (0.25 – 20 Hz) and higher (10-145 Hz) frequency ranges.”
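
      As a rough illustration of what fitting a slope within a given frequency range amounts to, the following minimal sketch estimates the slope by a straight-line fit in log-log space. It is a simplification under stated assumptions and not the pipeline used for the manuscript (which mentions parametrization algorithms such as IRASA and FOOOF); the function name `fit_spectral_slope` and the synthetic spectrum are hypothetical.

```python
import math


def fit_spectral_slope(freqs, power, fmin, fmax):
    """Least-squares slope of log10(power) vs. log10(freq) within [fmin, fmax]."""
    # Keep only the frequency bins inside the requested fitting range.
    pts = [(math.log10(f), math.log10(p))
           for f, p in zip(freqs, power) if fmin <= f <= fmax]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two frequency bins in the fitting range")
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return sxy / sxx


# Synthetic 1/f^X spectrum with X = 1.5 (illustrative values, not real data).
freqs = [0.25 * k for k in range(1, 581)]  # 0.25 ... 145 Hz
power = [f ** -1.5 for f in freqs]

# One lower and one higher fitting range, mirroring the ranges named above.
low_slope = fit_spectral_slope(freqs, power, 0.25, 20.0)
high_slope = fit_spectral_slope(freqs, power, 10.0, 145.0)
```

      In this sketch the returned slope is negative for a decaying spectrum, so the spectral exponent X corresponds to the negated slope. Real spectra with a knee or with oscillatory peaks would need a more careful parametrization than a single straight line.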

      The following two sentences seem to contradict each other: "Overall, spectral slopes in lower frequency ranges were more consistently related to heart rate variability indices(> 39.4% percent of all investigated indices)" and: "In the lower frequency range (0.25 - 20Hz), spectral slopes were consistently related to most measures of heart rate variability; i.e. significant effects were detected in all 4 datasets (see Figure 2D)." (39.4% is not "most").

      The reviewer is correct in stating that 39.4% is not most. However, the 39.4% is the lowest bound and only refers to 1 dataset. In the other 3 datasets the percentage of effects was above 64% which can be categorized as “most” i.e. above 50%. We agree that this was a bit ambiguous in the sentence so we added the other percentages as well as a reference to Figure 2D to make this point clearer.

      Figure 2D: it isn't clear what the percentages in the semi-circles reflect, nor why some semi-circles are more full circles while others are only quarter circles.

      The percentages in the semi-circles reflect the amount of effects (marked in red) and null effects (marked in green) per dataset, when viewed as average across the different measures of HRV. Sometimes fewer effects were found for some frequency ranges, resulting in quarter circles instead of semi-circles.

      Page 8: I think the authors could make it more clear that one of the conditions they were testing was the ECG component of the EEG data (extracted by ICA then projected back into the scalp space for the temporal response function analysis).

      As suggested by the reviewer, we adjusted our wording and replaced the arguably somewhat ambiguous “... projected back separately” with “... projected back into the sensor space”. We thank the reviewer for this recommendation, as it does indeed make it easier to understand the procedure.

      “After pre-processing (see Methods) the data was split in three conditions using an ICA(22). Independent components that were correlated (at r > 0.4; see Methods: MEG/EEG Processing - pre-processing) with the ECG electrode were either not removed from the data (Figure 3ABCD - blue), removed from the data (Figure 3ABCD - orange) or projected back into the sensor space (Figure 3ABCD - green).”
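
      The three conditions in the quoted passage can be illustrated with a small, dependency-free sketch. It assumes that component time courses and a mixing matrix are already available from an ICA, and it thresholds the absolute correlation because the sign of an independent component is arbitrary (an assumption of this sketch; the text only states r > 0.4). All function names and numbers are hypothetical and do not reproduce the MNE-Python workflow.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ecg_component_indices(sources, ecg, threshold=0.4):
    """Indices of components whose |r| with the ECG exceeds the threshold."""
    return [i for i, s in enumerate(sources)
            if abs(pearson_r(s, ecg)) > threshold]


def back_project(mixing, sources, keep):
    """Rebuild sensor data (channels x samples) from the kept components only."""
    n_samples = len(sources[0])
    return [[sum(row[i] * sources[i][t] for i in keep) for t in range(n_samples)]
            for row in mixing]


# Toy example: two components, two sensors (all values are made up).
ecg = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
sources = [
    [0.1, 0.9, 0.0, -1.1, 0.1, 1.0, -0.1, -0.9],  # cardiac-like component
    [1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0],   # component unrelated to the ECG
]
mixing = [[0.5, 1.0], [0.2, 0.8]]  # sensors x components

cardiac = ecg_component_indices(sources, ecg)
neural = [i for i in range(len(sources)) if i not in cardiac]

not_rejected = back_project(mixing, sources, range(len(sources)))  # ECG kept
ecg_rejected = back_project(mixing, sources, neural)               # ECG removed
ecg_component = back_project(mixing, sources, cardiac)             # ECG only
```

      Because the back-projection is linear, the "not rejected" data in this toy example equal the sum of the "ECG rejected" and "ECG component" data, which is why the latter two can be analyzed as separate conditions.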

      Figure 4A: standardized beta coefficients for the relationship between age and spectral slope could be noted to provide improved clarity (if I'm correct in assuming that is what they reflect).

      This was indeed shown in Figure 4A and noted in the color bar as “average beta (standardized)”. We do not specifically highlight this in the text, because the exact coefficients would depend both on the analyzed frequency range and the selected electrodes.

      Figure 4I: The regressions explained at this point seem to contain a very large number of potential predictors, as I'm assuming it includes all sensors for both the ECG component and ECG rejected conditions? (if that is not the case, it could be explained in greater detail). I'm also not sure about the logic of taking a complete signal, decomposing it with ICA to separate out the ECG and non-ECG signals, then including them back into the same regression model. It seems that there could be some circularity or redundancy in doing so. However, I'm not confident that this is an issue, so would appreciate the authors explaining why this is a valid approach (if that is the case).

      After observing significant effects both in the MEG<sub>ECG component</sub> and MEG<sub>ECG rejected</sub> conditions in similar frequency bands we wanted to understand whether or not these age-related changes are statistically independent. To test this we added both variables as predictors in a regression model (thereby accounting for the influence of the other in relation to age). The regression models we performed were therefore actually not very complex. They were built using only two predictors, namely the data (in a specific frequency range) averaged over channels on which we noticed significant effects in the ECG rejected and ECG components data respectively (Wilkinson notation: age ~ 1 + ECG rejected + ECG components). This was also described in the results section stating that: “To see if MEG<sub>ECG rejected</sub> and MEG<sub>ECG component</sub> explain unique variance in aging at frequency ranges where we noticed shared effects, we averaged the spectral slope across significant channels and calculated a multiple regression model with MEG<sub>ECG component</sub> and MEG<sub>ECG rejected</sub> as predictors for age (to statistically control for the effect of MEG<sub>ECG component</sub>s and MEG<sub>ECG rejected</sub> on age). This analysis was performed to understand whether the observed shared age-related effects (MEG<sub>ECG rejected</sub> and MEG<sub>ECG component</sub>) are in(dependent).”  

      We hope this explanation solves the previous misunderstanding.
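
      The two-predictor model described above (Wilkinson notation: age ~ 1 + ECG rejected + ECG components) can be sketched as follows. This is a minimal, dependency-free illustration on made-up numbers, not the analysis code used for the manuscript; the helper `ols` and all values are hypothetical.

```python
def ols(y, predictors):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    n = len(y)
    # Design matrix rows: [1, x1, x2, ...].
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(rows[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))]
           for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(k):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[c])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Made-up per-subject slopes, averaged over channels (illustrative only).
slope_rejected = [-1.2, -0.9, -1.5, -1.1, -0.8, -1.3]
slope_component = [-2.0, -2.4, -1.8, -2.6, -2.1, -2.3]
# Synthetic, noise-free "age" so that the known weights can be recovered.
age = [40.0 + 10.0 * r - 5.0 * c
       for r, c in zip(slope_rejected, slope_component)]

intercept, b_rejected, b_component = ols(age, [slope_rejected, slope_component])
```

      Each coefficient in such a model describes the association of one predictor with age while the other predictor is held constant, which is the sense in which the two conditions are tested for unique contributions.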

      The explanation of results for relationships between spectral slopes and aging reported in Figure 4 refers to clusters of effects, but the statistical inference methods section doesn't explain how these clusters were determined.

      The wording of “cluster” was used to describe a “category” of effects e.g. null effects. We changed the wording from “cluster” to “category” to make this clearer stating now that: “This analysis, which is depicted in Figure 4, shows that over a broad amount of individual fitting ranges and sensors, aging resulted in a steepening of spectral slopes across conditions (see Figure 4E) with “steepening effects” observed in 25% of the processing options in MEG<sub>ECG not rejected</sub> , 0.5% in MEG<sub>ECG rejected</sub>, and 60% for MEG<sub>ECG components</sub>. The second largest category of effects were “null effects” in 13% of the options for MEG<sub>ECG not rejected</sub> , 30% in MEG<sub>ECG rejected</sub>, and 7% for MEG<sub>ECG components</sub>. ”

      Page 12: can the authors clarify whether these age related steepenings of the spectral slope in the MEG are when the data include the ECG contribution, or when the data exclude the ECG? (clarifying this seems critical to the message the authors are presenting).

      We apologize for not making this clearer. We now write: “This analysis also indicates that a vast majority of observed effects irrespective of condition (ECG components, ECG not rejected, ECG rejected) show a steepening of the spectral slope with age across sensors and frequency ranges.”

      Page 13: I think it would be useful to describe how much variance was explained by the MEG-ECG rejected vs MEG-ECG component conditions for a range of these analyses, so the reader also has an understanding of how much aperiodic neural activity might be influenced by age (vs if the effects are really driven mostly by changes in the ECG).

      With regard to the explained variance, I think that the very important question of how strongly age influences changes in aperiodic activity is a topic better suited for a meta-analysis, as the effect sizes seem to vary largely depending on the sample. For EEG, for example, results in the literature were reported at r=-0.08 (Cesnaite et al. 2023), r=-0.26 (Cellier et al. 2021), r=-0.24/r=-0.28/r=-0.35 (Hill et al. 2022) and r=0.5/r=0.7 (Voytek et al. 2015). I would refer the reader/reviewer to the standardized beta coefficients depicted in Figure 4A as a measure of effect size in the current study.

      Cellier, D., Riddle, J., Petersen, I., & Hwang, K. (2021). The development of theta and alpha neural oscillations from ages 3 to 24 years. Developmental cognitive neuroscience, 50, 100969.

      Cesnaite, E., Steinfath, P., Idaji, M. J., Stephani, T., Kumral, D., Haufe, S., ... & Nikulin, V. V. (2023). Alterations in rhythmic and non‐rhythmic resting‐state EEG activity and their link to cognition in older age. NeuroImage, 268, 119810.

      Hill, A. T., Clark, G. M., Bigelow, F. J., Lum, J. A., & Enticott, P. G. (2022). Periodic and aperiodic neural activity displays age-dependent changes across early-to-middle childhood. Developmental Cognitive Neuroscience, 54, 101076.

      Voytek, B., Kramer, M. A., Case, J., Lepage, K. Q., Tempesta, Z. R., Knight, R. T., & Gazzaley, A. (2015). Age-related changes in 1/f neural electrophysiological noise. Journal of Neuroscience, 35(38), 13257-13265.

      Also, if there are specific M/EEG sensors where the 1/f activity does relate strongly to age, it would be worth noting these, so future research could explore those sensors in more detail.

      I think it is difficult to make a clear claim about this for MEG data, as the exact location or type of the sensor may differ across manufacturers. Such a statement could be made more easily for source-projected data or if EEG electrodes were available, where the location would be standardized, e.g. according to the 10-20 system.

      DISCUSSION:

      Page 15: Please change the wording of the following sentence, as the way it is currently worded seems to suggest that the authors of the current manuscript have demonstrated this point (which I think is not the case): "The authors demonstrate that EEG typically integrates activity over larger volumes than MEG, resulting in differently shaped spectra across both recording methods."

      Apologies for the oversight! The reviewer is correct: it was in fact not we who showed this, but the authors of the cited manuscript. We corrected the sentence as suggested, which now states:

      “Bénar et al. demonstrate that EEG typically integrates activity over larger volumes than MEG, resulting in differently shaped spectra across both recording methods.”

      Page 16: The authors mention the results can be sensitive to the application of SSS to clean the MEG data, but not ICA. I think it would be sensitive to the application of either SSS or ICA?

      This is correct and is also supported by Figure S7, as differences in ICA thresholds also affect the detection of age-related effects. We therefore adjusted the related sentences, which now state:

      “ In case of the MEG signal this may include the application of Signal-Space-Separation algorithms (SSS(24,55)), different thresholds for ICA component detection (see Figure S7), high and low pass filtering, choices during spectral density estimation (window length/type etc.), different parametrization algorithms (e.g. IRASA vs FOOOF) and selection of frequency ranges for the aperiodic slope estimation.”

      It would be worth clarifying that the linked mastoid re-reference alone has been proposed to cancel out the ECG signal, rather than that a linked-mastoid re-reference improves the performance of the ICA separation (which could be inferred by the explanation as it's currently written).

      This is correct and we adjusted the sentence accordingly, stating now that:

      “ Previous work(12,56) has shown that a linked mastoid reference alone was particularly effective in reducing the impact of ECG related activity on aperiodic activity measured using EEG. “

      The issue of the number of EEG channels could probably just be noted as a potential limitation, as could the issue of neural activity being mixed into the ECG component (although this does pose a potential confound to the M/EEG without ECG condition, I suspect it wouldn't be critical).

      This is indeed a very fair point, as a larger number of electrodes would probably make it easier to isolate ECG components in the EEG, which may be the reason why the separation did not work so well in our case. However, this is ultimately an empirical question, so we highlighted it in the discussion section stating that: “Difficulties in removing ECG related components from EEG signals via ICA might be attributable to various reasons such as the number of available sensors or assumptions related to the non-gaussianity of the underlying sources. Further understanding of this matter is highly important given that ICA is the most widely used procedure to separate neural from peripheral physiological sources. ”

      OUTLOOK:

      Page 19: Although there has been a recent trend to control for 1/f activity when examining oscillatory power, recent research suggests that this should only be implemented in specific circumstances, otherwise the correction causes more of a confound than the issue does. It might be worth considering this point with regards to the final recommendation in the Outlook section: Brake, N., Duc, F., Rokos, A., Arseneau, F., Shahiri, S., Khadra, A., & Plourde, G. (2024). A neurophysiological basis for aperiodic EEG and the background spectral trend. Nature Communications, 15(1), 1514.

      We want to thank the reviewer for recommending this very interesting paper! The authors of said paper present compelling evidence showing that, while peak detection above an aperiodic trend using methods like FOOOF or IRASA is a prerequisite to determine the presence of oscillatory activity, it’s not necessarily straightforward to determine which detrending approach should be applied to determine the actual power of an oscillation. Furthermore, the authors suggest that wrongfully detrending may cause larger errors than not detrending at all. We therefore added a sentence stating that: “However, whether or not periodic activity (after detection) should be detrended using approaches like FOOOF or IRASA still remains disputed, as incorrectly detrending the data may cause larger errors than not detrending at all(75).”

      RECOMMENDATIONS:

      Page 20: "measure and account for" seems like it's missing a word, can this be re-written so the meaning is more clear?

      Done as suggested. The sentence now states: “To better disentangle physiological and neural sources of aperiodic activity, we propose the following steps to (1) measure and (2) account for physiological influences.”

      I would re-phrase "doing an ICA" to "reducing cardiac artifacts using ICA" (this wording could be changed in other places also).

      I do not like to describe cardiac or ocular activity as artifactual per se. This is also why I used quotation marks whenever I mention the word “artifact” in association with the ECG or EOG. However, I do understand that the wording of “doing an ICA” is a bit sloppy. We therefore reworded it throughout the manuscript to e.g. “separating cardiac from neural sources using an ICA” and “separating physiological from neural sources using an ICA”.

      I would additionally note that even if components are identified as unambiguously cardiac, it is still likely that neural activity is mixed in, and so either subtracting or leaving the component will both be an issue (https://doi.org/10.1101/2024.06.06.597688). As such, even perfect identification of whether components are cardiac or not would still mean the issue remains (and this issue is also consistent across a considerable range of component based methods). Furthermore, current methods including wavelet transforms on the ICA component still do not provide good separation of the artifact and neural activity.

      This is definitely a fair point and we also highlight this in our recommendations under 3 stating that:

      “However, separating physiological from neural sources using an ICA is no guarantee that peripheral physiological activity is fully removed from the cortical signal. Even more sophisticated ICA based methods that e.g. apply wavelet transforms on the ICA components may still not provide a good separation of peripheral physiological and neural activity(76,77). This turns the process of deciding whether or not an ICA component is e.g. either reflective of cardiac or neural activity into a challenging problem. For instance, when we only extract cardiac components using relatively high detection thresholds (e.g. r > 0.8), we might end up misclassifying residual cardiac activity as neural. In turn, we can’t always be sure that using lower thresholds won’t result in misinterpreting parts of the neural effects as cardiac. Both ways of analyzing the data can potentially result in misconceptions.”

      Castellanos, N. P., & Makarov, V. A. (2006). Recovering EEG brain signals: Artifact suppression with wavelet enhanced independent component analysis. Journal of neuroscience methods, 158(2), 300-312.

      Bailey, N. W., Hill, A. T., Godfrey, K., Perera, M. P. N., Rogasch, N. C., Fitzgibbon, B. M., & Fitzgerald, P. B. (2024). EEG is better when cleaning effectively targets artifacts. bioRxiv, 2024-06.

      METHODS:

      Pre-processing, page 24: I assume the symmetric setting of fastica was used (rather than the deflation setting), but this should be specified.

      Indeed the reviewer is correct: we used the standard setting of fastICA implemented in MNE-Python, which calls the FastICA implementation in sklearn that by default uses the “parallel” (i.e. symmetric) algorithm to compute an ICA. We added this information to the text accordingly, stating that:

      “For extracting physiological “artifacts” from the data, 50 independent components were calculated using the fastica algorithm(22) (implemented in MNE-Python version 1.2; with the parallel/symmetric setting; note: 50 components were selected for MEG for computational reasons; for the analysis of EEG data no threshold was applied).”

      Temporal response functions, page 26: can the authors please clarify whether the TRF is computed against the ECG signal for each electrode or sensor independently, or if all electrodes/sensors are included in the analysis concurrently? I'm assuming it was computed for each electrode and sensor separately, since the TRF was computed in both the forward and backward direction (perhaps the meaning of forward and backward could also be explained in more detail, i.e. using the ECG to predict the EEG signal, or using the EEG signal to predict the ECG signal?).

      A TRF can also be conceptualized as a multiple regression model over time lags. This means that we used all channels to compute the forward and backward models. In the case of the forward model we predicted the signal of the M/EEG channels in a multivariate regression model using the ECG electrode as predictor. In case of the backward model we predicted the ECG electrode based on the signal of all M/EEG channels. The forward model was used to depict the time window at which the ECG signal was encoded in the M/EEG recording, which appears at 0 time lags indicating volume conduction. The backward model was used to see how much information of the ECG was decodable by taking the information of all channels.

      We tried to further clarify this approach in the methods section stating that:

      “We calculated the same model in the forward direction (encoding model; i.e. predicting M/EEG data in a multivariate model from the ECG signal) and backward direction (decoding model; i.e. predicting the ECG signal using all M/EEG channels as predictors).”
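
      The distinction between the two directions can be made concrete with a minimal sketch. It is not the authors' implementation: it assumes white-noise "ECG", three hypothetical channels that receive the ECG instantaneously with made-up gains, ridge regression solved by plain Gauss-Jordan elimination, and in-sample (not cross-validated) decoding accuracy. The forward model regresses each channel on time-lagged copies of the ECG; the backward model regresses the ECG on time-lagged copies of all channels.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def lagged_design(signals, lags):
    """Design matrix with one column per (signal, lag); out-of-range samples are zero."""
    n = len(signals[0])
    return [[sig[t - lag] if 0 <= t - lag < n else 0.0
             for sig in signals for lag in lags]
            for t in range(n)]

def ridge_fit(X, y, alpha=1e-3):
    """Solve (X'X + alpha*I) w = X'y by Gauss-Jordan elimination with partial pivoting."""
    p = len(X[0])
    A = []
    for i in range(p):
        row = [sum(r[i] * r[j] for r in X) for j in range(p)]
        row[i] += alpha
        row.append(sum(r[i] * t for r, t in zip(X, y)))
        A.append(row)
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

random.seed(1)
n_samples = 500
lags = [0, 1, 2]          # time lags in samples (assumption)
gains = [0.8, 0.5, 0.2]   # hypothetical instantaneous ECG leakage per channel

ecg = [random.gauss(0, 1) for _ in range(n_samples)]
channels = [[g * e + random.gauss(0, 0.1) for e in ecg] for g in gains]

# Forward (encoding) model: lagged ECG -> each M/EEG channel.
X_fwd = lagged_design([ecg], lags)
forward_weights = [ridge_fit(X_fwd, ch) for ch in channels]
peak_lags = [lags[max(range(len(lags)), key=lambda i: abs(w[i]))]
             for w in forward_weights]

# Backward (decoding) model: lagged copies of all channels -> ECG.
X_bwd = lagged_design(channels, lags)
w_bwd = ridge_fit(X_bwd, ecg)
ecg_hat = [sum(wi * xi for wi, xi in zip(w_bwd, row)) for row in X_bwd]
decoding_r = pearson(ecg_hat, ecg)

print("forward peak lag per channel:", peak_lags)
print("backward decoding r:", round(decoding_r, 3))
```

      Because the simulated leakage is instantaneous, the forward weights peak at lag 0, the pattern the response above attributes to volume conduction; the backward model pools all channels into a single ECG estimate.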

      Page 27: the ECG data was fit using a knee, but it seems the EEG and MEG data was not.

      Does this difference pose any potential confound to the conclusions drawn? (having said this, Figure S4 suggests perhaps a knee was tested in the M/EEG data, which should perhaps be explained in the text also).

      This was indeed tested in a previous review round to ensure that our results are not dependent on the presence/absence of a knee in the data. We therefore added figure S4, but forgot to actually add a description in the text. We are sorry for this oversight and added a paragraph to S1 accordingly:

      “Using FOOOF(5), we also investigated the impact of different slope fitting options (fixed vs. knee model fits) on the aperiodic age relationship (see Supplementary Figure S4). The results that we obtained from these analyses using FOOOF offer converging evidence with our main analysis using IRASA.”
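
      Why the choice between a fixed and a knee fit can matter is easy to show on synthetic data. The sketch below is an illustration under stated assumptions, not a reproduction of Figure S4: it uses the aperiodic form L(F) = b - log10(k + F^χ), which reduces to the fixed model when k = 0, with made-up parameters (χ = 2, k = 100, 1-100 Hz) and a hypothetical helper `fit_fixed` that fits a straight line in log-log space.

```python
import math

def aperiodic_log_power(freq, offset, exponent, knee=0.0):
    """log10 power of the aperiodic component; knee=0 gives the fixed model."""
    return offset - math.log10(knee + freq ** exponent)

def fit_fixed(freqs, log_power):
    """Least-squares line in log-log space; returns (offset, exponent)."""
    x = [math.log10(f) for f in freqs]
    n = len(x)
    mx, my = sum(x) / n, sum(log_power) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, log_power))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, -slope

freqs = list(range(1, 101))   # 1-100 Hz, hypothetical fitting range
true_exponent = 2.0

no_knee = [aperiodic_log_power(f, 0.0, true_exponent) for f in freqs]
with_knee = [aperiodic_log_power(f, 0.0, true_exponent, knee=100.0) for f in freqs]

_, exp_no_knee = fit_fixed(freqs, no_knee)
_, exp_with_knee = fit_fixed(freqs, with_knee)

print("fixed fit, spectrum without knee:", round(exp_no_knee, 3))
print("fixed fit, spectrum with knee:   ", round(exp_with_knee, 3))
```

      The fixed fit recovers the exponent when no knee is present and underestimates it when one is, which is the kind of model dependence the fixed-versus-knee comparison is meant to rule out.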

      Page 32: my understanding of the result reported here is that cleaning with ICA provided better sensitivity to the effects of age on 1/f activity than cleaning with SSS. Is this accurate? I think this could also be reported in the main manuscript, as it will be useful to researchers considering how to clean their M/EEG data prior to analyzing 1/f activity.

      The reviewer is correct in stating that we overall detected slightly more “significant” effects, when not additionally cleaning the data using SSS. However, I am a bit wary of recommending omitting the use of SSS maxfilter solely based on this information. It can very well be that the higher quantity of effects (when not employing SSS maxfilter) stems from other physiological sources (e.g. muscle activity) that are correlated with age and removed when applying SSS maxfiltering. I think that just conditioning the decision of whether or not maxfilter is applied based on the amount or size of effects may not be the best idea. Instead I think that the applicability of maxfilter for research questions related to aperiodic activity should be the topic of additional methodological research. We therefore now write in Text S1:

      “Considering that we detected fewer and weaker aperiodic effects when using SSS maxfilter, is it advisable to omit maxfilter when analyzing aperiodic signals? We don’t think that we can make such a judgment based on our current results. This is because it's unclear whether or not the reduction of effects stems from an additional removal of peripheral information (e.g. muscle activity; that may be correlated with aging) or is induced by the SSS maxfiltering procedure itself. As the use of maxfilter in detecting changes of aperiodic activity has not, to our knowledge, been the subject of analysis, we suggest that this should be the topic of additional methodological research.”

      Page 39, Figure S6 and Figure S8: Perhaps the caption could also briefly explain the difference between maxfilter set to false vs true? I might have missed it, but I didn't gain an understanding of what varying maxfilter would mean.

      Figure S6 shows the effect of ageing on the spectral slope averaged across all channels. The maxfilter set to false in AB) means that no maxfiltering using SSS was performed vs. in CD) where the data was additionally processed using the SSS maxfilter algorithm. We now describe this more clearly by writing in the caption:

      “Supplementary Figure S6: Age-related changes in aperiodic brain activity are most prominently explained by cardiac components irrespective of maxfiltering the data using signal space separation (SSS) or not. AC) Age was used to predict the spectral slope (fitted at 0.1-145Hz) averaged across sensors at rest in three different conditions (ECG components not rejected [blue], ECG components rejected [orange], ECG components only [green]).”

    1. eLife Assessment

      This comprehensive study presents important findings that delineate how specific dopaminergic neurons (DANs) instruct aversive learning in Drosophila larvae exposed to high salt through an integration of behavioral experiments, imaging, and connectomic analysis. The work reveals how a numerically minimal circuit achieves remarkable functional complexity, with redundancies and synergies within the DL1 cluster that challenge our understanding of how few neurons generate learning behaviors. By establishing a framework for sensory-driven learning pathways, the study makes a compelling and substantial contribution to understanding associative conditioning while demonstrating conservation of learning mechanisms across Drosophila developmental stages.

    2. Reviewer #1 (Public review):

      Summary:

      In this paper Weber et al. investigate the role of 4 dopaminergic neurons of the Drosophila larva in mediating the association between an aversive high-salt stimulus and a neutral odor. The 4 DANs belong to the DL1 cluster and innervate non-overlapping compartments of the mushroom body, distinct from those involved in appetitive associative learning. Using specific driver lines for individual neurons, the authors show that activation of the DAN-g1 is sufficient to mimic an aversive memory and it is also necessary to form a high-salt memory of full strength, although optogenetic silencing of this neuron has only a partial phenotype. The authors use calcium imaging to show that the DAN-g1 is not the only DAN responding to salt. DAN-c1 and d1 also respond to salt, but they seem to play no role for the associative memory. DAN-f1, which does not respond to salt, is able to lead to the formation of a memory (if optogenetically activated), but it is not necessary for the salt-odor memory formation in normal conditions. However, when silenced together with DAN-g1, it enhances the memory deficit of DAN-g1. Overall, this work brings evidence of a complex interaction between DL1 DANs in both the encoding of salt signals and their teaching role in associative learning, with none of them being individually necessary and sufficient for both functions.

      Strengths:

      Overall, the manuscript contributes interesting results that are useful for understanding the organization and function of the dopaminergic system. The behavioral role of the specific DANs is accessed using specific driver lines which allow their function to be tested individually and in pairs. Moreover, the authors perform calcium imaging to test whether DANs are activated by salt, a prerequisite for inducing a negative association to it. Proper genetic controls are carried across the manuscript.

      Weaknesses:

      The authors use two different approaches to silence dopaminergic neurons: optogenetics and induction of apoptosis. The results are not always consistent, but the authors discuss these differences appropriately. In general, the optogenetic approach is more appropriate as developmental compensations are not of major interest for the question investigated.

      The physiological data would suggest the role of a certain subset of DANs in salt-odor association, but a different partially overlapping set is necessary in behavioral assays (with a partial phenotype). No manipulation completely abolishes the salt-odor association, leaving important open questions on the identity of the neural circuits involved in this behavior.

      The EM data analysis reveals a non-trivial organization of sensory inputs into DANs, but it is difficult to extrapolate a link to the functional data presented in the paper.

    3. Reviewer #2 (Public review):

      Summary:

      In this work the authors show that dopaminergic neurons (DANs) from the DL1 cluster in Drosophila larvae are required for the formation of aversive memories. DL1 DANs complement pPAM cluster neurons which are required for the formation of attractive memories. This shows the compartmentalized network organization of how an insect learning center (the mushroom body) encodes memory by integrating olfactory stimuli with aversive or attractive teaching signals. Interestingly, the authors found that the 4 main dopaminergic DL1 neurons act partially redundantly, and that single cell ablation did not result in aversive memory defects. However, ablation or silencing of a specific DL1 subset (DAN-f1,g1) resulted in reduced salt aversion learning, which was specific to salt but no other aversive teaching stimuli tested. Importantly, activation of these DANs using an optogenetic approach was also sufficient to induce aversive learning in the presence of high salt. Together with the functional imaging of salt and fructose responses of the individual DANs and the implemented connectome analysis of sensory (and other) inputs to DL1/pPAM DANs this represents a very comprehensive study linking the structural, functional and behavioral role of DL1 DANs. This provides fundamental insight into the function of a simple yet efficiently organized learning center which displays highly conserved features of integrating teaching signals with other sensory cues via dopaminergic signaling.

      Strengths:

      This is a very careful, precise and meticulous study identifying the main larval DANs involved in aversive learning using high salt as a teaching signal. This is highly interesting because it allows us to define the cellular substrates and pathways of aversive learning down to the single cell level in a system without much redundancy. It therefore sets the basis to conduct even more sophisticated experiments and together with the neat connectome analysis opens the possibility of unraveling different sensory processing pathways within the DL1 cluster and integration with the higher order circuit elements (Kenyon cells and MBONs). The authors' claims are well substantiated by the data and balanced, putting their data in the appropriate context. The authors also implemented neat pathway analyses using the larval connectome data to its full advantage, thus providing network pathways that contribute towards explaining the obtained results.

      Weaknesses:

      Previous comments were fully addressed by the authors.

    4. Reviewer #3 (Public review):

      The study of Weber et al. provides a thorough investigation of the roles of four individual dopamine neurons for aversive associative learning in the Drosophila larva. They focus on the neurons of the DL-1 cluster which already have been shown to signal aversive teaching signals. But the authors go beyond the previous publications and test whether each of these dopamine neurons responds to salt or sugar, is necessary for learning about salt, bitter, or sugar, and is sufficient to induce a memory when optogenetically activated. In addition, previously published connectomic data is used to analyze the synaptic input to each of these dopamine neurons. The authors conclude that the aversive teaching signal induced by salt is distributed across the four DL-1 dopamine neurons, with two of them, DAN-f1 and DAN-g1, being particularly important. Overall, the experiments are well designed and performed, support the authors' conclusions, and deepen our understanding of the dopaminergic punishment system.

      Strengths:

      (1) This study provides, at least to my knowledge, the first in vivo imaging of larval dopamine neurons in response to tastants. Although the selection of tastants is limited, the results close an important gap in our understanding of the function of these neurons.

      (2) The authors performed a large number of experiments to probe for the necessity of each individual dopamine neuron, as well as combinations of neurons, for associative learning. This includes two different training regimens (1 or 3 trials), three different tastants (salt, quinine and fructose) and two different effectors, one ablating the neuron, the other one acutely silencing it. This thorough work is highly commendable, and the results prove that it was worth it. The authors find that only one neuron, DAN-g1, is partially necessary for salt learning when acutely silenced, whereas a combination of two neurons, DAN-f1 and DAN-g1, are necessary for salt learning when either being ablated or silenced.

      (3) In addition, the authors probe whether any of the DL-1 neurons is sufficient for inducing an aversive memory. They found this to be the case for two of the neurons, largely confirming previous results obtained by a different learning paradigm, parameters and effector.

      (4) This study also takes into account connectomic data to analyze the sensory input that each of the dopamine neurons receives. This analysis provides a welcome addition to previous studies and helps to gain a more complete understanding. The authors find large differences in inputs that each neuron receives, and little overlap in input that the dopamine neurons of the "aversive" DL-1 cluster and the "appetitive" pPAM cluster seem to receive.

      (5) Finally, the authors try to link all the gathered information in order to describe an updated working model of how aversive teaching signals are carried by dopamine neurons to the larva's memory center. This includes important comparisons both between two different aversive stimuli (salt and nociception) and between the larval and adult stages.

    5. Author response:

      The following is the authors’ response to the original reviews

      Public reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this paper, Weber et al. investigate the role of 4 dopaminergic neurons of the Drosophila larva in mediating the association between an aversive high-salt stimulus and a neutral odor. The 4 DANs belong to the DL1 cluster and innervate non-overlapping compartments of the mushroom body, distinct from those involved in appetitive associative learning. Using specific driver lines, they show that activation of the DAN-g1 is sufficient to mimic an aversive memory and it is also necessary to form a high-salt memory of full strength, although optogenetic silencing of this neuron only partially affects the performance index. The authors use calcium imaging to show that the DAN-g1 is not the only one that responds to salt. DAN-c1 and d1 also respond to salt, but they seem to play no role in the assays tested. DAN-f1, which does not respond to salt, is able to lead to the formation of memory (if optogenetically activated), but it is not necessary for the salt-odor memory formation in normal conditions. However, silencing of DAN-f1 together with DAN-g1, enhances the memory deficit of DAN-g1.

      Strengths:

      The paper therefore reveals that also in the Drosophila larva as in the adult, rewards and punishments are processed by exclusive sets of DANs and that a complex interaction between a subset of DANs mediates salt-odor association.

      Overall, the manuscript contributes valuable results that are useful for understanding the organization and function of the dopaminergic system. The behavioral role of the specific DANs is accessed using specific driver lines which allow for testing of their function individually and in pairs. Moreover, the authors perform calcium imaging to test whether DANs are activated by salt, a prerequisite for inducing a negative association with it. Proper genetic controls are carried across the manuscript.

      Weaknesses:

      The authors use two different approaches to silence dopaminergic neurons: optogenetics and induction of apoptosis. The results are not always consistent, and the authors could improve the presentation and interpretation of the data. Specifically, optogenetics seems a better approach than apoptosis, which can affect the overall development of the system, but apoptosis experiments are used to set the grounds of the paper.

      The physiological data would suggest the role of a certain subset of DANs in salt-odor association, but a different partially overlapping set seems to be necessary. This should be better discussed and integrated into the authors' conclusions. The EM data analysis reveals a non-trivial organization of sensory inputs into DANs and it is hard to extrapolate a link to the functional data presented in the paper.

      We would like to thank reviewer 1 for the positive evaluation of our work and for the critical suggestions for improvement. In the new version of the manuscript, we have centralized the optogenetic results and moved some of the ablation experiments to the Supplement. We also discuss in detail the experimental differences in the results. In addition, we have softened our interpretation of the specificity of memory for salt. As a result, we now emphasize more the general role of DANs for aversive learning in the larva. These changes are now also summarized and explained more simply and clearly in the Discussion, along with a revised discussion of the EM data.

      Reviewer #2 (Public Review):

      Summary:

      In this work, the authors show that dopaminergic neurons (DANs) from the DL1 cluster in Drosophila larvae are required for the formation of aversive memories. DL1 DANs complement pPAM cluster neurons which are required for the formation of attractive memories. This shows the compartmentalized network organization of how an insect learning center (the mushroom body) encodes memory by integrating olfactory stimuli with aversive or attractive teaching signals. Interestingly, the authors found that the 4 main dopaminergic DL1 neurons act redundantly, and that single-cell ablation did not result in aversive memory defects. However, ablation or silencing of a specific DL1 subset (DAN-f1,g1) resulted in reduced salt aversion learning, which was specific to salt but no other aversive teaching stimuli were tested. Importantly, activation of these DANs using an optogenetic approach was also sufficient to induce aversive learning in the presence of high salt. Together with the functional imaging of salt and fructose responses of the individual DANs and the implemented connectome analysis of sensory (and other) inputs to DL1/pPAM DANs, this represents a very comprehensive study linking the structural, functional, and behavioral role of DL1 DANs. This provides fundamental insight into the function of a simple yet efficiently organized learning center which displays highly conserved features of integrating teaching signals with other sensory cues via dopaminergic signaling.

      Strengths:

      This is a very careful, precise, and meticulous study identifying the main larval DANs involved in aversive learning using high salt as a teaching signal. This is highly interesting because it allows us to define the cellular substrates and pathways of aversive learning down to the single-cell level in a system without much redundancy. It therefore sets the basis to conduct even more sophisticated experiments and together with the neat connectome analysis opens the possibility of unraveling different sensory processing pathways within the DL1 cluster and integration with the higher-order circuit elements (Kenyon cells and MBONs). The authors' claims are well substantiated by the data and clearly discussed in the appropriate context. The authors also implement neat pathway analyses using the larval connectome data to its full advantage, thus providing network pathways that contribute towards explaining the obtained results.

      Weaknesses:

      While there is certainly room for further analysis in the future, the study is very complete as it stands. Suggestions for clarification are minor in nature.

      We would like to thank reviewer 2 for the positive evaluation of our work. In fact, follow-up work is already underway to further analyze the role of the individual DL1 DANs. We have addressed the constructive and detailed suggestions for improvement in our point-by-point responses in the “Recommendations for the authors” section.

      Reviewer #3 (Public Review):

      The study of Weber et al. provides a thorough investigation of the roles of four individual dopamine neurons for aversive associative learning in the Drosophila larva. They focus on the neurons of the DL-1 cluster which already have been shown to signal aversive teaching signals. However, the authors go far beyond the previous publications and test whether each of these dopamine neurons responds to salt or sugar, is necessary for learning about salt, bitter, or sugar, and is sufficient to induce a memory when optogenetically activated. In addition, previously published connectomic data is used to analyze the synaptic input to each of these dopamine neurons. The authors conclude that the aversive teaching signal induced by salt is distributed across the four DL-1 dopamine neurons, with two of them, DAN-f1 and DAN-g1, being particularly important. Overall, the experiments are well designed and performed, support the authors' conclusions, and deepen our understanding of the dopaminergic punishment system.

      Strengths:

      (1) This study provides, at least to my knowledge, the first in vivo imaging of larval dopamine neurons in response to tastants. Although the selection of tastants is limited, the results close an important gap in our understanding of the function of these neurons.

      (2) The authors performed a large number of experiments to probe for the necessity of each individual dopamine neuron, as well as combinations of neurons, for associative learning. This includes two different training regimens (1 or 3 trials), three different tastants (salt, quinine, and fructose) and two different effectors, one ablating the neuron, the other one acutely silencing it. This thorough work is highly commendable, and the results prove that it was worth it. The authors find that only one neuron, DAN-g1, is partially necessary for salt learning when acutely silenced, whereas a combination of two neurons, DAN-f1 and DAN-g1, are necessary for salt learning when either being ablated or silenced.

      (3) In addition, the authors probe whether any of the DL-1 neurons is sufficient for inducing an aversive memory. They found this to be the case for three of the neurons, largely confirming previous results obtained by a different learning paradigm, parameters, and effector.

      (4) This study also takes into account connectomic data to analyze the sensory input that each of the dopamine neurons receives. This analysis provides a welcome addition to previous studies and helps to gain a more complete understanding. The authors find large differences in inputs that each neuron receives, and little overlap in input that the dopamine neurons of the "aversive" DL-1 cluster and the "appetitive" pPAM cluster seem to receive.

      (5) Finally, the authors try to link all the gathered information in order to describe an updated working model of how aversive teaching signals are carried by dopamine neurons to the larva's memory center. This includes important comparisons both between two different aversive stimuli (salt and nociception) and between the larval and adult stages.

      Weaknesses:

      (1) The authors repeatedly claim that they found/proved salt-specific memories. I think this is problematic to some extent.

      (1a) With respect to the necessity of the DL-1 neurons for aversive memories, the authors' notion of salt-specificity relies on a significant reduction in salt memory after ablating DAN-f1 and g1, and the lack of such a reduction in quinine memory. However, Fig. 5K shows a quite suspicious trend of an impaired quinine memory which might have been significant with a higher sample size. I therefore think it is not fully clear yet whether DAN-f1 and DAN-g1 are really specifically necessary for salt learning, and the conclusions should be phrased carefully.

      (1b) With respect to the results of the optogenetic activation of DL-1 neurons, the authors conclude that specific salt memories were established because the aversive memories were observed in the presence of salt. However, this does not prove that the established memory is specific to salt - it could be an unspecific aversive memory that potentially could be observed in the presence of any other aversive stimuli. In the case of DAN-f1, the authors show that the neuron does not even get activated by salt, but is inhibited by sugar. Why should activation of such a neuron establish a specific salt memory? At the current state, the authors clearly showed that optogenetic activation of the neurons does induce aversive memories - the "content" of those memories, however, remains unknown.

      (2) In many figures (e.g. figures 4, 5, 6, supplementary figures S2, S3, S5), the same behavioural data of the effector control is plotted in several sub-figures. Were these experiments done in parallel? If not, the data should not be presented together with results not gathered in parallel. If yes, this should be clearly stated in the figure legends.

      We would also like to thank reviewer 3 for the positive assessment of our work. As already mentioned by reviewer 1, we understand the criticism that the salt specificity for which the individual DANs are coded is not always fully supported by the results of the work. We have therefore rewritten the relevant passages, which are also cited by the reviewer. We have also taken up the second point of criticism and incorporated it into our manuscript. As the control groups were always measured in parallel with the experimental animals, we can also present the data together in a sub-figure. We now state this clearly in the revised figure legends.

      Summary of recommendations to authors:

      Overall, the study is commendable for its systematic approach and solid methodology. Several weaknesses were identified, prompting the need for careful revisions of the manuscript:

      We thank the reviewers for the careful revision of our manuscript. In the subsequent sections, we aim to address their concerns as thoroughly as possible. A comprehensive one-to-one listing can be found below.

      (1) The authors should reconsider their assertion of uncovering a salt-specific memory, as the evidence does not conclusively demonstrate the exclusive necessity of DAN-f1 and DAN-g1 for salt learning. In particular, the optogenetic activation of DAN-f1 leads to plasticity but this might not be salt-specific. The precise nature of the memory content remains elusive, warranting a nuanced rephrasing of the conclusions.

      We only partially agree. It is true that optogenetic activation of DANs does not by itself allow us to comment on salt-specificity. However, we used high-salt concentrations during the test. Over the years, the Gerber lab has demonstrated in several papers that larvae recall an aversive odor-salt memory only if salt is present during the test (Gerber and Hendel, 2006; Niewalda et al 2008; Schleyer et al. 2011; Schleyer et al. 2015). The US used during training has to be present during the test. Even at the same concentration, other aversive stimuli (e.g. bitter quinine) do not allow the larvae to recall this particular type of memory. So, if the optogenetic activation of DAN-f1 establishes a memory that can be recalled on salt, we argue that it has to encode aspects of the salt information. On the other hand, only for DAN-g1 do we see the necessity for salt learning. And, although it is very unlikely based on the current literature, we cannot fully exclude that the activation of DAN-f1 establishes a yet unknown type of memory that can also be recalled on a salt plate. Therefore, we partially agree and have accordingly rephrased the entire manuscript to avoid an over-interpretation of our data. Throughout the manuscript we now avoid the term salt-specific memory and instead describe the type of memory as aversive memory.

      (2) A thorough examination or discussion about the potential influence of blue light aversion on behavioral observations is necessary to ensure a balanced interpretation of the findings.

      To address this point, every behavioral experiment that uses optogenetic blue light activation is run with the appropriate, mandatory controls. For blue light activation experiments, two genetic controls are used that either get the same blue light treatment (effector control, w1118>UAS-ChR2XXL) or no blue light treatment (dark control, XY-split-Gal4>UAS-ChR2XXL). For blue light inactivation experiments one group is added that has exactly the same genotype but did not receive food containing retinal. These experiments show that blue light exposure itself induces neither an aversive nor a positive memory, and that blue light exposure does not impair the establishment of odor-high salt memory. In addition, we used the latest established transgenes available. ChR2XXL is very sensitive to blue light. Only 220 lux (60 µW/cm²) were necessary to obtain stable results. In our hands, short-term exposure for up to 5 minutes at such low intensities does not induce blue light aversion. Following the advice of the reviewer, we also address this concern by adding several sentences to the related results and methods sections.

      (3) The authors should address the limitations associated with the use of rpr/hid for neuronal ablations, such as the effects of potential developmental compensation.

      We agree with this concern. It is quite possible that the ablation experiments induce compensatory effects during larval development. Such an effect may be the reason for differences in phenotypes when comparing hid,rpr ablation with optogenetic inhibition. This is now part of the discussion. In addition, we evaluated whether the ablation worked in our experiments. Until now, controls showing that the expression of hid,rpr really leads to the ablation of DANs were missing. We have now added these experiments and clearly show anatomically that the DANs are ablated (related to figure 4-figure supplement 6).

      (4) While the connectome analysis offers valuable insights into the observed functions of specific DANs in relation to their extrinsic (sensory) and intrinsic (state) inputs, integrating this data more cohesively within the manuscript through careful rewriting would enhance the coherence of the study.

      We understand this concern. The new version of our manuscript therefore integrates the EM data more thoroughly into our interpretation of the results. We have rewritten the related parts throughout the manuscript and have completely revised the corresponding section in the results chapter.

      (5) More generally, the authors are encouraged to discuss internal discrepancies in the results of their functional manipulation experiments.

      Thank you for this suggestion. We do of course understand that we have not given the different results enough space in the discussion. We have now changed this and have been happy to comprehensively address the concern. 

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Here are some suggestions for clarification and improvement of the manuscript:

      (1) The authors should discuss why the silencing experiment with TH-GAL4 (Fig. 1) does not abolish memory formation (I assume that the PI should go to zero). Does it mean that other non-TH neurons are involved in salt-odor memory formation? Are there other lines that completely abolish this type of learning?

      Thank you very much for highlighting this crucial point. Indeed, the functional intervention does not completely eliminate the memory. There could be several reasons, or a combination thereof, for this outcome. For instance, it's plausible that the UAS-GtACR2 effector doesn't entirely suppress the activity of dopaminergic neurons. Additionally, the memory may comprise different types, not all of which are linked to dopamine function. It's also noteworthy that TH-Gal4 doesn't encompass all dopaminergic neurons – even a neuron from the DL1 cluster is absent (as previously reported in Selcho et al., 2009). Considering we're utilizing high salt concentrations in this experiment, it's conceivable that non gustatory-driven memories are formed based solely on the systemic effects of salt (e.g., increased osmotic pressure). These possibilities are now acknowledged in the text.

      (2) The Rpr experiments in Fig. 4 do not lead to any phenotype and there is a general assumption that the system compensates during development. However, there is no demonstration that Rpr worked or that development compensated for that. What do we learn from these data? Would it make sense to move it to supplement to make the story more compact? In addition: the conclusion at L 236 "DL1.... Are not individually necessary" is later disproved by optogenetic silencing. Similarly, optogenetic silencing of f1+g1 is affecting 1X and 3X learning, but not when using Rpr. Moreover, Rpr did not give any phenotype in other data in the supplementary material. I'm not sure how valid these results are.

      We acknowledge this concern and have actively deliberated various options for restructuring the presented ablation data. Ultimately, we reached a consensus that relocating Figure 4 to the supplement is warranted. Furthermore, corresponding adjustments have been made in the text. This decision amplifies the significance of the optogenetic results. In addition, we also addressed the other part of the concern. We examined the efficacy of hid and rpr in our experiments. Indeed, we successfully ablated specific DANs, as illustrated in the new anatomical data presented in Figure 4-figure supplement 6, which strengthens the interpretation of the hid,rpr experiments.

      (3) In most figures that show data for 1X and 3X training, there is no difference between these two conditions (I would suggest moving one set as a supplement). When a difference appears (Fig.5A-D) the implications are not discussed properly. Is it known that some circuits are necessary for the 1X but not for the 3X protocol? Is that a reasonable finding? I would expect the opposite, but I might lack knowledge here. However, the optogenetic silencing of the same neurons in Figure 7 shows the same phenotype for 1X and 3X. Again, the validity of the Rpr experiments seems debatable.

      Different training protocols lead to different memory phases (STM and STM+ARM). We have shown that in the past in Widmann et al. 2016. Therefore, we are convinced that it makes sense to keep both data sets in the main manuscript. However, we agree that this was not properly introduced and discussed and therefore made the respective changes in the manuscript.

      (4) In Figure 3, it is unclear what the responses were tested against. Since they are so small and noisy there would be a need for a control. Moreover, in some cases, it looks like the DF/F is normalized to the wrong value: e.g. in DAN-c1 100mM, the activity in 0-10s is always above zero, and in pPAM with fructose is always below zero. This might not have any consequence on the results but should be adjusted.

      Thank you very much for your criticism, which we greatly appreciate. We have carefully re-examined the data and found that there was a mistake in the normalization of the values. We made the necessary adjustments to the evaluation, as per your suggestions. The updated figures, figure legends, and results have been incorporated into the new version of the manuscript. As noted by the reviewer, these corrections have not altered the interpretation of the data or the primary responses of the various DANs.

      (5) In the abstract: "Optogenetic activation of DAN-f1 and DAN-g1 alone suffices to substitute for salt punishment... Each DAN encodes a different aspect of salt punishment". These sentences might be misleading and an overstatement: only DAN-g1 shows a clear role, while the function of the other DANs in the context of salt-odor learning remains obscure.

      We have refined the respective part of the abstract accordingly. Consequently, we have reworded the related section, aiming to avoid any exaggeration.

      (6) The physiology is done in L1 larvae but behavior is tested in L3 larvae. There could be a change in this time that could explain the salt responses in c1 and d1 but no role in salt-odor learning?

      While we cannot dismiss the possibility of a developmental change from L1 to L3, a comparison of the anatomical data of the DL1 DANs from electron microscopy (EM) and light microscopy (LM) indicates that their overall morphology remains consistent. However, it's important to note that this observation does not address the physiological aspects of these cells. Consequently, we have incorporated this concern into the discussion of the revised version of the manuscript.

      (7) The introduction needs some editing starting at L 129, as it ends with a discussion of a previously published EM data analysis. I would rather suggest stating which questions are addressed in this paper and which methods will be used and perhaps a hint on the results obtained.

      We understand the concern. We have added a concise paragraph to the conclusion of the introduction, highlighting the biological question, technical details, and a short hint on the acquired findings.

      (8) It is clear to me that the presentation of salt during the test is necessary for recall, however in L 166 I don't understand the explanation: how is the memory used in a beneficial way in the test? The salt is present everywhere and the odor cue is actually useless to escape it.

      Extensive research, exemplified by studies such as Schleyer et al. (2015) published in eLife, clearly demonstrates that the recall of odor-high salt memory occurs exclusively when tested on a high salt plate. Even when tested on a bitter quinine plate, the aversive memory is not recalled. This phenomenon is attributed to the triggering of motivation to recall the memory by the omnipresent abundance of the unconditioned stimulus (US) during the test, which in our case is high salt. Furthermore, the concentration of the stimulus plays a crucial role (Schleyer et al. 2011). The odor cue indicates where the situation could potentially be improved; however, if high salt is absent, this motivational drive diminishes as there is no memory present to enhance the already favorable situation. Additionally, the motivation to evade the omnipresent and unpleasant high salt stimulus persists throughout the entire 5-minute test period.

      (9) L288: the fact that f1 shows a phenotype in this experiment does not mean that it encodes a salt signal, indeed it does not respond to salt. It perhaps induces a plasticity that can be recalled by salt, but not necessarily linked to salt. The synergy between f1 and g1 in the salt assay was postulated based on exp with Rpr, but the validity of these experiments is dubious. I'm not sure there is sufficient evidence from Figures 6 and 7 to support a synergistic action between f1 and g1.

      It is true that DAN-f1 alone is not necessary for mediating a high salt teaching signal based on ablation, optogenetic inhibition and even physiology. However, optogenetic activation alone shows a memory tested on a salt plate. Given the logic explained above that is accepted by several publications, we would like to keep the statement, especially as joint manipulation with DAN-g1 gives rise to significantly higher or lower values after joint optogenetic activation or inactivation (Figure 5E and F, Figure 6E and F in the new version). Nevertheless, we have modified the sentence. In the text we describe these effects now as “these results may suggest that DAN-f1 and DAN-g1 encode aspects of the natural aversive high salt teaching signal under the conditions that we tested”. We think that this is an appropriate and three-fold restricted statement. Therefore, we would like to keep it in this restricted version. However, we are happy to reconsider this if the reviewer thinks it is critical.

      (10) I find the EM analysis hard to read. First of all, because of the two different graphical representations used in Fig. 8, wouldn't one be sufficient to make the point? Secondly, I could not grasp a take-home-message: what do we learn from the EM data? Do they explain any of the results? It seems to me that they don't provide an explanation of why some DL1 neurons respond to salt and others don't.

      We understand that the EM analysis is hard to read and have now carefully rewritten this part of the manuscript. See also general concern 4 above. The main take-home message is not to explain why some DL1 neurons respond to salt and others do not. This cannot be resolved due to the missing information on the salt-perceiving receptor cells. Unfortunately, the EM volume lacks the peripheral nervous system, the first layer of salt information processing. However, our analysis shows clearly that the 4 DANs have their own identity based on their connectivity. None of them is the same, but to a certain extent similarities exist. This nicely reflects the physiological and behavioral results. We have now clarified that in the results to ease the understanding for the readership. In addition, we also clearly state that we don’t address the point why some DL1 neurons respond to salt and why others don’t respond.

      (11) Do the manipulations (activation and silencing) affect odor preference in the presence of salt? Did the authors test that the two odors do not drive different behaviors on the salty plate? Or did they only test the odor preference on plain agarose? Can we exclude a role for the DAN in driving multisensory-driven innate behavior?

      Innate odor preferences are not changed by the presence of salt or even other tastants (this work, but see also Schleyer et al 2015, Figure 3, eLife). Even the naïve choice between two odors is the same if tested in the presence of different tastants (Schleyer et al 2015, Figure 3, eLife). This shows, at least for the tested stimuli and conditions, which are similar to the ones that we use, that there is no multisensory-driven innate odor-taste behavior. Therefore, at least to our knowledge, experiments such as the ones suggested by the reviewer were never done in larval odor-taste learning studies, and we suggest that DAN activation has no effect on innate larval behavior. However, we are happy to reconsider this if the reviewer thinks it is critical.

      (12) L 280: the authors generalize the conclusion to all DL1-DANs, but it does not apply to c1 and d1.

      Thanks for this comment. We deleted that sentence as suggested and thus no longer generalize the conclusion to all DL1-DANs.

      (13) L345: I do not see the described differences in Fig. 8F, presynaptic sites of both types seem to appear in rather broad regions: could the author try to clarify this?

      We understand that the anatomical description of the data is often hard to read, especially for readers who are not used to these kinds of figures. We have therefore modified the text to ease the understanding and clarify the difference in the labeled brain regions for the broad readership.

      (14) L373: the conclusion on c1 is unsupported by data: this neuron responds to both salt and fructose (Figure 3 ) while the conclusion is purely based on EM data analysis.

      The sentence is not a conclusion but a speculation and we also list the cell's response to positive and negative gustatory stimuli. Therefore, we do not understand exactly what the reviewer means here. However, we have tried to address the criticism and have revised the sentences.

      (15) L385: the data on d1 seem to be inconsistent with Eschbach 2020, but the authors do not discuss if this is due to the differential vs absolute training, or perhaps the presence of the US during the test (which does not seem to be there in Eschbach, 2020) - is the training protocol really responsible for this inconsistency? For f1 the data seem to be consistent across these studies. The authors should clarify how the exp in Fig 6 differs from Eschbach, 2020 and how one could interpret the differences.

      True. This concern is correct. We now discuss the difference in more detail. Eschbach et al. used CsChrimson as a genetic tool, a one-odor paradigm with 3 training cycles, and no gustatory cues in their approach. These differences are now discussed in the new version of the manuscript.

      (16) L460-475 A long part of this paragraph discusses the similarities between c1 and d1 and corresponding PPL1 neurons in the adult fly. However, c1 and d1 do not really show any phenotype in this paper, I'm not sure what we learn from this discussion and how much this paper can contribute to it. I would have wished for a discussion of how one could possibly reconcile the observed inconsistencies.

      Based on the comments of the different reviewers several paragraphs in the discussion were modified. We agree that the part on the larval-adult comparison is quite long. Thus we have shortened it as suggested by the reviewer.

      Minor corrections:

      L28 "resultant association" maybe resulting instead.

      L55 "animals derive benefit": remove derive.

      L78 "composing 12,000 neurons": composed of.

      L79 what is stable in a "stable behavioral assay"?

      L104: 2 times cluste.

      L122: "DL1 DANs are involved" in what?

      Fig. 1 please check subpanels labels, D repeats.

      L 362: "But how do individual neurons contribute to the teaching signal of the complete cluster?" I don't understand the question.

      L364 I did not hear before about the "labeled line hypothesis" in this context - could the author clarify?

      L368: edit "combinatorically".

      L390: "current suppression" maybe acute suppression.

      L 400 I'm not sure what is meant by "judicious functional configuration" and "redundancy". The functions of these cells are not redundant, and no straightforward prediction of their function can be done from their physiological response to salt.

      Thanks a lot for your detailed review of our manuscript. We welcome your well-taken concerns and have made the requested changes for all points that you have raised.

      Reviewer #2 (Recommendations For The Authors):

      (1) In Figure 1 the reconstruction of pPAM and DL1 DANs shows the compartmentalized innervation of the larval MB. However, the images are a bit low in color contrast to appreciate the innervation well. In particular in panel B, it is hard to identify the innervated MB body structure. A schematic model of the larval MB and DAN innervation domains like in Fig. 2A would help to clarify the innervation pattern to the non-specialist.

      We understand this concern and have changed figure 1 as suggested by the reviewer. A schematic model of the MB and DANs is now presented already in figure 1 as well as the according supplemental figure.

      (2) Blue light itself can be aversive for larvae and thus interfere with the aversive learning paradigm. Does the given Illuminance (220 lux) used in these experiments affect the behavior and learning outcome?

      Yes, in the past high intensities of blue light were necessary to trigger the first-generation optogenetic tools. The high-intensity blue light itself was able to establish an aversive memory (e.g. Rohwedder et al. 2016). Usage of the second-generation optogenetic tools allowed us to strongly reduce the applied light intensity. Now we use 220 lux (equal to 60 µW/cm<sup>2</sup>). Please note that all Gal4 and UAS controls in the manuscript are not significantly different from zero. The mild blue light stimulation therefore does not serve as a teaching signal and has neither an aversive nor an appetitive effect. Furthermore, we use this mild light intensity for several other behavioral paradigms (locomotion, feeding, naïve preferences) and have never seen an effect on the behavior.

      (3) Fig.2: Except for MB054B-Gal4 only the MB expression pattern is shown for other lines. Is there any additional expression in other cells of the brain? In the legend in line 761, the reporter does not show endogenous expression, rather it is a fluorescent reporter signal labeling the mushroom body.

      The lines were initially identified by a screen on larval MB neurons done together with Jim Truman, Marta Zlatic and Bertram Gerber. Here full brain scans were always analyzed. These images can be seen in Eschbach et al. 2020, extended figure 1. Neither in their evaluation nor in our anatomical evaluation (using a different protocol) was additional expression in brain cells detectable. We also modified the figure legend as suggested.

      (4) Fig.3: Precise n numbers per experiment should be stated in the figure legend.

      True, we now present n numbers per experiment whenever necessary.

      (5) Fig.4: Have the authors confirmed complete ablation of the targeted neuron using rpr/hid? Ablations can be highly incomplete depending on the onset and strength of Gal4 expression, leaving some functionality intact. While the ablation experiments are largely in line with the acute silencing of single DANs during high salt learning performed later on (Fig.7), there is potentially an interesting aspect of developmental compensation hidden in this data. Not a major point, but potentially interesting to check.

      We agree with this criticism. We had not tested whether the expression of hid,rpr in DL1 DANs really ablates them. Therefore we did an additional experiment to show that. The new data is now presented as a supplemental figure (Figure 4-figure supplement 6). The result shows that expression of hid,rpr also ablates DL1 DANs, similar to earlier experiments where we used the same effectors to ablate serotonergic neurons (Huser et al., 2012, figure 5).

      (6) The performance index in Fig. 4 and 5 sometimes seems lower and the variability is higher than in some of the other experiments shown. Is this due to the high intrinsic variability of these particular experiments, or the background effects of the rpr/hid or splitGal4 lines?

      The general variability of these experiments is within the expected and known range. In these kinds of experiments there is always some variation due to several external factors (e.g. experimental time over the year). Therefore it is always important to measure controls and experimental animals at the same time. This is what we did, and we only directly compare results within individual datasets, not between different datasets. Such comparisons are further hampered given that the experiments of Figure 4 (now Figure 4-figure supplement 1) and Figure 5 (now Figure 4) differ in several parameters from other learning experiments presented later in the text. Optogenetic activation uses blue light stimulation instead of “real world” high salt. Most often direct activation of specific DANs in the brain is more stable than the external high salt stimulation. Optogenetic inactivation also uses blue light stimulation as well as retinal-supplemented food. Both factors can affect the measurement. We thus argue that, for each experiment, it is most often the particular parameters that affect the variability of the results rather than background effects of the rpr/hid and split-Gal4 lines.

      (7) Fig.7: This is a neat experiment showing the effects of acute silencing of individual DL1 DANs. As silencing DAN-f1/g1 does not result in complete suppression of aversive learning, it would be highly interesting to test (or speculate about) additive or modulatory effects by the other DANs. Dan-c-1/d-1 also responds to high salt but does not show function on its own in these assays. I am aware that this is currently genetically not feasible. It would however be a nice future experiment.

      True, we were intensively screening for DL1 cluster specific driver lines that cover all 4 DL1 neurons or other combinations than the ones we tested. Unfortunately, we did not succeed in identifying them. Nevertheless, we will further screen new genetic resources (e.g. Meissner et al., 2024, bioRxiv) to expand our approach in future experiments. Please also see our comment on concern 1 of reviewer 1 for further technical limitations and biological questions that can also potentially explain the absence of complete suppression of high salt learning and memory. Some of these limitations are now also mentioned and discussed in the new version of the manuscript.

      (8) The discussion is excellent. I would just add that larval DAN-c1, which has high interconnectivity within the larval CNS, is likely integrating state-dependent network changes, similar to the role of some DANs in innate and state-dependent preference behavior. This might contribute to modulating learned behavior depending on the present (acute) and previous environmental conditions.

      Thanks a lot for bringing this up. We rewrote this part and added a discussion on recent work on DAN-c1 function in larvae as well as results on DAN function in innate and state-dependent preference behavior.

      (9) Citation in line 1115 missing access information: "Schnitzer M, Huang C, Luo J, Je Woo S, Roitman L, et al. 2023. Dopamine signals integrate innate and learned valences to regulate memory dynamics. Research Square".

      Unfortunately this escaped our notice. The paper is now published in Nature: Huang, C., Luo, J., Woo, S.J. et al. Dopamine-mediated interactions between short- and long-term memory dynamics. Nature 634, 1141–1149 (2024). https://doi.org/10.1038/s41586-024-07819-w. We have now changed the citation. The new citation includes the missing access information.

      Reviewer #3 (Recommendations For The Authors):

      Regarding my issue about salt specificity in the public review, I want to make clear that I do not suggest additional experiments, but to be very careful in phrasing the conclusions, in particular whenever referring to the experiments with optogenetic activation. This includes presenting these experiments as "(salt) substitution" experiments - inferring that the optogenetic activation would substitute for a natural salt punishment. As important and interesting as the experiments are, they simply do not allow such an interpretation at this point.

      Results, line 140ff: When presenting the results regarding TH-Gal4 crossed to ChR2-XXL, please cite Schroll et al. 2006 who demonstrated the same results for the first time.

      Thanks for mentioning this. We now cite Schroll et al. 2006 here in the text of the manuscript.

      Figure 3: The subfigure labels (ABC) are missing.

      Unfortunately this escaped our notice. Thanks a lot – we have now corrected this mistake.

      Figure 5: For I and L, it reads "salt replaced with fru", but the sketch on the left shows salt in the test. I assume that fructose was not actually present in the test, and therefore the figure can be misleading. I suggest separate sketches. Also, I and L are not mentioned in the figure legend.

      True, this is rather confusing. Based on this well-taken concern we have changed the figure by adding a new and correct scheme for sugar reward learning that does not show fructose during the test.

      Figure S1: The experimental sketches for E,F and G,H seem to be mixed up.

      We thank the reviewer for bringing this up. In the new version we corrected this mistake.

      Figure S5: There are three sub-figures labelled with B. Please correct.

      Again, thanks a lot. We made the suggested correction in Figure S5.

      Discussion, line 353ff: this and the following sentences can be read as if the authors have discovered the DL-1 neurons as aversive teaching mediators in this study. However, Eschbach et al. 2020 already demonstrated very similar results regarding the optogenetic activation of single DL-1 DANs. I suggest to rephrase and cite Eschbach et al. 2020 at this point.

      That is correct. Our focus was on the gustatory pathway. The original discovery was made by Eschbach et al. We have now corrected this in the discussion and clarified our contribution. It was never our intention to hide this work, as the laboratory was also involved. Nevertheless, this is an annoying omission on our side.

      Line 385-387: this sentence is only correct with respect to Eschbach et al. 2020. Weiglein et al. 2021 used ChR2-XXL as an effector, but another training regimen.

      We understand this criticism. Therefore, we changed the sentence as suggested by the reviewer. See also our response on concern 15 of reviewer 1.

      Line 389ff: I do not understand this sentence. What is meant by persistent and current suppression of activity? If this refers to the behavioural experiments, it is misleading as in the hid, reaper experiments neurons are ablated and not suppressed in activity.

      We made the requested changes in the text. It is true that the ablation of a neuron throughout larval life is different from constantly blocking the output of a persisting neuron.

      Methods, line 615 ff: the performance index is said to be calculated as the difference between the two preferences, but the equation shows the average of the preferences.

      Thanks a lot. We are sorry for the confusion. We have carefully rewritten this part of the methods section to avoid any misunderstanding.

      When discussing the organization of the DL1 cluster, on several occasions I have the impression the authors use the terms "redundant" and "combinatorial" synonymously. I suggest to be more careful here. Redundancy implies that each DAN in principle can "do the job", whereas combinatorial coding implies that only a combination of DANs together can "do the job". If "the job" is establishing an aversive salt memory, the authors' results point to redundancy: no experimental manipulation totally abolished salt learning, implying that the non-manipulated neurons in each experiment sufficed to establish a memory; and several DANs, when individually activated, can establish an aversive memory, implying that each of them indeed can "do the job".

      Based on this concern we have rewritten the discussion as suggested to be more precise when talking about redundancy or combinatorial coding of the aversive teaching signal. Basically, we have removed all the combinatorial terms and replaced them by the term “redundancy”.

      The authors mix parametric and non-parametric statistical tests across the experiments dependent on whether the distribution of the data is normal or not. It would help readers if the authors would clearly state for which data which tests were used.

      We understand the criticism and now have added an additional supplemental file that includes all the information on the statistical tests applied and the distribution of the data.

    1. eLife Assessment

      This work presents a valuable approach based on a complex systems theoretical framework to characterize diet-host-microbe interactions and develop targeted bacteriotherapies through a three-phase workflow. Despite the partial support of the description and experimental setup of the 'complex systems theoretical approach,' the collected data are solid and advance our understanding of oxalate bacterial metabolism in microbial communities. This study will interest researchers working on gut microbiomes and the possible modulation of host-microbial interactions.

    2. Reviewer #2 (Public review):

      Summary:

      Using the well-studied oxalate-microbiome-host system, the authors propose a novel conceptual and experimental framework for developing targeted bacteriotherapies using a three-phase pre-clinical workflow. The third phase is based on a 'complex system theoretical approach' in which multi-omics technologies are combined in independent in vivo and in vitro models to successfully identify the most pertinent variables that influence specific phenotypes in diet-host-microbe systems. The innovation relies on the third phase since phase I and phase II are the dominant approaches everyone in the microbiome field uses.

      Strengths:

      The authors used a multidisciplinary approach which included 1] fecal transplant of two distinct microbial communities into Swiss-Webster mice (SWM) to characterize the host response (hepatic response-transcriptomics) and microbial activity (untargeted metabolomics of the stool samples) to different oxalate concentrations; 2] longitudinal analysis of the N. albigula gut microbiome composition in response to varying concentrations of oxalate by shotgun metagenomics, with deep bioinformatic analyses of the genomes assembled; and 3] development of synthetic microbial communities around oxalate metabolism and evaluation of these communities' activity in oxalate degradation in vivo.

      Weaknesses:

      This study presents a valuable finding on the oxalate-microbiome-host system using a multitude of approaches. Although the multidisciplinary approach allows for a unique perspective on the system and more robust conclusions, it is challenging for any authors to present all the data clearly and systematically in a conclusive way, especially when introducing unfamiliar concepts such as a complex systems theoretical approach.

    3. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public review):

      Summary:

      This study experimentally examined diet-microbe-host interactions through a complex systems framework, centered on dietary oxalate. Multiple, independent molecular, animal, and in vitro experimental models were introduced into this research. The authors found that microbiome composition influenced multiple oxalate-microbe-host interfaces. Oxalobacter formigenes was only effective against a poor oxalate-degrading microbiota background, a finding that gives critical new insights into why clinical intervention trials with this species exhibit variable outcomes. Data suggest that, while heterogeneity in the microbiome impacts multiple diet-host-microbe interfaces, metabolic redundancy among diverse microorganisms in specific diet-microbe axes is a critical variable that may impact the efficacy of bacteriotherapies, which can help guide patient and probiotic selection criteria in probiotic clinical trials.

      Thank you. The main message of this research is that, through complex modelling, we believe we have identified the critical variable (metabolic redundancy) that is responsible for the efficacy of probiotics designed to reduce oxalate levels, thus allowing for improved patient selection in clinical trials. We also believe that this process and the critical features identified can be translated to other critical microbial functions such as short chain fatty acid synthesis, secondary bile acid synthesis, and others.

      Strengths:

      The paper has made significant progress in both the depth and breadth of scientific research by systematically comparing multiple experimental methods across multiple dimensions. Particularly through in-depth analysis from the enzymatic perspective, it has not only successfully identified several key strains and redundant genes, which is of great significance for understanding the functions of enzymes, the characteristics of strains, and the mechanisms of genes in microbial communities, but also provided a valuable reference for subsequent experimental design and theoretical research.

      More importantly, the establishment of a novel research approach to probiotics and gut microbiota in this paper represents a major contribution to the current research field. The proposal of this new approach not only breaks through the limitations of traditional research but also offers new perspectives and strategies for the screening, optimization of probiotics, and the regulation of gut microbiota balance. This holds potential significant value for improving human health and the prevention and treatment of related diseases.

      Thank you for the comments. We believe that the approach taken here, which contrasts with conventional reductionist techniques, will be critical for translating gut microbiome research into actionable therapeutic approaches.

      Weaknesses:

      While the study has excellently examined the overall changes in microbial community structure and the functions of individual bacteria, it lacks a focused investigation on the metabolic cross-feeding relationships between oxalate-degrading bacteria and related microorganisms, failing to provide a foundational microbial community or model for future research. Although this paper conducts a detailed study on oxalate metabolism, it would be beneficial to visually present the enrichment of different microbial community structures in metabolic pathways using graphical models.

      Thank you for this critique. In the current study, we broadly examined the response of the gut microbiota to dietary oxalate. Based on initial shotgun metagenomic results, we focused on specific taxa and metabolic functions. Through metagenomic and multiple culture-based studies, we quickly homed in on redundancy in oxalate-degrading function as a key feature for oxalate homeostasis. We believe that the defined microbial community we used for microbial transplants (particularly the taxonomic cohort) provides a strong, minimal community to explore oxalate homeostasis further. In fact, we are using this consortium in multiple follow-up studies to fully understand the cross-feeding that may occur among these microorganisms, as you suggest. We note that Figure 3 shows the change of species and metabolic pathways with oxalate exposure.

      Furthermore, the authors have done a commendable job in studying the roles of key bacteria. If the interactions and effects of upstream and downstream metabolically related bacteria could be integrated, it would provide readers with even more meaningful information. By illustrating how these bacteria interact within the metabolic network, readers can gain a deeper understanding of the complex ecological and functional relationships within microbial communities. Such an integrated approach would not only enhance the scientific value of the study but also facilitate future research in this area.

      Thank you. We note that, based on the collective data obtained in this study, redundancy in oxalate degradation is the critical feature that maintains oxalate homeostasis. However, we are interested in potential metabolic interactions between microbes in our defined community and are currently investigating these interactions.

      Reviewer #2 (Public review):

      Summary:

      Using the well-studied oxalate-microbiome-host system, the authors propose a novel conceptual and experimental framework for developing targeted bacteriotherapies using a three-phase pre-clinical workflow. The third phase is based on a 'complex system theoretical approach' in which multi-omics technologies are combined in independent in vivo and in vitro models to successfully identify the most pertinent variables that influence specific phenotypes in diet-host-microbe systems. The innovation relies on the third phase since phase I and phase II are the dominant approaches everyone in the microbiome field uses.

      Thank you. As you note, the proposed phases I and II are the predominant approaches used. In fact, many clinical trials have been conducted to try to reduce urine oxalate in patients, based solely on mechanistic studies with Oxalobacter formigenes. As noted in our manuscript, only 43% of those studies resulted in the intended outcome, necessitating the approach we took in the current study. Our results suggest that the reason for the high rate of failure, despite well-established mechanisms, is insufficient patient selection that focused only on the presence or absence of O. formigenes, a species that normally exhibits very low prevalence and abundance in the human gut microbiota.

      Strengths:

      The authors used a multidisciplinary approach which included:

      (1) fecal transplant of two distinct microbial communities into Swiss-Webster mice (SWM) to characterize the host response (hepatic response-transcriptomics) and microbial activity (untargeted metabolomics of the stool samples) to different oxalate concentrations;

      (2) longitudinal analysis of the N. albigula gut microbiome composition in response to varying concentrations of oxalate by shotgun metagenomics, with deep bioinformatic analyses of the genomes assembled; and

      (3) development of synthetic microbial communities around oxalate metabolism and evaluation of these communities' activity in oxalate degradation in vivo.

      Thank you for these comments.  In the complex modelling approach, we focused on complete microbiota from host species known to have high and low capacities for oxalate tolerance, combined with targeting specific metabolic functions vs. specific taxa that may include unknown functions important for oxalate metabolism.  Further, we examined the influence of our target communities on oxalate metabolism through multiple in vitro and in vivo studies.

      Weaknesses:

      However, I have concerns about the frame the authors tried to provide for a 'complex system theoretical approach' and how the data are interpreted within this frame. Several of the conclusions the authors provide do not seem to have sufficient data to support them.

      Thank you.  We have tried to address these concerns by adding an exhaustive figure that broadly represents our complex modelling approach that includes potential complex system-based hypotheses, how they were tested, and the host-microbiome-oxalate interactions found in our study.

      Recommendations for the authors:  

      Reviewer #2 (Recommendations for the authors):

      Major Concerns

      (1) The authors argue about the importance of bringing 'Complex System Theory' to the microbiome field systematically and consistently. However, the authors fail to introduce this theory throughout the entire manuscript. For example, the authors tried to describe key elements and their nomenclature, such as nodes and fractal layers, in the first part of the result section. But the description is wordy and not precise. It would be more useful if the authors connected the model description with a visual representation, such as a figure. Unfortunately, these elements are not emphasized or carried across the results section and are not mentioned in the discussion section.

      We have now added a figure (Figure 7) that details this process extensively and ties each of our findings to the complex system model and nomenclature.  We have also reiterated how our results fit in the complex system model in the discussion.

      In addition, there is no straightforward approach to integrating multi-omics datasets to identify the variables that are determinants of the system. For example, Figure 1 focuses on the host hepatic response to oxalate exposure after fecal transplants into Swiss Webster mice; Figure 2 focuses on the effects of oxalate exposure on stool metabolic activity, not only microbial metabolic activity, after fecal transplants into Swiss Webster mice; and Figure 3 focuses on microbiome responses to different oxalate concentrations in Neotoma albigula. There is no "model" to really integrate the host, the microbiome activity, and the microbiome composition information. And, unfortunately, the data generated between experiments cannot be directly integrated; see major concern # 2.

      Thank you. We have clarified the experimental approach and how it applied to understanding the critical factors that maintain oxalate homeostasis. Specifically, Figure 1 established that the effect of oxalate on the host was dependent on the microbiota, rather than host genetics. Figure 2 established that the effect of oxalate on the gut microbiota was again dependent on the whole gut microbiota and that these oxalate-microbe effects also influenced oxalate-host effects through a direct multi-omic data integration. Once we established that the oxalate effects on host and microbiota were dependent on the whole microbiota composition, Figure 3 then sought to determine how oxalate impacted the gut microbiota, using our model of high oxalate tolerance (N. albigula). With the finding in Figure 3 that there were multiple genes attributed to the degradation of oxalate, or acetogenic, methanogenic, and sulfate-reducing pathways, Figure 4 and relevant supplemental figures sought to quantify the redundancy of these pathways. After establishing a very high degree of redundancy, we sought to use a culturomic approach to determine what environmental factors impacted oxalate metabolism and to evaluate oxalate metabolism using our defined, hypothesized communities of microorganisms. Finally, Figure 6 sought to validate our metagenomic, metabolomic, and culturomic results from multiple animal and in vitro models using targeted microbial transplants in mice. While we did have some direct multi-omic data integration (Figures 2 and 3), the process employed here sought to systematically determine which factors were most important for the oxalate-microbiota-host relationship, and then to use those results to design the subsequent experiments. We have added this description to the discussion, which helps to contextualize the complex system modelling approach we took here.

      Finally, the authors did not provide a novel variable that successfully influences oxalate degradation in the oxalate-microbiome-host system. The authors argue that "both resource availability and community composition impact oxalate metabolism," which we currently infer from the failure of the clinical trials, and do not provide a clear intervention strategy to develop functional bacteriotherapy. The identification of composition as an important variable that was predictable without any multi-omics approach was highlighted by the development of synthetic microbial communities. Synthetic microbial communities are critical to characterizing complex microbiomes. Still, the authors did not explain how this strategy can be used in their theoretical framework (that is their goal), and these communities are not well introduced across the manuscript; see major concern # 4.

      As stated, it is clear from the failed clinical trials that we do not fully understand what microbial features dictate oxalate homeostasis.  We have specifically identified, through fecal transplant studies, that microbial composition is critical for oxalate homeostasis and that diverse oxalate-degrading bacteria exist.  However, ours is the first study that explicitly shows that it is this diversity that controls oxalate homeostasis.  This is specifically ascertained through the targeted microbial transplants in mice whereby O. formigenes was given alone or with different combinations of other microorganisms.  In other words, we were able to replicate both successful and failed studies by manipulating which specific species were introduced into animals.  This is unprecedented in the literature.

      (2) The authors provide several conclusions that are not completely supported by the data available. For example:

      (a) Lines 236-239: "Within the framework of complex systems, results show microbe-host cooperation whereby oxalate effectively processed within the SW-NALB gut microbiota reduced overall liver activity, indicative of a beneficial impact." - The authors did not provide data related to oxalate levels or oxalate processing for this dataset.

      While we did not specifically quantify oxalate degradation for this specific study, as cited in the text when describing this Swiss-Webster, Neotoma albigula system, we have previously published multiple animal studies explicitly showing that the N. albigula animals were highly effective oxalate degraders, which is transferable to Swiss-Webster mice through fecal transplants. Since the gut microbiota’s impact on oxalate has been well established through experiments by our group, the purpose of these specific experiments was to look the other way and examine the effect of oxalate on the gut microbiota of these two animal models. In the referenced text, we again cited our studies showing that the SW-NALB system effectively degrades oxalate.

      (b) Lines 239-243: "Data also suggest that both the gut microbiota and the immune system are involved in oxalate remediation (redundancy), such that if oxalate cannot be neutralized in the gut microbiota or liver, then the molecule will be processed through host immune response mechanisms (fractality), in this case indicated through an overall increase in hepatic activity and specifically in mitochondrial activity." - The authors did not provide any evidence related to the immune system and oxalate metabolism.

      We corrected that statement as follows: “…in this case indicated through an overall increase in inflammatory cytokines with oxalate exposure combined with an ineffective oxalate-degrading microbiota (Figures S6a,b; S9a,b).” In other words, if the liver and gut microbiota can’t eliminate a toxin, then the immune system must deal with it through inflammatory pathways. Oxalate is a well-established pro-inflammatory compound. Our data show that this is dependent on the gut microbiota.

      (c) Lines 250-252: "Following the diet trial, colon stool was collected post-necropsy and processed for untargeted metabolomics, which is a measure of total microbial metabolic output." - Although most metabolites in stool samples are indeed microbial, there are also host metabolites. So, it is not technically correct to relate the metabolomic analysis of stool samples to only microbial metabolic analysis. In addition, the authors discussed compounds such as alkaloids and cholesterol as microbial metabolites, although these compounds are more related to the diet and host, respectively.

      We have corrected this to state: “total metabolites present in stool from the diet, microbial activity, and host activity”

      (d) Lines 270-273. "Specifically, the SW-NALB mice exhibit hallmarks of homeostatic feedback with oxalate exposure to maintain a consistent metabolic output, defined by the relatively small, net negative, microbial metabolite-hepatic gene network compared to the large, net positive, network of SW-SW mice." - How do the authors define oxalate homeostasis? In addition, do the authors imply feedback between the liver and the microbiome in which the microbiome responds to a liver response related to oxalate levels? Or could the observation in Figure 1 be explained just by microbial consumption of oxalate that would reduce the impact of oxalate that arrives at the liver?

      Oxalate homeostasis is defined in that sentence: “relatively small, net negative, microbial metabolite-hepatic gene network compared to the large, net positive, network of SW-SW mice” – in other words, for SW-NALB mice, oxalate did not produce a considerable change to either microbial or hepatic metabolic activity. We did not really test the liver impact on gut microbiota and can’t speak to that. We believe, based on Figure 2 data, that it is not just the degradation of oxalate that explains the lack of change in hepatic activity in SW-NALB mice, but rather that the oxalate-induced shift in the gut microbiota metabolic activity broadly altered hepatic activity, as inferred from Figure 2c. We made this more clear in the results: “suggests that the oxalate-induced change in microbial metabolism is responsible for the change in hepatic activity”.

      (e) Lines 297-301: "The oxalate-dependent metagenomic divergence of the NALB gut microbiota (Figure 3), combined with the lack of change in the microbial metabolomic profile with oxalate exposure (Figure 2), suggest that oxalate stimulates taxonomically diverse, but metabolically redundant microorganisms, in support of maintaining homeostasis." - The authors cannot conclude anything related between taxonomic changes and microbial activity since the taxonomic data presented is for microbial enrichment in N. albigula, and the "microbial activity data" is from the fecal transplantation experiment in SWM. These are two completely different systems with two completely different experimental designs.

      We have shown very similar results in multiple previous studies, in that oxalate induces taxonomic divergence in the NALB gut microbiota. The experiment in which a minimal, positive increase in microbial metabolites was seen with oxalate was based on the SW-NALB model, whereby Swiss-Webster mice have an NALB microbiota. We show throughout the manuscript that the impact of oxalate is highly microbiota dependent, which supports our claim. However, the claim is hypothesis generating: that metabolic redundancy is important for oxalate homeostasis. We modified our statement to make all of this more clear.

      Related to microbial composition, the authors did not show data validating the efficiency of the fecal transplantations (allograft or xenograft) in the SWM after antibiotic treatment. They also did not show evidence of microbial composition dynamics in response to oxalate exposure.

      Again, the efficacy of fecal transplants, used in the way they were here, has been shown in multiple past studies of our group.  In past studies, we have extensively characterized the microbiota from fecal transplants and which taxa were associated with oxalate levels.  Therefore, that topic was not the focus of the current study, instead focusing on the oxalate impact on gut microbiota activity.  Our past studies, referenced multiple times through the current manuscript, were used in large part to help determine which microbes to include in our taxonomic cohort, as described in the manuscript.

      (f) Lines 301-303: "Given that data came from the same hosts sampled longitudinally, these data also reflect a microbiota that is adaptive to oxalate exposure, which is another important characteristic of complex systems." - In their dataset, what is the evidence that the microbiota of N. albigula is adapted to oxalate exposure? Is the increase in genomes with pathways related to oxalate metabolism related to an increase of oxalate in the diet? If so, does the microbiota exposure with a higher oxalate concentration decrease the systemic level of oxalate? In none of the experiments related to Figures 1 to 3 did the authors show a correlation of systemic oxalate levels with microbial composition, hepatic host response, or stool metabolism.

      Figure 3 explicitly shows the longitudinal impact of increasing levels of oxalate, showing an increase in oxalate-degrading genes (Figure 3d). The specific samples selected for analysis here come from a previous study in which we explicitly quantified changes to the gut microbiota composition and both stool and urine oxalate for every time point listed in Figure 3a. This information is explicitly stated in the methods, coupled with the fact that “neither fecal nor urinary oxalate levels increased significantly.” Again, the effect of the gut microbiota on oxalate in these model systems has been extensively studied by our group and provides the foundation for the current study to look at the effect of oxalate on the gut microbiota and host.

      Considering my last two points, the authors do not present substantial evidence to support their hypothesis that oxalate stimulates taxonomically diverse, metabolically redundant communities.

      As stated above, that oxalate stimulates taxonomically diverse taxa was ascertained through multiple past studies, as well as the current study (Figure 3e).  The metabolically redundant part is ascertained both through untargeted metabolomics (Figure 2a,b) and shotgun metagenomics (Figure 3c,d).  Further evidence for the metabolic redundancy with oxalate comes from our culturomic approach, which showed that 14.58% of isolates could grow on oxalate as a carbon and energy source, in addition to the high proportion of isolates that could grow on other carbon and energy sources, at least much more than can be ascribed to a single species  (Figure 5c).  We made this more clear in the discussion.

      (g) Lines 330-335. "Additionally, the broad diversity of species that contain oxalate-related genes suggests that the distribution of metabolic genes is somewhat independent of the distribution of microbial species, which suggests that microbial genes exist in an autonomous fractal layer, to some degree. This hypothesis is supported by studies which show a high degree of horizontal gene transfer within the gut microbiota as a means of adaptation." - This conclusion is highly speculative, especially since the authors did not do any analysis to directly evaluate a relationship between the oxalate metabolic pathways and the microbial species where these pathways are present.

      Figure 3c,d,e explicitly shows the metabolic pathways and species enriched by oxalate exposure.  Figure 4d, generated using the same data from Figure 3, explicitly shows the taxa that harbor oxalate-degrading genes.   

      (h) Lines 364-366. "Collectively, data show that both resource availability and community composition impacts oxalate metabolism, which helps to define the adaptive nature of the NALB gut microbiota." - The authors indeed showed evidence that community composition impacts oxalate metabolism. However, the authors did not show any evidence to directly evaluate the resource availability to impact oxalate metabolism.

      This is explicitly shown through in vitro community-based and single species assays varying multiple different carbon and energy sources to quantify changes to oxalate degradation (chosen based on shotgun metagenomic results; Figure 5a,b).

      (3) Lines 321-325. "Acetogenic genes were also present in 97.18% of genomes, dominated by acetate kinase and formate-tetrahydrofolate ligase (Figure S3A-C). Methanogenic genes were present in 100% of genomes, dominated by phosphoserine phosphatase, ATP-dependent 6-phosphofructokinase, and phosphate acetyltransferase (Figure S4A-C)." - The authors spent much time analyzing the adjacent pathways related to oxalate and oxalate-related products of oxalate metabolism. However, my understanding is that the genes used to analyze these pathways (formate metabolism, acetogenesis, methanogenesis), such as the ones named above, are not unique/specific for those pathways but participate in other "housekeeping" pathways. What is the relevance of these analyses when those genes are not unique/specific to the function/pathways that the authors describe? If I infer correctly, these bioinformatic analyses aim to evaluate the hypothesis of whether oxalate metabolism could be a social/cooperation metabolism and whether other species could participate in the metabolism of oxalate subproducts. However, these analyses did not explicitly evaluate this hypothesis.

      The reviewer is correct in that we aimed to evaluate the potential that oxalate metabolism could benefit from metabolic cooperation.  The specific genes chosen for this analysis were those explicitly listed in the target metabolic pathways in KEGG, as described.  However, while the analyses do show the strong potential that the CO2 and formate produced from oxalate degradation could be used in these other pathways, as intended, the genes can be used in other metabolic pathways.  We did, however, explicitly test the hypothesis that formate, produced from oxalate degradation, could be utilized by the gut microbiota.  While the targeted transplants with the taxonomic cohort did not clearly show the use of formate in this way, those from the metabolic cohort did (Figures 6d and S8d).  This question is still in ongoing investigations in our group.  

      We have made it more clear that our genome analyses provide the potential for metabolic redundancy rather than definitive proof for metabolic redundancy, which was evaluated more extensively in other experiments from this study.

      (a) Lines 481-484. "Collectively, data offer strong support for the hypothesis that metabolic redundancy among diverse taxa, is the primary driver of oxalate homeostasis, rather than metabolic cooperation in which the by-products of oxalate degradation are used in downstream pathways such as acetogenesis, methanogenesis, and sulfate reduction." - Although the authors recognize that their data about the metabolic cooperation hypothesis is inconclusive, they never tested the hypothesis related to metabolic cooperation, as mentioned above. This is highly speculative.

      As stated above, the targeted microbial transplants to animals and in vitro studies (Figure 5e,f) did explicitly test the cooperation hypothesis, but the results did not support it and instead pointed much more strongly to metabolic redundancy.

      (4) Lines 355-359. "Cohorts, defined in the STAR methods, were used to delineate hypotheses that either carbon and energy substrates are sufficient to explain known effects of the oxalate-degrading microbial network or that additional aspects of taxa commonly stimulated by dietary oxalate are required to explain past results (taxa defined through previous meta-analysis of studies)." - The definition of the metabolic cohorts and the taxonomic cohorts should not be hidden in the material and methods section. It should be explicit and clearly explained in the main text. Related, the table presented in Figure 5D is exceptionally confusing and does not help to understand and differentiate between the metabolic and the taxonomic cohorts. The authors need to explicitly identify the synthetic communities used in each cohort and each group by their members and their characteristics in supplementary tables.

      In the sentences before those referenced, we state: “Culturomic data recapitulates molecular data to show a considerable amount of redundancy surrounding oxalate metabolism (Fig. 5C). Isolates generated from this assay were used for subsequent study (metabolic cohort; Figure 5D). Additionally, a second cohort was defined and commercially purchased based both on known metabolic functions and the proportion of studies that saw an increase in their taxonomic population with oxalate consumption (Fig. 5D; taxonomic cohort). Where possible, isolates from human sources were obtained.”  Figure 5d explicitly shows the specific species used in each cohort along with the groups they were in for transplant studies, the explicit metabolic pathways we were targeting, along with the % of studies that these species were associated with oxalate metabolism.  All of this information is both in the main text of the results and in the figure legends.  It is not hidden in the methods, but the methods do reiterate what was also placed in the results.   

      In Figures 5 and 6, the authors used the following groups with the corresponding nomenclature: 'Group 1, No_bact; Group 2, Ox; Group 3, Ox_form; Group 4, All; Group 5, No_ox'. Although the information related to these groups is present in the material and method section in lines 1139-1143, the authors also need to explicitly explain the groups and their nomenclature in the main text.

      Since this information is explicitly and succinctly given in the referenced figures, I believe that adding the same information in the text would be too redundant.

      Related to the development of the synthetic communities. How did the authors prepare the synthetic communities or 'cohort' for the in vitro experiments? 

      We added more information for the preparation of microbes and execution of the in vitro assays, as needed.  

      Also, it is unclear in the material and method section how the metabolic profile of each isolate was evaluated (Figure 5C). Related to the bacteria isolated from the culturomic assays, including Figure 5C and metabolic cohort, the authors indeed reported the isolation methodology in lines 1262-1275. However, there is no information about the sequencing of these isolates. The authors should present these isolates as a list (supplementary table) with their names, taxonomy, metabolic profile, and Genome ID if these genomes were submitted to NCBI.

      We added additional information for how metabolic cohort isolates were chosen and how they were taxonomically identified.  The taxonomy and substrate utilization of isolates are in Figure 5D.  We did not sequence the genomes of metabolic cohort bacteria.  However, the ATCC isolates, which comprise the taxonomic cohort, are publicly available.

      The authors presented the 248 metagenomic assemblies in Figure S1 in a circular chart in context with other genomes. However, the metagenomic assemblies should be presented in a table form, with their name, taxonomy, coverage, completeness, and Genome ID, if these genomes were submitted to NCBI.

      The information for the genomes submitted to the NCBI is provided in the data availability statement.  However, we added a table (Table S9) that includes the requested information.   

      (5) Lines 371-374: "To delineate hypotheses of metabolic redundancy or cooperation for mitigating the negative effects of oxalate on the gut microbiota and host, two independent diet trials were conducted with analogous microbial communities derived from the metabolic and taxonomic cohorts".

      Lines 494-496: "we and others have found that oxalate can differentially exhibit positive or negative effects on microbial growth and metabolism dependent on the species and environment present" - What is the evidence that oxalate has a negative effect on the gut microbiota? The authors clearly showed the negative effect of oxalate on the host. Although there are reports in the literature of oxalate consumers with a negative effect on the microbiome, such as Lactobacilli and Bifidobacteria, there is no evidence in this manuscript about a negative effect of oxalate on the microbiome, and there is not an experimental design to evaluate it.

      These data are presented in Figure 2A and B. As stated, oxalate led to a net reduction of 34 in the total microbial metabolites produced, with a significant shift in the overall metabolome, indicative of metabolic inhibition. This is in comparison to the net gain of 9 metabolites, with no significant shift overall, in the mice with the NALB microbiota. The positive and negative effects of oxalate on the whole gut microbiota here are bolstered by previous studies on the effect of oxalate on pure cultures, as discussed and cited on lines 623-624.

      (6) Related to the last section, it is hard to really compare the results of the taxonomic cohort versus the metabolic cohort when the data of one cohort is in the main figure and the other in a supplementary figure. In addition, all the comparisons between the two cohorts seem to be qualitative. For any comparisons, the authors need to do a statistical comparison between the groups of the two cohorts.

      The comparison of the two sets of data is indeed qualitative. This is because these mouse models were run in separate experiments to test separate hypotheses (whether utilization of specific substrates is enough to improve oxalate metabolism, or whether specific taxa previously responsive to dietary oxalate are better, as stated in the manuscript). Given that these experimental models were tested separately, it would not be statistically valid to do a direct statistical comparison, even though the experimental procedures were the same and the only difference was the transplanted bacteria. The separation of the experiments into a main and a supplemental figure was done out of necessity, given the very large amount of data and the many experimental mouse models that were run in this study overall.

      Minor Comments.

      (1) The authors should define 'antinutrients'. This term is not a familiar concept and could create confusion.

      This is defined in line 104 “molecules produced in plants to deter herbivory, disrupt homeostasis by targeting the function of the microbiome, host, or both”

      (2) The authors should explicitly describe the N. albigula, aka white-throated woodrat, system as early as possible in the Results section.

      We added some statements about the Swiss Webster and N. albigula gut microbiota as poor and effective oxalate degraders, respectively, in the second section of the Results.

      (3) SW-SW mice exhibited an oxalate-dependent alteration of 219 hepatic genes, with a net increase in activity. In comparison, the SW-NALB mice exhibited an oxalate-dependent alteration of 21 genes with a net decrease in activity. However, the visual representation of the PCoA in Figure 1B showed that the most different samples are the SW-NALB 0% and 1.5%. Could you please explain this difference?

      In Figure 1B, the SW-NALB data are represented by the blue and black data points, which directly overlap with each other. The SW-SW data are the orange and purple data points, which exhibit very little overlap.

      (4) Is Table S7 the same as Table S6? If not, there is a missing supplementary table.

      These tables are different.  We ensured that both are present.

      (5) How did the authors test bacterial growth in in vivo studies (Figure 5B)?

      We added a statement to the culturomic section of the methods – we used media with or without oxalate and quantified colony-forming units.

      (6) A section of 16S rRNA metagenomics in the material and method section is not used across the main manuscript.

      These data are presented in figures S7 and S10, as stated in the results.  We added statements in the results to clarify that these figures show the 16S sequencing data.

      (7) Lines 506-511: "Collectively, data from the current and previous studies on the effect of oxalate exposure on the gut microbiota support the hypothesis that the gut microbiota serves as an adaptive organ in which specific, metabolically redundant microbes respond to and eliminate dietary components, for the benefit of themselves, but which can residually protect or harm host health depending on the dietary molecules and gut microbiota composition." - What is the benefit to bacteria in eliminating oxalate? This is highly speculative to this system.

      The benefit to bacteria is stated earlier in that paragraph – “In the current (Figs. 2B, 5B) and previous studies(33,34,64,65), we and others have found that oxalate can differentially exhibit positive or negative effects on microbial growth and metabolism dependent on the species and environment present.”

      (8) Lines 504 -506: "Importantly, the near-universal presence of formate metabolism genes suggest that formate may be an even greater source of ecological pressure (Figures S2-S5)."

      - Formate is primarily produced by fermentative anaerobic bacteria, such as Bacteroides, Clostridia, and certain strains of Escherichia coli, so formate would be present in anaerobic communities independently of oxalate. How is formate an even greater source of ecological pressure?

      We added a statement about the toxicity of formate to both bacteria and mammalian hosts.

    1. eLife Assessment

      This intracranial EEG study presents important and convincing neural evidence supporting the high spatial specificity (receptive field) of visually driven alpha-band oscillation in human brains and its potential role in exogenous cuing attention. The work challenges the predominant view about the role of alpha-band oscillation in visual attention and advocates that stimulus-driven alpha suppression is precisely tuned and might contribute to exogenous spatial attention.

    2. Reviewer #1 (Public Review):

      In this study, the authors build upon previous research that utilized non-invasive EEG and MEG by analyzing intracranial human ECoG data with high spatial resolution. They employed a receptive field mapping task to infer the retinotopic organization of the human visual system. The results present compelling evidence that the spatial distribution of human alpha oscillations is highly specific and functionally relevant, as it provides information about the position of a stimulus within the visual field.

      Using state-of-the-art modeling approaches, the authors not only strengthen the existing evidence for the spatial specificity of the human dominant rhythm but also provide new quantification of its functional utility, specifically in terms of the size of the receptive field relative to the one estimated based on broadband activity.

    3. Reviewer #2 (Public Review):

      Summary:

      In this work, Yuasa et al. aimed to study the spatial resolution of modulations in alpha frequency oscillations (~10Hz) within the human occipital lobe. Specifically, the authors examined the receptive field (RF) tuning properties of alpha oscillations, using retinotopic mapping and invasive electroencephalogram (iEEG) recordings. The authors employ established approaches for population RF mapping, together with a careful approach to isolating and dissociating overlapping, but distinct, activities in the frequency domain. In doing so, the authors dissociate genuine changes in alpha oscillation amplitude from other superimposed changes occurring over a broadband range of the power spectrum. Together, the authors used this approach to test how spatially tuned estimated RFs were when based on alpha range activity, vs. broadband activities (focused on 70-180Hz). Consistent with a large body of work, the authors report clear evidence of spatially precise RFs based on changes in alpha range activity. However, these RFs were far larger than those reliably estimated using broadband range activity at the same recording site. Overall, the work reflects a rigorous approach to a previously examined question, for which improved characterization leads to improved consistency in findings and some advance of prior work.

      Strengths:

      Overall, the authors take a careful and well-motivated approach to data analyses. The authors successfully test a clear question with a rigorous approach and provide strong supportive findings. Firstly, well-established methods are used for modeling population RFs. Secondly, the authors employ contemporary methods for dissociating unique changes in alpha power from superimposed and concomitant broadband frequency range changes. This is an important confound in estimating changes in alpha power not employed in prior studies. The authors show this approach produces more consistent and robust findings than standard band-filtering approaches. As noted below, this approach may also account for more subtle differences when compared to prior work studying similar effects.

      Original Weaknesses:

      - Theoretical framing: The authors frame their study as testing between two alternative views on the organization, and putative functions, of occipital alpha oscillations: i) alpha oscillation amplitude reflects broad shifts in arousal state, with large spatial coherence and uniformity across cortex; ii) alpha oscillation amplitude reflects more specific perceptual processes and can be modulated at local spatial scales. However, in the introduction this framing seems mostly focused on comparing some of the first observations of alpha with more contemporary observations. Therefore, I read their introduction to more reflect the progress in studying alpha oscillations from Berger's initial observations to the present. I am not aware of a modern alternative in the literature that posits alpha to lack spatially specific modulations. I also note this framing isn't particularly returned to in the discussion. A second important variable here is the spatial scale of measurement. It follows that EEG-based studies will capture changes in alpha activity up to the limits of spatial resolution of the method (i.e. limited in ability to map RFs). This methodological distinction isn't as clearly mentioned in the introduction, but is part of the authors' motivation. Finally, as noted below, there are several studies in the literature specifically addressing the authors' question, but they are not discussed in the introduction.

      - Prior studies: There are important findings in the literature preceding the authors' work that are not sufficiently highlighted or cited. In general terms, the spatio-temporal properties of the EEG/iEEG spectrum are well known (i.e. that changes in high frequency activity are more focal than changes in lower frequencies). Therefore, the observations of spatially larger RFs for alpha activities is highly predicted. Specifically, prior work has examined the impact of using different frequency ranges to estimate RF properties, for example ECoG studies in the macaque by Takura et al. NeuroImage (2016) [PubMed: 26363347], as well as prior ECoG work by the authors' team of collaborators (Harvey et al., NeuroImage (2013) [PubMed: 23085107]), as well as more recent findings from other groups (Luo et al., (2022) BioRxiv: https://doi.org/10.1101/2022.08.28.505627). Also, a related literature exists for invasively examining RF mapping in the time-voltage domain, which provides some insight into the authors' findings (as this signal will be dominated by low-frequency effects). The authors should provide a more modern framing of our current understanding of the spatial organization of the EEG/iEEG spectrum, including prior studies examining these properties within the context of visual cortex and RF mapping. Finally, I do note that the authors' approach to these questions does reflect an important test of prior findings, via an improved approach to RF characterization and iEEG frequency isolation, which suggests some important differences with prior work.

      - Statistical testing: The authors employ many important controls in their processing of data. However, for many results there is only a qualitative description or summary metric. It appears very little statistical testing was performed to establish reported differences. Related to this point, the iEEG data are highly nested, with multiple electrodes (observations) coming from each subject. How was this nesting addressed to avoid bias?

      [Editors' note: the authors have addressed the original concerns.]

    4. Reviewer #3 (Public Review):

      Summary:

      This study tackles the important subject of sensory driven suppression of alpha oscillations using a unique intracranial dataset in human patients. Using a model-based approach to separate changes in alpha oscillations from broadband power changes, the authors try to demonstrate that alpha suppression is spatially tuned, with similar center location as high broadband power changes, but much larger receptive field. They also point to interesting differences between low-order (V1-V3) and higher-order (dorsolateral) visual cortex. While I find some of the methodology convincing, I also find significant parts of the data analysis, statistics and their presentation incomplete. Thus, I find that some of the main claims are not sufficiently supported. If these aspects could be improved upon, this study could potentially serve as an important contribution to the literature with implications for invasive and non-invasive electrophysiological studies in humans.

      Strengths:

      The study utilizes a unique dataset (ECOG & high-density ECOG) to elucidate an important phenomenon of visually driven alpha suppression. The central question is important and the general approach is sound. The manuscript is clearly written and the methods are generally described transparently (and with reference to the corresponding code used to generate them). The model-based approach for separating alpha from broadband power changes is especially convincing and well-motivated. The link to exogenous attention behavioral findings (figure 8) is also very interesting. Overall, the main claims are potentially important, but they need to be further substantiated (see weaknesses).

      Original Weaknesses:

      I have three major concerns:

      (1) Low N / no single subject results/statistics: The crucial results of Figures 4 and 5 hang on 53 electrodes from four patients (Table 2). Almost half of these electrodes (25/53) are from a single subject. Data and statistical analysis seem to just pool all electrodes, as if these were statistically independent, and without taking into account subject-specific variability. The mean effect for each patient was not described in the text or presented in figures. Therefore, it is impossible to know if the results could be skewed by a single unrepresentative patient. This is crucial for readers to be able to assess the robustness of the results. The N of subjects should also be explicitly specified next to each result.

      (2) Separation between V1-V3 and dorsolateral electrodes: Out of 53 electrodes, 27 were doubly assigned as both V1-V3 and dorsolateral (Table 2, Figures 4 and 5). That means that out of 35 V1-V3 electrodes, 27 might actually be dorsolateral. This problem is exacerbated by the low N. For example, all 20 electrodes in patient 8 assigned as V1-V3 might as well be dorsolateral. This double assignment didn't make sense to me and I wasn't convinced by the authors' reasoning. I think it needlessly inflates the N for comparing the two groups and casts doubt on the robustness of these analyses.

      (3) Alpha pRFs are larger than broadband pRFs: First, as broadband pRF models were on average better fit to the data than alpha pRF models (dark bars in Supp Fig 3, top row), I wonder if this could entirely explain the larger alpha pRFs (i.e. worse fits lead to larger pRFs). There was no analysis to rule out this possibility. Second, examining closely the entire section 2.4, there wasn't any formal statistical test to back up any of the claims (not a single p-value is mentioned). It is crucial in my opinion to support each of the main claims of the paper with formal statistical testing.

      [Editors' note: the authors have addressed the original concerns.]

    5. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public Review):

      Summary

      In this study, the authors build upon previous research that utilized non-invasive EEG and MEG by analyzing intracranial human ECoG data with high spatial resolution. They employed a receptive field mapping task to infer the retinotopic organization of the human visual system. The results present compelling evidence that the spatial distribution of human alpha oscillations is highly specific and functionally relevant, as it provides information about the position of a stimulus within the visual field.

      Using state-of-the-art modeling approaches, the authors not only strengthen the existing evidence for the spatial specificity of the human dominant rhythm but also provide new quantification of its functional utility, specifically in terms of the size of the receptive field relative to the one estimated based on broadband activity.

      We thank the reviewer for their positive summary.

      Weakness 1.1

      The present manuscript currently omits the complementary view that the retinotopic map of the visual system might be related to eye movement control. Previous research in non-human primates using microelectrode stimulation has clearly shown that neuronal circuits in the visual system possess motor properties (e.g. Schiller and Stryker 1972, Schiller and Tehovnik 2001). More recent work utilizing Utah arrays, receptive field mapping, and electrical stimulation further supports this perspective, demonstrating that the retinotopic map functions as a motor map. In other words, neurons within a specific area responding to a particular stimulus location also trigger eye movements towards that location when electrically stimulated (e.g. Chen et al. 2020).

      Similarly, recent studies in humans have established a link between the retinotopic variation of human alpha oscillations and eye movements (e.g., Quax et al. 2019, Popov et al. 2021, Celli et al. 2022, Liu et al. 2023, Popov et al. 2023). Therefore, it would be valuable to discuss and acknowledge this complementary perspective on the functional relevance of the presented evidence in the discussion section.

      The reviewer notes that we do not discuss the oculomotor system and alpha oscillations. We agree that the literature relating eye movements and alpha oscillations is relevant.

      At the Reviewer’s suggestion, we added a paragraph on this topic to the first section of the Discussion (section 3.1, “Other studies have proposed … “).

      Reviewer #2 (Public Review):

      Summary:

      In this work, Yuasa et al. aimed to study the spatial resolution of modulations in alpha frequency oscillations (~10Hz) within the human occipital lobe. Specifically, the authors examined the receptive field (RF) tuning properties of alpha oscillations, using retinotopic mapping and invasive electroencephalogram (iEEG) recordings. The authors employ established approaches for population RF mapping, together with a careful approach to isolating and dissociating overlapping, but distinct, activities in the frequency domain. In doing so, the authors dissociate genuine changes in alpha oscillation amplitude from other superimposed changes occurring over a broadband range of the power spectrum. Together, the authors used this approach to test how spatially tuned estimated RFs were when based on alpha range activity, vs. broadband activities (focused on 70-180Hz). Consistent with a large body of work, the authors report clear evidence of spatially precise RFs based on changes in alpha range activity. However, these RFs were far larger than those reliably estimated using broadband range activity at the same recording site. Overall, the work reflects a rigorous approach to a previously examined question, for which improved characterization leads to improved consistency in findings and some advance of prior work.

      We thank the reviewer for the summary.

      Strengths:

      Overall, the authors take a careful and well-motivated approach to data analyses. The authors successfully test a clear question with a rigorous approach and provide strong supportive findings. Firstly, well-established methods are used for modeling population RFs. Secondly, the authors employ contemporary methods for dissociating unique changes in alpha power from superimposed and concomitant broadband frequency range changes. This is an important confound in estimating changes in alpha power not employed in prior studies. The authors show this approach produces more consistent and robust findings than standard band-filtering approaches. As noted below, this approach may also account for more subtle differences when compared to prior work studying similar effects.

      We thank the reviewer for the positive comments.

      Weaknesses:

      Weakness 2.1 Theoretical framing:

      The authors frame their study as testing between two alternative views on the organization, and putative functions, of occipital alpha oscillations: i) alpha oscillation amplitude reflects broad shifts in arousal state, with large spatial coherence and uniformity across cortex; ii) alpha oscillation amplitude reflects more specific perceptual processes and can be modulated at local spatial scales. However, in the introduction this framing seems mostly focused on comparing some of the first observations of alpha with more contemporary observations. Therefore, I read their introduction to more reflect the progress in studying alpha oscillations from Berger's initial observations to the present. I am not aware of a modern alternative in the literature that posits alpha to lack spatially specific modulations. I also note this framing isn't particularly returned to in the discussion.

      This was helpful feedback. We have rewritten nearly the entire Introduction to frame the study differently. The emphasis is now on the fact that several intracranial studies of spatial tuning of alpha (in both human and macaque) tend to show increases in alpha due to visual stimulation, in contrast to a century of MEG/EEG studies, from Berger to the present, showing decreases. We believe that the discrepancy is due to an interaction between measurement type and brain signals. Specifically, intracranial measurements sum decreases in alpha oscillations and increases in broadband power on the same trials, and both signals can be large. In contrast, extracranial measures are less sensitive to the broadband signals and mostly just measure the alpha oscillation. Our study reconciles this discrepancy by removing the baseline broadband power increases, thereby isolating the alpha oscillation, and showing that with iEEG spatial analyses, the alpha oscillation decreases with visual stimulation, consistent with EEG and MEG results.

      Weakness 2.2 A second important variable here is the spatial scale of measurement.

      It follows that EEG-based studies will capture changes in alpha activity up to the limits of spatial resolution of the method (i.e. limited in ability to map RFs). This methodological distinction isn't as clearly mentioned in the introduction, but is part of the authors' motivation. Finally, as noted below, there are several studies in the literature specifically addressing the authors' question, but they are not discussed in the introduction.

      The new Introduction now explicitly contrasts EEG/MEG with intracranial studies and refers to the studies below.

      Weakness 2.3 Prior studies:

      There are important findings in the literature preceding the authors' work that are not sufficiently highlighted or cited. In general terms, the spatio-temporal properties of the EEG/iEEG spectrum are well known (i.e. that changes in high frequency activity are more focal than changes in lower frequencies). Therefore, the observations of spatially larger RFs for alpha activities is highly predicted. Specifically, prior work has examined the impact of using different frequency ranges to estimate RF properties, for example ECoG studies in the macaque by Takura et al. NeuroImage (2016) [PubMed: 26363347], as well as prior ECoG work by the authors' team of collaborators (Harvey et al., NeuroImage (2013) [PubMed: 23085107]), as well as more recent findings from other groups (Luo et al., (2022) BioRxiv: https://doi.org/10.1101/2022.08.28.505627). Also, a related literature exists for invasively examining RF mapping in the time-voltage domain, which provides some insight into the authors' findings (as this signal will be dominated by low-frequency effects). The authors should provide a more modern framing of our current understanding of the spatial organization of the EEG/iEEG spectrum, including prior studies examining these properties within the context of visual cortex and RF mapping. Finally, I do note that the authors' approach to these questions does reflect an important test of prior findings, via an improved approach to RF characterization and iEEG frequency isolation, which suggests some important differences with prior work.

      Thank you for these references and suggestions. Some of the references were already included, and the others have been added.

      There is one issue where we disagree with the Reviewer, namely that “the observations of spatially larger RFs for alpha activities is highly predicted”. We agree that alpha oscillations and other low frequency rhythms tend to be less focal than high frequency responses, but there are also low frequency non-rhythmic signals, and these can be spatially focal. We show this by demonstrating that pRFs solved using low frequency responses outside the alpha band (both below and above the alpha frequency) are small, similar to high frequency broadband pRFs, but differing from the large pRFs associated with alpha oscillations. Hence we believe the degree to which signals are focal is more related to the degree of rhythmicity than to the temporal frequency per se. While some of these results were already in the supplement, we now address the issue more directly in the main text in a new section called, “2.5 The difference in pRF size is not due to a difference in temporal frequency.”

      We incorporated additional references into the Introduction, added a new section on low frequency broadband responses to the Results (section 2.5), and expanded the Discussion (section 3.2) to address these new references.

      Weakness 2.4 Statistical testing:

      The authors employ many important controls in their processing of data. However, for many results there is only a qualitative description or summary metric. It appears very little statistical testing was performed to establish reported differences. Related to this point, the iEEG data are highly nested, with multiple electrodes (observations) coming from each subject. How was this nesting addressed to avoid bias?

      We reviewed the primary claims made in the manuscript, and for each claim we specify the supporting analyses and, where appropriate, how we address the issue of nesting. Although some of these analyses were already in the manuscript, many of them are new, including all of the analyses concerning nesting. We believe that putting this information in one place will be useful to the reader, and we now include this text as a new section in the supplement, “Graphical and statistical support for primary claims.”
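
      As an illustration only, one common way to respect electrodes nested within subjects is a two-level (hierarchical) bootstrap: resample subjects first, then electrodes within each resampled subject. The sketch below is a generic example and is not necessarily the procedure used in the supplement; the function name, the subject labels, and the toy values are all hypothetical.

```python
import numpy as np

def hierarchical_bootstrap_ci(values_by_subject, n_boot=2000, seed=0):
    """95% CI of the pooled mean, resampling subjects and then electrodes.

    values_by_subject: dict mapping a subject label to a 1-D sequence of
    per-electrode values (for example, a pRF size difference per electrode).
    """
    rng = np.random.default_rng(seed)
    subjects = list(values_by_subject)
    boot_means = np.empty(n_boot)
    for b in range(n_boot):
        # Level 1: resample subjects with replacement.
        picked = rng.choice(len(subjects), size=len(subjects), replace=True)
        pooled = []
        for idx in picked:
            vals = np.asarray(values_by_subject[subjects[idx]], dtype=float)
            # Level 2: resample electrodes within the chosen subject.
            pooled.append(rng.choice(vals, size=vals.size, replace=True))
        boot_means[b] = np.concatenate(pooled).mean()
    lo, hi = np.percentile(boot_means, [2.5, 97.5])
    return lo, hi

# Toy, made-up values (not data from the study): unequal electrode counts.
toy = {
    "s1": [1.2, 0.8, 1.5],
    "s2": [0.4, 0.9],
    "s3": [2.0, 1.1, 1.7, 0.6, 1.3],
}
ci_low, ci_high = hierarchical_bootstrap_ci(toy, n_boot=500)
```

      Because subjects are resampled as whole units, a subject contributing many electrodes cannot narrow the interval as if each electrode were an independent observation.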

      Reviewer #2 (Recommendations For The Authors):

      Recommendation 2.1:

      Data presentation: In several places, the authors discuss important features of cortical responses as measured with iEEG that need to be carefully considered. This is totally appropriate and a strength of the authors' work; however, I feel the reader would benefit from more depiction of the time-domain responses, to help better understand the authors' frequency-domain approach. For example, Figure 1 would benefit from showing some form of voltage trace (ERP) and spectrogram, not just the power spectra. In addition, part (a) of Figure 1 could convey some basic information about the timing of the experimental paradigm.

      We changed panel A of Figure 1 to include the timing of the experimental paradigm, and we added panels C and D to show the electrode time series before and after regressing out the ERP.

      Recommendation 2.2

      Update introduction to include references to prior EEG/iEEG work on spatial distribution across frequency spectrum, and importantly, prior work mapping RFs with different frequencies.

      We have addressed this issue and re-written our introduction. Please refer to our response in Public Review for further details.

      Recommendation 2.3

      Figure 3 has several panels and should be labeled to make it easier to follow. The dashed line in the lower power spectra isn't defined in a legend and is missing from the upper panel - please clarify.

      We updated Figure 3 and reordered the panels to clarify how we computed the summary metrics in broadband and alpha for each stimulus location (i.e., the “ratio” values plotted in panel B). We also simplified the plot of the alpha power spectrum. It now shows a dashed line representing a baseline-corrected response to the mapping stimulus, which is defined in the legend and explained in the caption.

      Recommendation 2.4

      Power spectra are always shown without error shading, but they are mean estimates.

      We added error shading to Figures 1, 2 and 3.

      Recommendation 2.5

      The authors deal with voltage transients in response to visual stimulation by subtracting out the trial-averaged mean (commonly performed). However, the efficacy of this approach depends on signal quality, and so some form of depiction of this processing step is needed.

      We added a depiction of the processing steps for regressing out the averaged responses in Figure 1 in an example electrode (panels C and D). We also show in the supplement the effect of regressing out the ERP on all the electrode pRFs. We have added Supplementary Figure 1-2.
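
      For readers unfamiliar with this step, a minimal sketch of regressing the trial-averaged response (ERP) out of each epoch is given below. It assumes a single least-squares weight per trial and synthetic data; the function name and all parameters are hypothetical, and the authors' public code should be consulted for the actual implementation.

```python
import numpy as np

def regress_out_erp(epochs):
    """Remove the scaled trial-averaged response from every trial.

    epochs: array of shape (n_trials, n_samples) for one electrode and
    one stimulus condition.
    """
    epochs = np.asarray(epochs, dtype=float)
    erp = epochs.mean(axis=0)
    energy = erp @ erp
    if energy == 0.0:
        # A flat ERP leaves nothing to regress out.
        return epochs.copy()
    # Least-squares weight of the ERP in each trial.
    beta = (epochs @ erp) / energy
    return epochs - np.outer(beta, erp)

# Synthetic example: an evoked transient with trial-varying gain plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 0.5, 256)
evoked = np.exp(-((t - 0.1) / 0.02) ** 2)
gains = rng.uniform(0.5, 1.5, size=40)
epochs = gains[:, None] * evoked + 0.1 * rng.standard_normal((40, t.size))
cleaned = regress_out_erp(epochs)
```

      Unlike plain subtraction of the mean, the per-trial weight allows the evoked component to vary in amplitude across trials; after regression, each residual trial is orthogonal to the ERP.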

      Recommendation 2.6

      I have a similar request for the authors' latency correction of their data, where they identified a timing error and re-aligned the data without ground truth. Again, this is appropriate, but some depiction of the success of this correction is critical for confirming the integrity of the data.

      We now report more detail on the latency correction, and also point out that any small error in the estimate would not affect our conclusions (4.6 ECoG data analysis | Data epoching). The correction was important for a prior paper on temporal dynamics (Groen et al, 2022), which used data from the same participants and estimated the latency of responses. In this paper, our analyses are in the spectral domain (and discard phase), so small temporal shifts are not critical. We now also link to the public code associated with that paper, which implemented the adjustment and quantified the uncertainty in the latency adjustment.

      More details on the latency adjustment are provided in section 4.6.

      Recommendation 2.7

      In many places the authors report their data shows a 'summary' value, please clarify if this means averaging or summation over a range.

      For both broadband and alpha, we derive one summary value (a scalar) per trial for each stimulus. For broadband, the summary metric is the ratio of power during a given trial to power during blanks, where power in a trial is the geometric mean of the power at each frequency within the defined band. This is equation 3 in the methods, which is now referred to the first time that summary metrics are mentioned in the results. For alpha, the summary metric is the height of the Gaussian from our model-based approach. This is defined in equations 1 and 2, which are also now referred to the first time summary metrics are mentioned in the results.

      We added explanation of the summary metrics in the figure captions and results where they are first used, and also referred to the equations in the methods where they are defined.
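
      To make the broadband summary metric concrete, the following sketch computes the ratio of geometric-mean power within a band for one trial relative to a blank spectrum. It is an illustration under stated assumptions, not a reproduction of equation 3: the function name is hypothetical, the blank spectrum is assumed to be already averaged across blank trials, the 70-180 Hz default follows the band mentioned in the reviews, and the spectra are synthetic.

```python
import numpy as np

def broadband_ratio(trial_power, blank_power, freqs, band=(70.0, 180.0)):
    """Geometric-mean power within `band` for a trial, divided by blank."""
    freqs = np.asarray(freqs, dtype=float)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    # Geometric mean across frequencies: exp(mean(log(power))).
    gm_trial = np.exp(np.mean(np.log(np.asarray(trial_power, dtype=float)[in_band])))
    gm_blank = np.exp(np.mean(np.log(np.asarray(blank_power, dtype=float)[in_band])))
    return gm_trial / gm_blank

# Synthetic spectra: a 1/f^2 blank spectrum and a trial with a uniform
# threefold broadband elevation.
freqs = np.arange(1.0, 201.0)
blank = 1.0 / freqs**2
trial = 3.0 * blank
ratio = broadband_ratio(trial, blank, freqs)
```

      The geometric mean weights each frequency equally on a log scale, so the steep 1/f falloff of the spectrum does not let the lowest frequencies in the band dominate the summary value.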

      Recommendation 2.8

      The authors conclude: "we have discovered that spectral power changes in the alpha range reflect both suppression of alpha oscillations and elevation of broadband power." It might not have been the intention, but 'discovered' seems overstated.

      We agree and changed this sentence.

      Recommendation 2.9

      Supp Fig 9 is a great effort by the authors to convey their findings to the reader, it should be a main figure.

      We are glad you found Supplementary Figure 9 valuable. We moved this figure to the main text.

      Reviewer #3 (Public Review):

      Summary:

      This study tackles the important subject of sensory driven suppression of alpha oscillations using a unique intracranial dataset in human patients. Using a model-based approach to separate changes in alpha oscillations from broadband power changes, the authors try to demonstrate that alpha suppression is spatially tuned, with similar center location as high broadband power changes, but much larger receptive field. They also point to interesting differences between low-order (V1-V3) and higher-order (dorsolateral) visual cortex. While I find some of the methodology convincing, I also find significant parts of the data analysis, statistics and their presentation incomplete. Thus, I find that some of the main claims are not sufficiently supported. If these aspects could be improved upon, this study could potentially serve as an important contribution to the literature with implications for invasive and non-invasive electrophysiological studies in humans.

      We thank the reviewer for the summary.

      Strengths:

      The study utilizes a unique dataset (ECOG & high-density ECOG) to elucidate an important phenomenon of visually driven alpha suppression. The central question is important and the general approach is sound. The manuscript is clearly written and the methods are generally described transparently (and with reference to the corresponding code used to generate them). The model-based approach for separating alpha from broadband power changes is especially convincing and well-motivated. The link to exogenous attention behavioral findings (figure 8) is also very interesting. Overall, the main claims are potentially important, but they need to be further substantiated (see weaknesses).

      We thank the reviewer for the positive comments.

      Weaknesses:

      I have three major concerns:

      Weakness 3.1. Low N / no single subject results/statistics:

      The crucial results of Figure 4,5 hang on 53 electrodes from four patients (Table 2). Almost half of these electrodes (25/53) are from a single subject. Data and statistical analysis seem to just pool all electrodes, as if these were statistically independent, and without taking into account subject-specific variability. The mean effect per each patient was not described in text or presented in figures. Therefore, it is impossible to know if the results could be skewed by a single unrepresentative patient. This is crucial for readers to be able to assess the robustness of the results. N of subjects should also be explicitly specified next to each result.

      We have added substantial changes to deal with subject specific effects, including new results and new figures.

      • Figure 4 now shows variance explained by the alpha pRF broken down by each participant for electrodes in V1 to V3. We also now show a similar figure for dorsolateral electrodes in Supplementary Figure 4-2.

      • Figure 5, which shows results from individual electrodes in V1 to V3, now includes color coding of electrodes by participant to make it clear how the electrodes group by participant. Similarly, for dorsolateral electrodes, we show electrodes grouped by participant in Supplementary Figure 5-1. The same applies to Supplementary Figure 6-2.

      • Supplementary Figure 7-2 now shows the benefits of our model-based approach for estimating alpha broken down by individual participants.

      • We also now include a new section in the supplement that summarizes, for every major claim, what the supporting data are and how we addressed the issue of nesting electrodes by participant (section “Graphical and statistical support for primary claims”).

      Weakness 3.2. Separation between V1-V3 and dorsolateral electrodes:

      Out of 53 electrodes, 27 were doubly assigned as both V1-V3 and dorsolateral (Table 2, Figures 4,5). That means that out of 35 V1-V3 electrodes, 27 might actually be dorsolateral. This problem is exacerbated by the low N. For example, all 20 electrodes in patient 8 assigned as V1-V3 might as well be dorsolateral. This double assignment didn't make sense to me and I wasn't convinced by the authors' reasoning. I think it needlessly inflates the N for comparing the two groups and casts doubt on the robustness of these analyses.

      Electrode assignment was probabilistic to reflect uncertainty in the mapping between location and retinotopic map. The probabilistic assignment is handled in two ways.

      (1) For visualizing results of single electrodes, we simply go with the maximum probability, so no electrode is visualized for both groups of data. For example, Figure 5a (V1-V3) and supplementary Figure 5-1a (dorsolateral electrodes) have no electrodes in common: no electrode is in both plots.

      (2) For quantitative summaries, we sample the electrodes probabilistically (for example Figures 4, 5c). So, if for example, an electrode has a 20% chance of being in V1 to V3, and 30% chance of being in dorsolateral maps, and a 50% chance of being in neither, the data from that electrode is used in only 20% of V1-V3 calculations and 30% of dorsolateral calculations. In 50% of calculations, it is not used at all. This process ensures that an electrode with uncertain assignment makes no more contribution to the results than an electrode with certain assignment. An electrode with a low probability of being in, say, V1-V3, makes little contribution to any reported results about V1-V3. This procedure is essentially a weighted mean, which the reviewer suggests in the recommendations. Thus, we believe there is not a problem of “double counting”.

      The alternative would have been to use maximum probability for all calculations. However, we think that doing so would be misleading, since it would not take into account uncertainty of assignment, and would thus overstate differences in results between the maps.

      We now clarify in the Results that for probabilistic calculations, the contribution of an electrode is limited by the likelihood of assignment (Section 2.3). We also now explain in the methods why we think probabilistic sampling is important.
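
      The probabilistic sampling described in point (2) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and each electrode is assumed to carry one probability for V1-V3 and one for dorsolateral maps, with the remainder meaning "neither".

```python
import numpy as np

def resample_assignments(p_v1v3, p_dorsolateral, n_resamples, rng):
    """Draw one label per electrode per resample: 0 = V1-V3, 1 = dorsolateral, 2 = neither."""
    p_neither = 1.0 - p_v1v3 - p_dorsolateral
    probs = np.stack([p_v1v3, p_dorsolateral, p_neither], axis=1)
    cum = np.cumsum(probs, axis=1)                      # (n_electrodes, 3)
    u = rng.random((n_resamples, len(p_v1v3), 1))       # one uniform draw per electrode and resample
    labels = (u > cum[None, :, :]).sum(axis=2)          # number of cumulative thresholds exceeded
    return np.minimum(labels, 2)                        # guard against rounding in the last threshold

def group_means(values, labels, group):
    """Mean of `values` over the electrodes labelled `group`, computed separately for each resample."""
    mask = labels == group
    counts = mask.sum(axis=1)
    sums = (mask * values[None, :]).sum(axis=1)
    # Resamples in which no electrode landed in the group yield NaN.
    return np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)
```

      Under this scheme an electrode with a 20% chance of lying in V1-V3 enters about 20% of the V1-V3 resamples, so its influence on the summary scales with its assignment probability rather than being counted fully in both groups.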

      Weakness 3.3. Alpha pRFs are larger than broadband pRFs:

      First, as broadband pRF models were on average better fit to the data than alpha pRF models (dark bars in Supp Fig 3. Top row), I wonder if this could entirely explain the larger Alpha pRF (i.e. worse fits lead to larger pRFs). There was no analysis to rule out this possibility.

      We addressed this question in a new paragraph in Discussion section 3.1 (“What is the function of the large alpha pRFs?”, paragraph beginning… “Another possible interpretation is that the poorer model fit in the alpha pRF is due to lower signal-to-noise”). This paragraph both refers to prior work on the relationship between noise and pRF size and to our own control analyses (Supplementary Figure 5-2).

      Weakness 3.4 Statistics

      Second, examining closely the entire 2.4 section there wasn't any formal statistical test to back up any of the claims (not a single p-value is mentioned). It is crucial in my opinion to support each of the main claims of the paper with formal statistical testing.

      We agree that it is important for the reader to be able to link specific results and analyses to specific claims. We are not convinced that null hypothesis statistical testing is always the best approach. This is a topic of active debate in the scientific community.

      We added a new section that concisely states each major claim and explicitly annotates the supporting evidence (Section 4.7). Please also refer to our responses to Reviewer #2 regarding statistical testing (Weakness 2.4, “Statistical testing”).

      Weakness 3.5 Summary

      While I judge these issues as crucial, I can also appreciate the considerable effort and thoughtfulness that went into this study. I think that addressing these concerns will substantially raise the confidence of the readership in the study's findings, which are potentially important and interesting.

      We again thank the reviewer for the positive comments.

      Reviewer #3 (Recommendations For The Authors):

      Suggestions for how to address the three major concerns:

      Suggestion 3.1.

      I am very well aware that it's very hard to have n=30 in a visual cortex ECOG study. That's fine. Best practice would be to have a linear mixed effects model with patients as a random effect. However, for some figures with just 3-4 patients (Figure 4,5) the sample size might be too small even for that. At the very minimum, I would expect to show in figures/describe in text all results per patient (perhaps one can do statistics within each patient, and show for each patient that the effect is significant). Even in primate studies with just two subjects it is expected to show that the results replicate for subject A and B. It is necessary to show that your results don't depend on a single unrepresentative subject. And if they do, at least be transparent about it.

      We have addressed this thoroughly. Please see response to Weakness 3.1 (“Low N / no single subject results/statistics”).

      Suggestion 3.2.

      I just don't get it. I would simply assign an electrode to V1-V3 or dorsolateral cortex based on which area has the highest probability. It doesn't make sense to me that an electrode that has 60% of being in dorsolateral cortex and only 10% to be in V1-V3 would be assigned as both V1-V3 and dorsolateral. Also, what's the rationale to include such electrode in the analysis for let's say V1-V3 (we have weak evidence to believe it's there)? I would either assign electrodes based on the highest probability, or alternatively do a weighted mean based on the probability of each electrode belonging to each region group (e.g. electrode with 40% to be in V1-V3, will get twice the weight as an electrode who has 20% to be in V1-V3) but this is more complicated.

      We have addressed this issue. Please refer to our response in Public Review (“Weakness 3.2 Separation between V1-V3 and dorsolateral”) for details.

      Suggestion 3.3.

      First, to exclude the possibility that alpha pRF are larger simply because they have a worse fit to the neural data, I would show if there is a correlation between the goodness-of-fit and pRF size (for alpha and broadband signals, separately). No [negative] correlation between goodness-of-fit and pRF size would be a good sign. I would also compare alpha & broadband receptive field size when controlling for the goodness-of-fit (selecting electrodes with similar goodness-of-fit for both signals). If the results replicate this way it would be convincing.

      Second, there are no statistical tests in section 2.4, possibly also in others. Even if you employ bootstrap / Monte-Carlo resampling methods you can extract a p-value.

      We have addressed this issue. Please refer to our response in Public Review Point 3.3 (“Alpha pRFs are larger than broadband pRFs”) for further details.
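
      The two controls proposed in Suggestion 3.3 can be sketched as below. This is only an illustration of the reviewer's proposal, not a description of the control analyses in Supplementary Figure 5-2. The function names, the tolerance, and the use of a rank correlation are assumptions, and the simple ranking here does not average tied values.

```python
import numpy as np

def _ranks(x):
    """Ordinal ranks of x (ties are broken by position, not averaged)."""
    order = np.argsort(x)
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(len(x))
    return ranks

def spearman(x, y):
    """Rank correlation between, e.g., goodness-of-fit and pRF size across electrodes."""
    return float(np.corrcoef(_ranks(x), _ranks(y))[0, 1])

def matched_size_ratio(r2_alpha, r2_broadband, size_alpha, size_broadband, tol=0.1):
    """Median alpha/broadband size ratio over electrodes whose two fits differ by at most `tol`."""
    matched = np.abs(r2_alpha - r2_broadband) <= tol
    if not matched.any():
        return float("nan")
    return float(np.median(size_alpha[matched] / size_broadband[matched]))
```

      A correlation near zero within each signal type, together with a matched ratio that stays above 1, would argue against fit quality alone explaining the larger alpha pRFs.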

      Suggestion 3.4.

      Also, I don't understand the resampling procedure described in lines 652-660: "17.7 electrodes were assigned to V1-V3, 23.2 to dorsolateral, and 53 to either " - but 17.7 + 23.2 doesn't add up to 53. It also seems as if you assign visual areas differently in this resampling procedure than in the real data - "and randomly assigned each electrode to a visual area according to the Wang full probability distributions". If you assign in your actual data 27 electrodes to both visual areas, the same should be done in the resampling procedure (I would expect exactly 35 V1-V3 and 45 dorsolateral electrodes in every resampling, just the pRFs will be shuffled across electrodes).

      We apologize for the confusion.

      We fixed the sentence above, clarified the caption to Table 2, and also explained the overall strategy of probabilistic resampling better. See response to Public Review point 3.2 for details.

      Suggestion 3.5.

      These are rather technical comments but I believe they are crucial points to address in order to support your claims. I genuinely think your results are potentially interesting and important but these issues need to be first addressed in a revision. I also think your study may carry implications beyond just the visual domain, as alpha suppression is observed for different sensory modalities and cortical regions. Might be useful to discuss this in the discussion section.

      Agree. We added a paragraph on this point to the Discussion (very end of 3.2).

    1. eLife Assessment

      This fundamental study examines whether synaptic cell adhesion molecules neuroligin 1-3 resident on astrocytes, rather than neurons, exert effect on synaptic structure and function. With compelling evidence, the authors report that deletion of neuroligins 1-3 specifically in astrocytes does not alter synapse formation or astrocyte morphology in the hippocampus or visual cortex. This study highlights the specific role of neuronal neuroligins rather than their astrocytic counterparts in synaptogenesis.

    2. Reviewer #1 (Public review):

      Astrocytes are known to express neuroligins 1-3. Within neurons, these cell adhesion molecules perform important roles in synapse formation and function. Within astrocytes, a significant role for neuroligin 2 in determining excitatory synapse formation and astrocyte morphology was shown in 2017. However, there has been no assessment of what happens to synapses or astrocyte morphology when all three major forms of neuroligins within astrocytes (isoforms 1-3) are deleted using a well characterized, astrocyte specific, and inducible cre line. By using such selective mouse genetic methods, the authors here show that astrocytic neuroligin 1-3 expression in astrocytes is not consequential for synapse function or for astrocyte morphology. They reach these conclusions with careful experiments employing quantitative western blot analyses, imaging and electrophysiology. They also characterize the specificity of the cre line they used. Overall, this is a very clear and strong paper that is supported by rigorous experiments. The discussion considers the findings carefully in relation to past work. This paper is of high importance, because it now raises the fundamental question of exactly what neuroligins 1-3 are actually doing in astrocytes. In addition, it enriches our understanding of the mechanisms by which astrocytes participate in synapse formation and function. The paper is very clear, well written and well illustrated with raw and average data.

      Comments on revisions:

      My previous comments have been addressed. I have no additional points to make and congratulate the authors.

    3. Reviewer #2 (Public review):

      In the present manuscript, Golf et al. investigate the consequences of astrocyte-specific deletion of Neuroligin (Nlgn) family cell adhesion proteins on synapse structure and function in the brain. Decades of prior research had shown that Neuroligins mediate their effects at synapses through their role in the postsynaptic compartment of neurons and their transsynaptic interaction with presynaptic Neurexins. More recently, it was proposed for the first time that Neuroligins expressed by astrocytes can also bind to presynaptic Neurexins to regulate synaptogenesis (Stogsdill et al. 2017, Nature). However, several aspects of the model proposed by Stogsdill et al. on astrocytic Neuroligin function conflict with prior evidence on the role of Neuroligins at synapses, prompting Golf et al. to further investigate astrocytic Neuroligin function in the current study. Using postnatal conditional deletion of Nlgn1-3 specifically from astrocytes in mice, Golf et al. show that virtually no changes in the expression of synaptic proteins or in the properties of synaptic transmission at either excitatory or inhibitory synapses are observed. Moreover, no alterations in the morphology of astrocytes themselves were found. To further extend this finding, the authors additionally analyzed human neurons co-cultured with mouse glia lacking expression of Nlgn1-4. No difference in excitatory synaptic transmission was observed between neurons cultured in the presence of wildtype vs. Nlgn1-4 conditional knockout glia. The authors conclude that while Neuroligins are indeed expressed in astrocytes and are hence likely to play some role there, this role does not include any direct consequences on synaptic structure and function, in direct contrast to the model proposed by Stogsdill et al.

      Overall, this is a strong study that addresses a fundamental and highly relevant question in the field of synaptic neuroscience. Neuroligins are not only key regulators of synaptic function, they have also been linked to numerous psychiatric and neurodevelopmental disorders, highlighting the need to precisely define their mechanisms of action. The authors take a wide range of approaches to convincingly demonstrate that under their experimental conditions, Nlgn1-3 are efficiently deleted from astrocytes in vivo, and that this deletion does not lead to major alterations in the levels of synaptic proteins or in synaptic transmission at excitatory or inhibitory synapses, or in the morphology of astrocytes. While the co-culture experiments are somewhat more difficult to interpret due to lack of a control for the effect of wildtype mouse astrocytes on human neurons, they are also consistent with the notion that deletion of Nlgn1-4 from astrocytes has no consequences for the function of excitatory synapses. Together, the data from this study provide compelling and important evidence that, whatever the role of astrocytic Neuroligins may be, they do not contribute substantially to synapse formation or function under the conditions investigated.

    4. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      Astrocytes are known to express neuroligins 1-3. Within neurons, these cell adhesion molecules perform important roles in synapse formation and function. Within astrocytes, a significant role for neuroligin 2 in determining excitatory synapse formation and astrocyte morphology was shown in 2017. However, there has been no assessment of what happens to synapses or astrocyte morphology when all three major forms of neuroligins within astrocytes (isoforms 1-3) are deleted using a well characterized, astrocyte specific, and inducible cre line. By using such selective mouse genetic methods, the authors here show that astrocytic neuroligin 1-3 expression in astrocytes is not consequential for synapse function or for astrocyte morphology. They reach these conclusions with careful experiments employing quantitative western blot analyses, imaging and electrophysiology. They also characterize the specificity of the cre line they used. Overall, this is a very clear and strong paper that is supported by rigorous experiments. The discussion considers the findings carefully in relation to past work. This paper is of high importance, because it now raises the fundamental question of exactly what neuroligins 1-3 are actually doing in astrocytes. In addition, it enriches our understanding of the mechanisms by which astrocytes participate in synapse formation and function. The paper is very clear, well written and well illustrated with raw and average data.

      We thank the reviewer for the balanced and informative summary.

      Reviewer #2 (Public Review):

      In the present manuscript, Golf et al. investigate the consequences of astrocyte-specific deletion of Neuroligin family cell adhesion proteins on synapse structure and function in the brain. Decades of prior research had shown that Neuroligins mediate their effects at synapses through their role in the postsynaptic compartment of neurons and their transsynaptic interaction with presynaptic Neurexins. More recently, it was proposed for the first time that Neuroligins expressed by astrocytes can also bind to presynaptic Neurexins to regulate synaptogenesis (Stogsdill et al. 2017, Nature). However, several aspects of the model proposed by Stogsdill et al. on astrocytic Neuroligin function conflict with prior evidence on the role of Neuroligins at synapses, prompting Golf et al. to further investigate astrocytic Neuroligin function in the current study. Using postnatal conditional deletion of Neuroligins 1, 2 and 3 specifically from astrocytes, Golf et al. show that virtually no changes in the expression of synaptic proteins or in the properties of synaptic transmission at either excitatory or inhibitory synapses are observed. Moreover, no alterations in the morphology of astrocytes themselves were found. The authors conclude that while Neuroligins are indeed expressed in astrocytes and are hence likely to play some role there, this role does not include any direct consequences on synaptic structure and function, in direct contrast to the model proposed by Stogsdill et al.

      Overall, this is a strong study that addresses an important and highly relevant question in the field of synaptic neuroscience. Neuroligins are not only key regulators of synaptic function, they have also been linked to numerous psychiatric and neurodevelopmental disorders, highlighting the need to precisely define their mechanisms of action. The authors take a wide range of approaches to convincingly demonstrate that under their experimental conditions, no alterations in the levels of synaptic proteins or in synaptic transmission at excitatory or inhibitory synapses, or in the morphology of astrocytes, are observed.

      We are also grateful for this reviewer’s constructive comments.

      One caveat to this study is that the authors do not directly provide evidence that their Tamoxifen-inducible conditional deletion paradigm does indeed result in efficient deletion of all three Neuroligins from astrocytes. Using a Cre-dependent tdTomato reporter line, they show that tdTomato expression is efficiently induced by the current paradigm, and they refer to a prior study showing efficient deletion of Neuroligins from neurons using the same conditional Nlgn1-3 mouse lines but a different Cre driver strategy. However, neither of these approaches directly provide evidence that all three Neuroligins are indeed deleted from astrocytes in the current study. In contrast, Stogsdill et al. employed FACS and qPCR to directly quantify the loss of Nlgn2 mRNA from astrocytes. This leaves the current Golf et al. study somewhat vulnerable to the criticism, however unlikely, that their lack of synaptic effects may be a consequence of incomplete Neuroligin deletion, rather than a true lack of effect of astrocytic Neuroligins.

      The concern is valid. In the original submission of this paper, we did not establish that the Cre recombinase we used actually deleted neuroligins in astrocytes. We have now addressed this issue in the revised paper with new experiments as described below.

      However, the reviewer’s impression that the Stogsdill et al. paper confirmed full deletion of Nlgn2 is a misunderstanding of the data in that paper. The reviewer is correct that Stogsdill et al. performed FACS to test the efficacy of the GLAST-Cre mediated deletion of Nlgn2-flox mice, followed by qRT-PCR comparing heterozygous with homozygous mutant mice. With their approach, no wild-type control could be used, as these would lack reporter expression. However, this experiment does NOT allow conclusions about the degree of recombination, either overall (i.e. in all astrocytes regardless of TdT expression) or within TdT+ astrocytes, because it does not quantify recombination. To quantify the degree of recombination, the paper would have had to perform genomic PCR measurements.

      The problem with the data on the degree of recombination in the Stogsdill et al. (2017) paper, as we understand them, is two-fold.

      First, the GLAST-Cre line only targets ~40-70% of astrocytes, at least as evidenced by highly sensitive Cre-reporter mice in a variety of studies using this Cre line. The 40-70% variation is likely due to differences in the reporter mice and the tamoxifen injection schedule used. In comparison, we are targeting most astrocytes using the Aldh1l1-CreERT2 mice. Moreover, GLAST-Cre mice exhibit neuronal off-targeting, which could account for at least some of the remaining Nlgn2 qRT-PCR signal in the FACS-sorted cells. As we describe next, this signal also likely comes from astrocytes where recombination was incomplete. This is the reason why we, like everyone else, are now using the Aldh1l1-Cre line that has been shown to be more efficient both in terms of the overall targeting of astrocytes (i.e. nearly complete) and the level of recombination observed in reporter(+) astrocytes.

      Second, Stogsdill et al. detected a significant decrease in the Nlgn2 qRT-PCR signal in the FACS-sorted homozygous Nlgn2 KO cells compared to the heterozygous Nlgn2 KO cells, but the Nlgn2 qRT-PCR signal was still quite large. The data are presented as normalized to the HET condition. As a result, we don’t know the true level of gene deletion (i.e. compared to TdT(-) astrocytes). For example, based on the Stogsdill et al. data the HET manipulation could have induced only a 20% reduction in Nlgn2 mRNA levels in TdT(+) astrocytes, in which case the KO would have produced a 40% reduction in Nlgn2 mRNA in TdT(+) astrocytes. Moreover, it is possible, based on our own experience with the GLAST-Cre line, that the reporter may also not turn on in some astrocytes where other alleles have been independently recombined, just as some astrocytes that are TdT(+) would still be wild-type or heterozygous for Nlgn2. Thus, it is impossible to calculate the actual percentage of recombination from these data, even in TdT(+) cells, absent PCR of genomic DNA from isolated cells. Alternatively, comparison of mRNA levels using primers sensitive to floxed sequences in wild-type controls versus cKO mice would have also yielded a much better idea of the recombination efficiency.

      In summary, it is unclear whether the Nlgn2 deletion in the Stogsdill et al. paper was substantial or marginal – it is simply impossible to tell.
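
      The arithmetic behind this argument can be made explicit with a small sketch. The function name is hypothetical, and the 0.75 ratio is simply the value implied by the 20%/40% example above (0.60 / 0.80), not a number taken from Stogsdill et al.

```python
def ko_reduction_vs_wildtype(ko_over_het, het_over_wildtype):
    """Fractional loss of mRNA in KO cells relative to wild-type.

    ko_over_het:        KO signal normalized to HET (what a HET-normalized plot reports).
    het_over_wildtype:  HET signal relative to wild-type (not measured without a wild-type control).
    """
    return 1.0 - ko_over_het * het_over_wildtype

# Same HET-normalized value, two different assumed HET levels:
mild = ko_reduction_vs_wildtype(0.75, 0.80)    # HET lost 20% of wild-type signal
strong = ko_reduction_vs_wildtype(0.75, 0.50)  # HET lost 50% of wild-type signal
```

      With the same HET-normalized ratio, the KO reduction relative to wild-type is 40% in the first case and 62.5% in the second, which is why a wild-type reference or genomic PCR is needed to pin down the true degree of deletion.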

      Reviewer #3 (Public Review):

      This study investigates the roles of astrocytes in the regulation of synapse development and astrocyte morphology using conditional KO mice carrying mutations of three neuroligins1-3 in astrocytes with the deletion starting at two different time points (P1 and P10/11). The authors use morphological, electrophysiological, and cell-biological approaches and find that there are no differences in synapse formation and astrocyte cytoarchitecture in the mutant hippocampus and visual cortex. These results differ from the previous results (Stogsdill et al., 2017), although the authors make several discussion points on how the differences could have been induced. This study provides important information on how astrocytes and neurons interact with each other to coordinate neural development and function. The experiments were well-designed, and the data are of high quality.

      We also thank this reviewer for helpful comments!

      Recommendations for the authors:

      This project was meant to rigorously test the intriguing overall question whether neuroligins, which are abundantly expressed in astrocytes, regulate synapse formation as astrocytic synapse organizers. The goal of the paper was NOT to confirm or dispute the conclusion by Stogsdill et al. (Nature 2017) that Nlgn2 expressed in astrocytes is essential for excitatory synapse formation and that astrocytic Nlgn1-3 are required for proper astrocyte morphogenesis. Instead, the project was meant to address the much broader question whether the abundant expression of any neuroligin, not just Nlgn2, in astrocytes is essential for neuronal excitatory or inhibitory synapse formation and/or for the astrocyte cytoarchitecture. We felt that this was an important question independent of the Stogsdill et al. paper. We analyzed in our experiments young adult mice, a timepoint that was chosen deliberately to avoid the possibility of observing a possible developmental delay rather than a fundamental function that extends beyond development.

      We do recognize that the conclusion by Stogsdill et al. (2017) that Nlgn2 expression in astrocytes is essential for excitatory synapse formation was very exciting to the field but contradicted a large literature demonstrating that Nlgn2 protein is exclusively localized to inhibitory synapses and absent from excitatory synapses (to name just a few papers, see Graf et al., Cell 2004; Varoqueaux et al., Eur. J. Cell Biol. 2004; Patrizi et al., PNAS 2008;  Hoon et al., J. Neurosci. 2009). In addition, the conclusion of Stogsdill et al. that astrocytic Nlgn2 specifically drove excitatory synapse formation was at odds with previous findings documenting that the constitutive deletion of Nlgn2 in all cells, including astrocytes, has no effect on excitatory synapse numbers (again, to name a few papers, see Varoqueaux et al., Neuron 2006; Blundell et al., Genes Brain Behav. 2008; Poulopoulos et al., Neuron 2009; Gibson et al., J. Neurosci. 2009). These contradictions conferred further urgency to our project, but please note that this project was primarily driven by our curiosity about the function of astrocytic neuroligins, not by a fruitless desire to test the validity of one particular Nature paper.

      The general goal of our paper notwithstanding, few papers from our lab have received as much attention and as many negative comments on social media as this paper when it was published as a preprint. Because we take these criticisms seriously, we have over the last year performed extensive additional experiments to ensure that our findings are well founded. We feel that, on balance, our data are incompatible with the notion that astrocytic neuroligins play a fundamental role in excitatory synapse formation but are consistent with other prior findings obtained with neuroligin KO mice. In the new data we added to the paper, we not only characterized the Cre-mediated deletion of neuroligins in depth, but also employed an independent second system (human neurons cultured on mouse glia) to further validate our conclusions as described below. Although we believe that our results are incompatible with the notion that astrocytic neuroligins fundamentally regulate excitatory or inhibitory synapse formation, we also conclude with regret that we still don’t know what astrocytic neuroligins actually do. Thus, the function of astrocytic neuroligins, as there surely must be one, remains a mystery.

      Finally, there are many possible explanations for the discrepancies between our conclusions and those of Stogsdill et al. as described in our paper. Most of these explanations are technical and may explain why not only our, but also the results of many other previous studies from multiple labs, are inconsistent with the conclusions by Stogsdill et al. (2017), as discussed in detail in the revised paper.

      Reviewer #1 (Recommendations For The Authors):

      The paper is very clear and well written. I have only one comment and that is to increase the sizes of Figs 2, 4 and 6 so that the imaging panels can be seen more clearly. Also, although I know the n numbers are provided in the figure legends, the authors may help the reader by providing them in the results when key data and findings are reported.

      We agree and have followed the reviewer’s suggestions as best as we could.

      Reviewer #2 (Recommendations For The Authors):

      (1) Given the strength and importance of the claims that the authors make, I would highly recommend adding some quantitative evidence regarding the efficacy of deletion in astrocytes, e.g. using the same strategy as in Stogsdill et al. As unlikely as it may be that Neuroligin deletion is in fact incomplete, this possibility cannot be excluded unless directly measured. To avoid future discussions on this subject, it seems that the onus is on the authors to provide this information.

      We concur that this is an important point and have devoted a year-long effort to address it. Note, however, that the strategy employed by Stogsdill et al. does not actually allow conclusions about their recombination efficiency. As described above, it only allows the conclusion that some recombination took place. The Stogsdill et al. Nature paper (2017) is a bit confusing on this point. This approach is thus not appropriate to address the question raised by the reviewer.

      We have performed two experiments to address the issue raised by the reviewer.

      First, we used a viral (i.e. AAV2/5) approach to express Rpl22 with a triple HA-tag, also known as Ribotag, which allows us to purify ribosome-bound mRNA from targeted cells for downstream gene expression analysis. The novel construct is driven by the GfaABC1D promoter and includes two additional features which make it particularly useful. First, upstream of Ribotag is a membrane-targeted Lck-mVenus followed by a self-cleaving P2A sequence. This allows easy visualization of targeted astrocytes. Second, we have incorporated a cassette of four copies of six miRNA targeting sequences (4x6T) for miR-124 as was recently published (Gleichman et al., 2023) to eliminate off-target expression in neurons. Based on qPCR analysis, the updated construct allowed >95% de-enrichment of neuronal mRNA and slightly improved observed recombination rates (~10% per gene) relative to an earlier version without 4x6T. Mice that were injected with tamoxifen at P1, similar to other experiments in the paper, were then stereotactically injected at ~P35-40 within the dorsal hippocampus with AAV2/5-GfaABC1D-Lck-mVenus-P2A-Rpl22-HA-4x6T. Approximately 3 weeks later, acute slices were prepared, visualized for fluorescence, and both CA1 and nearby cortex that was partially targeted were isolated for downstream ribosome affinity purification with HA antibodies. Total RNA was saved as input. qPCR was performed using assays that are sensitive to the exons that are floxed in the Nlgn123 cKO mice, so that our quantifications are not confounded by potential differences in nonsense-mediated decay. Our control data reveal a striking enrichment of an astrocyte marker gene (e.g. aquaporin-4) and de-enrichment of genes for other cell types. In the CA1, we observed robust loss of Nlgn3 (~96%), Nlgn2 (~86%), and Nlgn1 (65%) gene expression. In the cortex, we observed a similarly robust loss of Nlgn3 (93%), Nlgn2 (83%), and Nlgn1 (72%) expression.
Given that our targeting of astrocytes based on Ai14 Cre-reporter mice was ~90-99%, these reductions are striking and definitive. The existence of some residual transcript reflects the presence of a small population of astrocytes heterozygous for Nlgn2 and Nlgn3. In contrast, Nlgn1 appears more difficult to recombine and it is likely that some astrocytes are either heterozygous or homozygous knockout cells. Although it is thus possible that Nlgn1 could provide some compensation in our experiments, it is worth noting that Stogsdill et al. found that only Nlgn2 and Nlgn3 knockdown with shRNAs resulted in impaired astrocyte morphology by P21. Moreover, they found that Nlgn2 cKO in astrocytes with PALE of a Cre-containing pDNA impaired astrocyte morphology in a gene-dosage dependent manner and suppressed excitatory synapse formation at P21. Thus, our inability to delete all of Nlgn1 doesn’t readily explain contradictions between our findings and theirs.

      Second, in an independent approach we have cultured glia from mouse quadruple conditional Nlgn1234 KO mice and infected the glia with lentiviruses expressing inactive (DCre, control) or active Cre-recombinase. We confirmed complete recombination by PCR. We then cultured human neurons forming excitatory synapses on the glia expressing or lacking neuroligins and measured the frequency and amplitude of mEPSCs as a proxy for synapse numbers and synaptic function. As shown in the new Figure 9, we detected no significant changes in mEPSCs, demonstrating in this independent system that the glial neuroligins do not detectably influence excitatory synapse formation.

      (2) Along the same lines, the authors should be careful not to overstate their findings in this direction. For example, the figure caption for Figure 2 reads 'Nlgn1-3 are efficiently and selectively deleted in astrocytes by crossing triple Nlgn1-3 conditional KO mice with Aldh1l1-CreERT2 driver mice and inducing Cre-activity with tamoxifen early during postnatal development'. This is not technically correct and should be modified to reflect that the authors are not in fact assessing deletion of Nlgn1-3, but only expression of a tdTomato reporter.

      We agree – this is essentially the same criticism as comment #1.

      (3) In general, the animal numbers used for the experiments are rather low. With an n = 4 for most experiments, only large abnormalities would be detected anyway, while smaller alterations would not reach statistical significance due to the inherent biological and technical variance. For the most part, this is not a concern, since there really is no difference between WTs and Nlgn1-3 cKOs. However, trends are observed in some cases, and it is conceivable that these would become significant changes with larger n's, e.g. Figure 3H (Vglut2); Figure 4E (VGlut2 S.P., D.G.); Figure 6D (Vglut2). Increasing the numbers to n = 6 here would greatly strengthen the claims that no differences are observed.

      We concur that small differences would not have been detected in our experiments but feel that given the very large phenotypes of the neuroligin deletions in neurons and of the phenotypes reported by Stogsdill et al. (2017), which also did not employ a large number of animals, a very small phenotype in astrocytes would not have been very informative.
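      The sample-size point in this exchange can be made concrete with a small Monte Carlo power estimate. The following is only a sketch under stated assumptions, not an analysis from the paper: it assumes normally distributed measurements with unit variance, a pooled two-sample t-test at a two-sided 5% level, and an illustrative effect size expressed in standard deviations. The function name `power` and all parameter values are hypothetical.

```python
import math
import random
import statistics

# Two-sided 5% critical values of Student's t for df = 2n - 2
# (n = 4 per group -> df = 6; n = 6 per group -> df = 10).
T_CRIT = {4: 2.447, 6: 2.228}


def power(n, effect_sd, n_sim=5000, seed=0):
    """Estimate the power of a pooled two-sample t-test by simulation.

    n         -- animals per group (must be a key of T_CRIT)
    effect_sd -- true group difference in units of the within-group SD
    """
    rng = random.Random(seed)
    crit = T_CRIT[n]
    hits = 0
    for _ in range(n_sim):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(effect_sd, 1.0) for _ in range(n)]
        # Equal group sizes, so the pooled variance is the plain average.
        pooled_var = (statistics.variance(a) + statistics.variance(b)) / 2
        t = (statistics.mean(b) - statistics.mean(a)) / math.sqrt(pooled_var * 2 / n)
        hits += abs(t) > crit
    return hits / n_sim


if __name__ == "__main__":
    for n in (4, 6):
        print(n, power(n, effect_sd=1.5))
```

      Under these assumptions, the estimate for n = 6 comes out higher than for n = 4 at the same effect size, which is the trade-off the reviewer and the authors are weighing.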

      Minor points:

      (1) Please state the exact genetic background for the mouse lines used.

      Our lab generally uses hybrid CD1/Bl6 mice to avoid artifacts produced by inbred genetic mutations in so-called ‘pure’ lines, especially Bl6 mice. This standard protocol was followed in the present study. Thus, the mice are on a mixed CD1/Bl6 hybrid background.

      Reviewer #3 (Recommendations For The Authors):

      (1) Figure 4 demonstrates that neuroligin 1-3 deletions restricted to astrocytes do not affect the number of excitatory and inhibitory synapses in layer IV of the primary visual cortex. This conclusion could be further strengthened if the authors could provide electrophysiological evidence such as mE/IPSCs.

      We agree but have chosen a different avenue to further test our conclusions because slice electrophysiological experiments are time-consuming, labor intensive, and difficult to quantitate, especially in cortex.

      Specifically, we have co-cultured human neurons with astrocytes that either contain or lack neuroligins (new Fig. 9). With this experimental design, we have total control over ALL neuroligins in astrocytes. Electrophysiological recordings then demonstrated that the complete deletion of all glial neuroligins has no effect on mEPSC frequencies and amplitudes. Although clearly much more needs to be done, the new results confirm in an independent system that glial neuroligins have no effect on synapse formation in the neurons, even though neurons depend on astrocytes for synaptogenic factors as Ben Barres brilliantly showed a decade ago. However, it is important to note that dissociated glia in culture, while synaptogenic, are reactive and may not faithfully recapitulate all roles of astrocytes in synaptogenesis.

      (2) It would help readers if the images showing the punctate double marker stainings of excitatory/inhibitory synapses are presented in merged colors (i.e., yellow colors for red and green puncta colors).

      We have tried to improve the visualization of the rather voluminous studies we performed and illustrate in the figures as best as we could.

      (3) The resolutions of the images in the figures are not good, although I guess it is because the images are for review processes.

      We apologize and would like to assure the reviewer that we are supplying high-resolution images to the journal.

      (4) Typos in lines 82 and 274.

      We have corrected these errors.

    1. eLife Assessment

      This important work combines theory and experiment to demonstrate convincingly how humans make decisions about sequences of pairs of correlated observations. The proposed model for evidence integration in correlated environments will be of use for the study of decision-making.

    2. Reviewer #1 (Public review):

      Summary:

      The behavioral strategies underlying decisions based on perceptual evidence are often studied in the lab with stimuli whose elements provide independent pieces of decision-related evidence that can thus be equally weighted to form a decision. In more natural scenarios, in contrast, the information provided by these pieces is often correlated, which impacts how they should be weighted. Tardiff, Kang & Gold set out to study decisions based on correlated evidence and compare observed behavior of human decision makers to normative decision strategies. To do so, they presented participants with visual sequences of pairs of localized cues whose location was either uncorrelated, or positively or negatively correlated, and whose mean location across a sequence determined the correct choice. Importantly, they adjusted this mean location such that, when correctly weighted, each pair of cues was equally informative, irrespective of how correlated it was. Thus, if participants follow the normative decision strategy, their choices and reaction times should not be impacted by these correlations. While Tardiff and colleagues found no impact of correlations on choices, they did find them to impact reaction times, suggesting that participants deviated from the normative decision strategy. To assess the degree of this deviation, Tardiff et al. adjusted drift diffusion models (DDMs) for decision-making to process correlated decision evidence. Fits of these models, and a comparison of different model variants, revealed that participants considered correlations when weighing evidence, but did so with a slight underestimation of the magnitude of this correlation. This finding made Tardiff et al. conclude that participants followed a close-to-normative decision strategy that adequately took into account correlated evidence.

      Strengths:

      The authors adjust a previously used experimental design to include correlated evidence in a simple, yet powerful way. The way it does so is easy to understand and intuitive, such that participants don't need extensive training to perform the task. Limited training makes it more likely that the observed behavior is natural and reflective of every-day decision-making. Furthermore, the design allowed the authors to make the amount of decision-related evidence equal across different correlation magnitudes, which makes it easy to assess whether participants correctly take account of these correlations when weighing evidence: if they do, their behavior should not be impacted by the correlation magnitude.
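      The idea of equating the evidence across correlation magnitudes can be illustrated with a short simulation. This is a minimal sketch under assumptions that are not taken from the paper: each pair is drawn from a bivariate Gaussian with common standard deviation `sigma`, correlation `rho`, and mean `mu` for both elements, in which case the log-likelihood ratio of a pair is 2*mu*(x1 + x2) / (sigma^2 * (1 + rho)). Under that assumption, scaling the mean as `mu0 * sqrt(1 + rho)` keeps the expected log-likelihood ratio per pair fixed at 4*mu0^2 / sigma^2. The names `log_lr`, `mean_log_lr`, `mu0`, and all numeric values are hypothetical.

```python
import math
import random


def log_lr(x1, x2, mu, sigma, rho):
    """Log-likelihood ratio of one pair for means (+mu, +mu) vs (-mu, -mu)."""
    return 2.0 * mu * (x1 + x2) / (sigma ** 2 * (1.0 + rho))


def mean_log_lr(rho, mu0=0.5, sigma=1.0, n_pairs=20000, seed=1):
    """Average per-pair log-likelihood ratio when the mean is rescaled by rho."""
    rng = random.Random(seed)
    mu = mu0 * math.sqrt(1.0 + rho)  # assumed rescaling of the generative mean
    total = 0.0
    for _ in range(n_pairs):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Build a correlated pair from two independent standard normals.
        x1 = mu + sigma * z1
        x2 = mu + sigma * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        total += log_lr(x1, x2, mu, sigma, rho)
    return total / n_pairs


if __name__ == "__main__":
    for rho in (-0.6, 0.0, 0.6):
        print(rho, round(mean_log_lr(rho), 2))
```

      If an observer applies the correct weight, the average evidence per pair should then be roughly the same for negative, zero, and positive correlations, which is why correlation-dependent behavior points to a deviation from the normative strategy.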

      The relative simplicity with which correlated evidence is introduced also allowed the authors to fall back to the well-established DDM for perceptual decisions, that has few parameters, is known to implement the normative decision strategy in certain circumstances, and enjoys a great deal of empirical support. The authors show how correlations ought to impact these parameters, and which changes in parameters one would expect to see if participants mis-estimate these correlations or ignore them altogether (i.e., estimate correlations to be zero). This allowed them to assess the degree to which participants took into account correlations on the full continuum from perfect evidence weighting to complete ignorance. More specifically, the authors showed that a consistent mis-estimation of the correlation magnitude would not impact the fraction of correct choices (as they observe), but only the reaction times. With this, they could show that participants in fact performed rational evidence weighting if one assumed that they slightly underestimated the correlation magnitude.

      Weaknesses:

      While the authors convincingly demonstrate that the observed decision-making behavior seems to stem from a slight underestimation of the correlation magnitudes, their experimental paradigm did not allow them to determine the origin of this bias. Through additional analyses they rule out various possibilities, like the impact of a Bayesian prior on estimated correlations. Nonetheless, the authors provide no normative explanation of the observed bias.

      A further minor weakness is that the authors only focus on a single normative aspect of the observed behavior, namely on whether participants optimally accumulate decision-related evidence across time. Another question is whether participants tune their decision boundaries to maximize reward rates or some other overall performance measures. While the authors discuss that the chosen diffusion models (DDMs) have the potential of also implementing normative decisions in the latter sense, the authors' analysis does not address this question in the context of their task.

    3. Reviewer #2 (Public review):

      This study by Tardiff, Kang & Gold seeks to i) develop a normative account of how observers should adapt their decision-making across environments with different levels of correlation between successive pairs of observations, and ii) assess whether human decisions in such environments are consistent with this normative model. The authors first demonstrate that, in the range of environments under consideration here, an observer with full knowledge of the generative statistics should take both the magnitude and sign of the underlying correlation into account when assigning weight in their decisions to new observations: stronger negative correlations should translate into stronger weighting (due to the greater information furnished by an anticorrelated generative source), while stronger positive correlations should translate into weaker weighting (due to the greater redundancy of information provided by a positively correlated generative source). The authors then report an empirical study in which human participants performed a perceptual decision-making task requiring accumulation of information provided by pairs of perceptual samples, under different levels of pairwise correlation. They describe a nuanced pattern of results with effects of correlation being largely restricted to response times and not choice accuracy, which could be captured through fits of their normative model (in this implementation, an extension of the well-known drift diffusion model) to the participants' behaviour while allowing for mis-estimation of the underlying correlations. An intriguing result is that the observed pattern of behavioural effects is best explained by a model in which observers marginally underestimated the level of correlation between the generative sources, and that this bias affects behaviour through effects on stimulus encoding that then shape how the evidence furnished by each stimulus sample is weighted in decision formation.
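      The direction of the normative weighting described above can be seen in a short worked derivation. It is a sketch under an assumed generative model rather than a restatement of the authors' equations: each pair is taken to be bivariate Gaussian with common variance, correlation \rho, and means of equal magnitude and opposite sign under the two hypotheses. The symbols \mu, \sigma, and m are introduced here for illustration.

```latex
% Assumption: each pair x = (x_1, x_2) is bivariate Gaussian with common
% variance \sigma^2, correlation \rho, and mean (+\mu, +\mu) under one
% hypothesis and (-\mu, -\mu) under the other.
\begin{align}
\Sigma &= \sigma^2 \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix},
\qquad
\Sigma^{-1} = \frac{1}{\sigma^2 (1 - \rho^2)}
  \begin{pmatrix} 1 & -\rho \\ -\rho & 1 \end{pmatrix}, \\
\log \mathrm{LR}(x)
  &= 2\, m^{\top} \Sigma^{-1} x, \qquad m = (\mu, \mu)^{\top}, \\
  &= \frac{2\mu}{\sigma^2 (1 + \rho)} \, (x_1 + x_2).
\end{align}
```

      Under these assumptions the weight on the summed observations scales as 1/(1 + \rho), so it grows for negative correlations and shrinks for positive ones, in line with the qualitative pattern summarized in the review.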

      As the authors point out in their very well-written paper, appropriate weighting of information gathered in correlated environments has important consequences for real-world decision-making. Yet, while this function has been well studied for 'high-level' (e.g. economic) decisions, how we account for correlations when making simple perceptual decisions on well-controlled behavioural tasks has not been investigated. As such, this study addresses an important and timely question that will be of broad interest to psychologists and neuroscientists. The computational approach to arrive at normative principles for evidence weighting across environments with different levels of correlation is elegant, makes strong connections with prior work in different decision-making contexts, and should serve as a valuable reference point for future studies in this domain. The empirical study is well designed and executed, and the modelling approach applied to these data showcases an impressively deep understanding of relationships between different parameters of the drift diffusion model and its novel application to this setting. Another strength of the study is that it is preregistered.

      In my view, any major weaknesses of the study have been well addressed by the authors during review. An outstanding question that arises from the current work and remains unanswered here is around the (normative?) origin of the correlation underestimates, and the present work lays a strong foundation from which to pursue this question in the future.

    4. Author response:

      The following is the authors’ response to the original reviews

      We thank the reviewers for their thoughtful feedback. We have made substantial revisions to the manuscript to address each of their comments, as we detail below. We want to highlight one major change in particular that addresses a concern raised by both reviewers: the role of the drift rate in our models. Motivated by their astute comments, we went back through our models and realized that we had made a particular assumption that deserved more scrutiny. We previously assumed that the process of encoding the observations made correct use of the objective, generative correlation, but then the process of calculating the weight of evidence used a mis-scaled, subjective version of the correlation. These assumptions led us to scale the drift rate in the model by a term that quantified how the standard deviation of the observation distribution was affected by the objective correlation (encoding), but to scale the bound height by the subjective estimate of the correlation (evidence weighing). However, we realized that encoding may also depend on the subjective correlation experienced by the participant. We have now tested several alternative models and found that the best-fitting model assumes that a single, subjective estimate of the correlation governs both encoding and evidence weighing. An important consequence of updating our models in this way is that we can now account for the behavioral data without needing the additional correlation-dependent drift terms (which, as reviewer #2 pointed out, were difficult to explain).

      We also note that we changed the title slightly, replacing “weighting” with “weighing” for consistency with our usage throughout the manuscript.

      Please see below for more details about this important point and our responses to the reviewers’ specific concerns. 

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The behavioral strategies underlying decisions based on perceptual evidence are often studied in the lab with stimuli whose elements provide independent pieces of decision-related evidence that can thus be equally weighted to form a decision. In more natural scenarios, in contrast, the information provided by these pieces is often correlated, which impacts how they should be weighted. Tardiff, Kang & Gold set out to study decisions based on correlated evidence and compare the observed behavior of human decision-makers to normative decision strategies. To do so, they presented participants with visual sequences of pairs of localized cues whose location was either uncorrelated, or positively or negatively correlated, and whose mean location across a sequence determined the correct choice. Importantly, they adjusted this mean location such that, when correctly weighted, each pair of cues was equally informative, irrespective of how correlated it was. Thus, if participants follow the normative decision strategy, their choices and reaction times should not be impacted by these correlations. While Tardiff and colleagues found no impact of correlations on choices, they did find them to impact reaction times, suggesting that participants deviated from the normative decision strategy. To assess the degree of this deviation, Tardiff et al. adjusted drift-diffusion models (DDMs) for decision-making to process correlated decision evidence. Fitting these models to the behavior of individual participants revealed that participants considered correlations when weighing evidence, but did so with a slight underestimation of the magnitude of this correlation. This finding made Tardiff et al. conclude that participants followed a close-to-normative decision strategy that adequately took into account correlated evidence.

      Strengths:

      The authors adjust a previously used experimental design to include correlated evidence in a simple, yet powerful way. The way it does so is easy to understand and intuitive, such that participants don't need extensive training to perform the task. Limited training makes it more likely that the observed behavior is natural and reflective of everyday decision-making. Furthermore, the design allowed the authors to make the amount of decision-related evidence equal across different correlation magnitudes, which makes it easy to assess whether participants correctly take account of these correlations when weighing evidence: if they do, their behavior should not be impacted by the correlation magnitude.

      The relative simplicity with which correlated evidence is introduced also allowed the authors to fall back to the well-established DDM for perceptual decisions, which has few parameters, is known to implement the normative decision strategy in certain circumstances, and enjoys a great deal of empirical support. The authors show how correlations ought to impact these parameters, and which changes in parameters one would expect to see if participants misestimate these correlations or ignore them altogether (i.e., estimate correlations to be zero). This allowed them to assess the degree to which participants took into account correlations on the full continuum from perfect evidence weighting to complete ignorance. With this, they could show that participants in fact performed rational evidence weighting if one assumed that they slightly underestimated the correlation magnitude.

      Weaknesses:

      The experiment varies the correlation magnitude across trials such that participants need to estimate this magnitude within individual trials. This has several consequences:

      (1) Given that correlation magnitudes are estimated from limited data, the (subjective) estimates might be biased towards their average. This implies that, while the amount of evidence provided by each 'sample' is objectively independent of the correlation magnitude, it might subjectively depend on the correlation magnitude. As a result, the normative strategy might differ across correlation magnitudes, unlike what is suggested in the paper. In fact, it might be the case that the observed correlation magnitude underestimates correspond to the normative strategy.

      We thank the reviewer for raising this interesting point, which we now address directly with new analyses including model fits (pp. 15–24). These analyses show that the participants were computing correlation-dependent weights of evidence from observation distributions that reflected suboptimal misestimates of correlation magnitudes. This strategy is normative in the sense that it is the best that they can do, given the encoding suboptimality. However, as we note in the manuscript, we do not know the source of the encoding suboptimality (pp. 23–24). We thus do not know if there might be a strategy they could have used to make the encoding more optimal.

      (2) The authors link the normative decision strategy to putting a bound on the log-likelihood ratio (logLR), as implemented by the two decision boundaries in DDMs. However, as the authors also highlight in their discussion, the 'particle location' in DDMs ceases to correspond to the logLR as soon as the strength of evidence varies across trials and isn't known by the decision maker before the start of each trial. In fact, in the used experiment, the strength of evidence is modulated in two ways:

      (i) by the (uncorrected) distance of the cue location mean from the decision boundary (what the authors call the evidence strength) and

      (ii) by the correlation magnitude. Both vary pseudo-randomly across trials, and are unknown to the decision-maker at the start of each trial. As previous work has shown (e.g. Kiani & Shadlen (2009), Drugowitsch et al. (2012)), the normative strategy then requires averaging over different evidence strength magnitudes while forming one's belief. This averaging causes the 'particle location' to deviate from the logLR. This deviation makes it unclear if the DDM used in the paper indeed implements the normative strategy, or is even a good approximation to it.

      We appreciate this subtle, but important, point. We now clarify that the DDM we use includes degrees of freedom that are consistent with normative decision processes that rely on the imperfect knowledge that participants have about the generative process on each trial, specifically: 1) a single drift-rate parameter that is fit to data across different values of the mean of the generative distribution, which is based on the standard assumption for these kinds of task conditions in which stimulus strength is varied randomly from trial-to-trial and thus prevents the use of exact logLR (which would require stimulus strength-specific scale factors; Gold and Shadlen, 2001); 2) the use of a collapsing bound, which in certain cases (including our task) is thought to support a stimulus strength-dependent calibration of the decision variable to optimize decisions (Drugowitsch et al, 2012); and 3) free parameters (one per correlation) to account for subjective estimates of the correlation, which affected the encoding of the observations that are otherwise weighed in a normative manner in the best-fitting model.

      Also, to clarify our terminology, we define the objective evidence strength as the expected logLR in a given condition, which for our task is dependent on both the distance of the mean from the decision boundary and the correlation (p. 7). 

      Given that participants observe 5 evidence samples per second and on average require multiple seconds to form their decisions, it might be that they are able to form a fairly precise estimate of the correlation magnitude within individual trials. However, whether this is indeed the case is not clear from the paper.

      These points are now addressed directly in Results (pp. 23–24) and Figure 7 supplemental figures 1–3. Specifically, we show that, as the reviewer correctly surmised above, empirical correlations computed on each trial tended to be biased towards zero (Fig 7–figure supplement 1). However, two other analyses were not consistent with the idea that participants’ decisions were based on trial-by-trial estimates of the empirical correlations: 1) those with the shortest RTs did not have the most-biased estimates (Fig 7–figure supplement 2), and 2) there was no systematic relationship between objective and subjective fit correlations across participants (Fig 7–figure supplement 3).
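      The tendency of per-trial empirical correlations to be biased towards zero is a general small-sample property of the Pearson correlation, and a brief simulation makes it visible. This is a hedged sketch, not the authors' analysis: it assumes Gaussian pairs, and the choice of 10 pairs per trial (roughly two seconds at 5 samples per second) is an illustrative assumption, as are the function names `sample_r` and `mean_sample_r`.

```python
import math
import random


def sample_r(rho, n_pairs, rng):
    """Pearson correlation of n_pairs Gaussian pairs with true correlation rho."""
    xs, ys = [], []
    for _ in range(n_pairs):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(z1)
        ys.append(rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    mx, my = sum(xs) / n_pairs, sum(ys) / n_pairs
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mean_sample_r(rho, n_pairs=10, n_trials=5000, seed=2):
    """Average the per-trial sample correlation over many simulated trials."""
    rng = random.Random(seed)
    return sum(sample_r(rho, n_pairs, rng) for _ in range(n_trials)) / n_trials


if __name__ == "__main__":
    for rho in (-0.8, 0.8):
        print(rho, round(mean_sample_r(rho), 3))
```

      In this setting the average per-trial estimate sits slightly closer to zero than the generative value, for both signs of the correlation.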

      Furthermore, the authors capture any underestimation of the correlation magnitude by an adjustment to the DDM bound parameter. They justify this adjustment by asking how this bound parameter needs to be set to achieve correlation-independent psychometric curves (as observed in their experiments) even if participants use a 'wrong' correlation magnitude to process the provided evidence. Curiously, however, the drift rate, which is the second critical DDM parameter, is not adjusted in the same way. If participants use the 'wrong' correlation magnitude, then wouldn't this lead to a mis-weighting of the evidence that would also impact the drift rate? The current model does not account for this, such that the provided estimates of the mis-estimated correlation magnitudes might be biased.

      We appreciate this valuable comment, and we agree that we previously neglected the potential impact of correlation misestimates on evidence strength. As we now clarify, the correlation enters these models in two ways: 1) via its effect on how the observations are encoded, which involves scaling both the drift and the bound; and 2) via its effect on evidence weighing, which involves scaling only the bound (pp. 15–18). We previously assumed that only the second form of scaling might involve a subjective (mis-)estimate of the correlation. We now examine several models that also include the possibility of either or both forms using subjective correlation estimates. We show that a model that assumes that the same subjective estimate drives both encoding and weighing (the “full-rho-hat” model) best accounts for the data. This model provides better fits (after accounting for differences in numbers of parameters) than models with: 1) no correlation-dependent adjustments (“base” model), 2) separate drift parameters for each correlation condition (“drift” model), 3) optimal (correlation-dependent) encoding but suboptimal weighing (“bound-rho-hat” model, which was our previous formulation), 4) suboptimal encoding and weighing (“scaled-rho-hat” model), and 5) optimal encoding but suboptimal weighing and separate correlation-dependent adjustments to the drift rate (“bound-rho-hat plus drift” model). We have substantially revised Figures 5–7 and the associated text to address these points.

      Lastly, the paper makes it hard to assess how much better the participants' choices would be if they used the correct correlation magnitudes rather than underestimates thereof. This is important to know, as it only makes sense to strictly follow the normative strategy if it comes with a significant performance gain.

      We now include new analyses in Fig. 7 that demonstrate how much participants' choices and RT deviate from: 1) an ideal observer using the objective correlations, and 2) an observer who failed to adjust for the fit subjective correlation when weighing the evidence (i.e., using the subjective correlation for encoding but a correlation of zero for weighing). We now indicate that participants’ performance was quite close to that predicted by the ideal observer (using the true, objective correlation) for many conditions. Thus, we agree that they might not have had the impetus to optimize the decision process further, assuming it were possible under these task conditions.

      Reviewer #2 (Public review):

      Summary:

      This study by Tardiff, Kang & Gold seeks to: i) develop a normative account of how observers should adapt their decision-making across environments with different levels of correlation between successive pairs of observations, and ii) assess whether human decisions in such environments are consistent with this normative model.

      The authors first demonstrate that, in the range of environments under consideration here, an observer with full knowledge of the generative statistics should take both the magnitude and sign of the underlying correlation into account when assigning weight in their decisions to new observations: stronger negative correlations should translate into stronger weighting (due to the greater information furnished by an anticorrelated generative source), while stronger positive correlations should translate into weaker weighting (due to the greater redundancy of information provided by a positively correlated generative source). The authors then report an empirical study in which human participants performed a perceptual decision-making task requiring accumulation of information provided by pairs of perceptual samples, under different levels of pairwise correlation. They describe a nuanced pattern of results with effects of correlation being largely restricted to response times and not choice accuracy, which could partly be captured through fits of their normative model (in this implementation, an extension of the well-known drift-diffusion model) to the participants' behaviour while allowing for misestimation of the underlying correlations.

      Strengths:

      As the authors point out in their very well-written paper, appropriate weighting of information gathered in correlated environments has important consequences for real-world decision-making. Yet, while this function has been well studied for 'high-level' (e.g. economic) decisions, how we account for correlations when making simple perceptual decisions on well-controlled behavioural tasks has not been investigated. As such, this study addresses an important and timely question that will be of broad interest to psychologists and neuroscientists. The computational approach to arrive at normative principles for evidence weighting across environments with different levels of correlation is very elegant, makes strong connections with prior work in different decision-making contexts, and should serve as a valuable reference point for future studies in this domain. The empirical study is well designed and executed, and the modelling approach applied to these data showcases a deep understanding of relationships between different parameters of the drift-diffusion model and its application to this setting. Another strength of the study is that it is preregistered.

      Weaknesses:

      In my view, the major weaknesses of the study center on the narrow focus and subsequent interpretation of the modelling applied to the empirical data. I elaborate on each below:

      Modelling interpretation: the authors' preference for fitting and interpreting the observed behavioural effects primarily in terms of raising or lowering the decision bound is not well motivated and will potentially be confusing for readers, for several reasons. First, the entire study is conceived, in the Introduction and first part of the Results at least, as an investigation of appropriate adjustments of evidence weighting in the face of varying correlations. The authors do describe how changes in the scaling of the evidence in the drift-diffusion model are mathematically equivalent to changes in the decision bound - but this comes amidst a lengthy treatment of the interaction between different parameters of the model and aspects of the current task which I must admit to finding challenging to follow, and the motivation behind shifting the focus to bound adjustments remained quite opaque. 

      We appreciate this valuable feedback. We have revised the text in several places to make these important points more clearly. For example, in the Introduction we now clarify that “The weight of evidence is computed as a scaled version of each observation (the scaling can be applied to the observations or to the bound, which are mathematically equivalent; Green and Swets, 1966) to form the logLR” (p. 3). We also provide more details and intuition in the Results section for how and why we implemented the DDM the way we did. In particular, we now emphasize that the correlation enters these models in two ways: 1) via its effect on encoding the observations, which scales both the drift and the bound; and 2) via its effect on evidence weighing, which scales only the bound (pp. 15–18).

      Second, and more seriously, bound adjustments of the form modelled here do not seem to be a viable candidate for producing behavioural effects of varying correlations on this task. As the authors state toward the end of the Introduction, the decision bound is typically conceived of as being "predefined" - that is, set before a trial begins, at a level that should strike an appropriate balance between producing fast and accurate decisions. There is an abundance of evidence now that bounds can change over the course of a trial - but typically these changes are considered to be consistently applied in response to learned, predictable constraints imposed by a particular task (e.g. response deadlines, varying evidence strengths). In the present case, however, the critical consideration is that the correlation conditions were randomly interleaved across trials and were not signaled to participants in advance of each trial - and as such, what correlation the participant would encounter on an upcoming trial could not be predicted. It is unclear, then, how participants are meant to have implemented the bound adjustments prescribed by the model fits. At best, participants needed to form estimates of the correlation strength/direction (only possible by observing several pairs of samples in sequence) as each trial unfolded, and they might have dynamically adjusted their bounds (e.g. collapsing at a different rate across correlation conditions) in the process. But this is very different from the modelling approach that was taken. In general, then, I view the emphasis on bound adjustment as the candidate mechanism for producing the observed behavioural effects to be unjustified (see also next point).

      We again appreciate this valuable feedback and have made a number of revisions to try to clarify these points. In addition to addressing the equivalence of scaling the evidence and the bound in the Introduction, we have added the following section to Results (Results, p.18):

      “Note that scaling the bound in these formulations follows conventions of the DDM, as detailed above, to facilitate interpretation of the parameters. These formulations also raise an apparent contradiction: the “predefined” bound is scaled by subjective estimates of the correlation, but the correlation was randomized from trial to trial and thus could not be known in advance. However, scaling the bound in these ways is mathematically equivalent to using a fixed bound on each trial and scaling the observations to approximate logLR (see Methods). This equivalence implies that in the brain, effectively scaling a “predefined” bound could occur when assigning a weight of evidence to the observations as they are presented.”

      We also note in Methods (pp. 40–41):

      “In the DDM, this scaling of the evidence is equivalent to assuming that the decision variable accumulates momentary evidence of the form (x1 + x2) and then dividing the bound height by the appropriate scale factor. An alternative approach would be to scale both the signal and noise components of the DDM by the scale factor. However, scaling the bound is both simpler and maintains the conventional interpretation of the DDM parameters in which the bound reflects the decision-related components of the evidence accumulation process, and the drift rate represents sensory-related components.”
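
      The equivalence described in the quoted passages can be illustrated with a minimal simulation. This is a sketch under stated assumptions, not the authors' fitting code: the discrete-time accumulator, the function name `first_passage`, and the values of `k` and `B` are hypothetical and chosen only for illustration. Scaling each observation by `k` against a bound `B` should give the same choices and decision times as accumulating unscaled observations against a bound of `B / k`.

```python
import random


def first_passage(samples, scale, bound):
    """Accumulate scale * sample until |DV| reaches the bound.

    Returns (choice, n_steps); choice is 0 if no bound is reached.
    """
    dv = 0.0
    for step, x in enumerate(samples, start=1):
        dv += scale * x
        if dv >= bound:
            return 1, step
        if dv <= -bound:
            return -1, step
    return 0, len(samples)


rng = random.Random(0)
k = 0.5  # hypothetical scale factor (a power of two, so float comparisons are exact)
B = 3.0  # hypothetical bound height

# Hypothetical momentary evidence: Gaussian samples with a small positive drift.
trials = [[rng.gauss(0.1, 1.0) for _ in range(500)] for _ in range(200)]

# Option 1: scale the evidence, keep the bound fixed.
scaled_evidence = [first_passage(t, k, B) for t in trials]
# Option 2: leave the evidence unscaled, divide the bound by the scale factor.
scaled_bound = [first_passage(t, 1.0, B / k) for t in trials]

print(scaled_evidence == scaled_bound)
```

      Because the two formulations differ only by a common positive factor applied to the decision variable and the bound, every bound crossing occurs on the same step and in the same direction, which is why the scaling can be attributed to either quantity.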

      We believe we provide strong evidence that participants adjust their evidence weighing to account for the correlations (see response below), but we remain agnostic as to how exactly this weighing is implemented in the brain.

      Modelling focus: Related to the previous point, it is stated that participants' choice and RT patterns across correlation conditions were qualitatively consistent with bound adjustments (p.20), but evidence for this claim is limited. Bound adjustments imply effects on both accuracy and RTs, but the data here show either only effects on RTs, or RT effects mixed with accuracy trends that are in the opposite direction to what would be expected from bound adjustment (i.e. slower RT with a trend toward diminished accuracy in the strong negative correlation condition; Figure 3b). Allowing both drift rate and bound to vary with correlation conditions allowed the model to provide a better account of the data in the strong correlation conditions - but from what I can tell this is not consistent with the authors' preregistered hypotheses, and they rely on a posthoc explanation that is necessarily speculative and cannot presently be tested (that the diminished drift rates for higher negative correlations are due to imperfect mapping between subjective evidence strength and the experimenter-controlled adjustment to objective evidence strengths to account for effects of correlations). In my opinion, there are other candidate explanations for the observed effects that could be tested but lie outside of the relatively narrow focus of the current modelling efforts. Both explanations arise from aspects of the task, which are not mutually exclusive. The first is that an interesting aspect of this task, which contrasts with most common 'univariate' perceptual decision-making tasks, is that participants need to integrate two pieces of information at a time, which may or may not require an additional computational step (e.g. averaging of two spatial locations before adding a single quantum of evidence to the building decision variable). 
There is abundant evidence that such intermediate computations on the evidence can give rise to certain forms of bias in the way that evidence is accumulated (e.g. 'selective integration' as outlined in Usher et al., 2019, Current Directions in Psychological Science; Luyckx et al., 2020, Cerebral Cortex) which may affect RTs and/or accuracy on the current task. The second candidate explanation is that participants in the current study were only given 200 ms to process and accumulate each pair of evidence samples, which may create a processing bottleneck causing certain pairs or individual samples to be missed (and which, assuming fixed decision bounds, would presumably selectively affect RT and not accuracy). If I were to speculate, I would say that both factors could be exacerbated in the negative correlation conditions, where pairs of samples will on average be more 'conflicting' (i.e. further apart) and, speculatively, more challenging to process in the limited time available here to participants. Such possibilities could be tested through, for example, an interrogation paradigm version of the current task which would allow the impact of individual pairs of evidence samples to be more straightforwardly assessed; and by assessing the impact of varying inter-sample intervals on the behavioural effects reported presently.

      We thank the reviewer for this thoughtful and valuable feedback. We have thoroughly updated the modeling section to include new analysis and clearer descriptions and interpretations of our findings (including Figs. 5–7 and additional references to the Usher, Luyckx, and other studies that identified decision suboptimalities). The comment about “an additional computational step” in converting the observations to evidence was particularly useful, in that it made us realize that we were making what we now consider to be a faulty assumption in our version of the DDM. Specifically, we assumed that subjective misestimates of the correlation affected how observations were converted to evidence (logLR) to form the decision (implemented as a scaling of the bound height), but we neglected to consider how suboptimalities in encoding the observations could also lead to misestimates of the correlation. We have retained the previous best-fitting models in the text, for comparison (the “bound-rho-hat” and “bound-rho-hat + drift” models). In addition, we now include a “full-rho-hat” model that assumes that misestimates of rho affect both the encoding of the observations, which affects the drift rate and bound height, and the weighing of the evidence, which affects only the bound height. This was the best-fitting model for most participants (after accounting for different numbers of parameters associated with the different models we tested). Note that the full-rho-hat model predicts the lack of correlation-dependent choice effects and the substantial correlation-dependent RT effects that we observed, without requiring any additional adjustments to the drift rate (as we resorted to previously).

      In summary, we believe that we now have a much more parsimonious account of our data, in terms of a model in which subjective estimates of the correlation are alone able to account for our patterns of choice and RT data. We fully agree that more work is needed to better understand the source of these misestimates but also think those questions are outside the scope of the present study.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      A few minor comments:

      (1) Evidence can be correlated in multiple ways. It could be correlated within individual pieces of evidence in a sequence, or across elements in that sequence (e.g., across time). This distinction is important, as it determines how evidence ought to be accumulated across time. In particular, if evidence is correlated across time, simply summing it up might be the wrong thing to do. Thus, it would be beneficial to make this distinction in the Introduction, and to mention that this paper is only concerned with the first type of correlation.

      We now clarify this point in the Introduction (pp. 5–6).

      (2) It is unclear without reading the Methods how the blue dashed line in Figure 4c is generated. To my understanding, it is a prediction of the naive DDM model. Is this correct?

      We now specify the models used to make the predictions shown in Fig. 4c (which now includes an additional model that uses unscaled observations as evidence).

      (3) In Methods, given the importance of the distribution of x1 + x2, it would be useful to write it out explicitly, e.g., x1 + x2 ~ N(2 mu_g, ..), specifying its mean and its variance.

      Excellent suggestion, added to p. 38.
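
      For readers without the Methods to hand, the distribution the reviewer asks for can be written out as below. This is a sketch under stated assumptions rather than a quotation from the manuscript: each observation is taken to be Gaussian with mean \mu_g and a common variance \sigma^2, with pairwise correlation \rho, and the two generative sources are taken to have means \pm\mu_g. The symbols \sigma and \rho are notation introduced here and may differ from the authors'.

```latex
\begin{aligned}
\operatorname{Var}(x_1 + x_2) &= \sigma^2 + \sigma^2 + 2\rho\sigma^2 = 2\sigma^2(1+\rho), \\
x_1 + x_2 &\sim \mathcal{N}\!\left(2\mu_g,\; 2\sigma^2(1+\rho)\right), \\
\log \mathrm{LR} &= \frac{2\,(2\mu_g)\,(x_1 + x_2)}{2\sigma^2(1+\rho)}
                  = \frac{2\mu_g}{\sigma^2(1+\rho)}\,(x_1 + x_2).
\end{aligned}
```

      Under these assumptions the weight on each pair scales with 1/(1+\rho), so negative correlations increase the weight of evidence and positive correlations decrease it, in line with the normative account summarized in the public review.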

      (4) From Methods and the caption of Figure 6 - Supplement 1 it becomes clear that the fitted DDM features a bound that collapses over time. I think that this should also be mentioned in the main text, as it is a not-too-unimportant feature of the model.

      Excellent suggestion, added to p. 15, with reference to Fig. 6-supplement 1 on p. 20.

      (5) The functional form of the bound is 2 (B - tb t). To my understanding, the effective B changes as a function of the correlation magnitude. Does tb as well? If not, wouldn't it be better if it does, to ensure that 2 (B - tb t) = 0 independent of the correlation magnitude?

      In our initial modeling, we also considered whether the correlation-dependent adjustment, which is a function of both correlation sign and magnitude, should be applied to the initial bound or to the instantaneous bound (i.e., after collapse, affecting tb as well). In a pilot analysis of data from 22 participants in the 0.6 correlation-magnitude group, we found that this choice had a negligible effect on the goodness-of-fit (deltaAIC = -0.9, protected exceedance probability = 0.63, in favor of the instantaneous bound scaling). We therefore used the instantaneous bound version in the analyses reported in the manuscript but doubt this choice was critical based on these results. We have clarified our implementation of the bound in Methods (p. 43–44).
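
      The two options discussed in this exchange can be written side by side. This is a sketch in notation introduced here for illustration: s denotes a hypothetical correlation-dependent scale factor, and b(t) the total bound separation of the form 2(B - t_b t) quoted by the reviewer.

```latex
\begin{aligned}
\text{Initial-bound scaling:} \quad & b(t) = 2\,(sB - t_b\,t), & b(t) = 0 \ \text{at}\ t = \frac{sB}{t_b}, \\
\text{Instantaneous-bound scaling:} \quad & b(t) = 2s\,(B - t_b\,t), & b(t) = 0 \ \text{at}\ t = \frac{B}{t_b}.
\end{aligned}
```

      Under these assumptions only the instantaneous version keeps the collapse time independent of the scale factor, which is the property the reviewer asks about.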

      Reviewer #2 (Recommendations for the authors):

      In addition to the points raised above, I have some minor suggestions/open questions that arose from my reading of the manuscript:

      (1) Are the predictions outlined in the paper specific to cases where the two sources are symmetric around zero? If distributions are allowed to be asymmetric then one can imagine cases (i.e. when distribution means are sufficiently offset from one another) where positive correlations can increase evidence strength and negative correlations decrease evidence strength. There's absolutely still value and much elegance in what the authors are showing with this work, but if my intuition is correct, it should ideally be acknowledged that the predictions are restricted to a specific set of generative circumstances.

      We agree that there are a lot of ways to manipulate correlations and their effect on the weight of evidence. At the end of the Discussion, we emphasize that our results apply to this particular form of correlation (p. 32).

      (2) Isn't Figure 4C misleading in the sense that it collapses across the asymmetry in the effect of negative vs positive correlations on RT, which is clearly there in the data and which simply adjusting the correlation-dependent scale factor will not reproduce?

      We agree that this analysis does not address any asymmetries in suboptimal estimates of positive versus negative correlations. We believe that those effects are much better addressed using the model fitting, which we present later in the Results section. We have now simplified the analyses in Fig. 4c, reporting the difference in RT between positive and negative correlation conditions instead of a linear regression.

      (3) I found the transition on p.17 of the Results section from the scaling of drift rate by correlation to scaling of bound height to be quite abrupt and unclear. I suspect that many readers coming from a typical DDM modelling background will be operating under the assumption that drift rate and bound height are independent, and I think more could be done here to explain why scaling one parameter by correlation in the present case is in fact directly equivalent to scaling the other.

      Thank you for the very useful feedback; we have substantially revised this text to make these points more clearly.

      (4) P.3, typo: Alan *Turing*

      That’s embarrassing. Fixed.

      (5) P.27, typo: "participants adopt a *fixed* bound"

      Fixed.

    1. eLife Assessment

      This study focuses on the role of a T-cell-specific receptor, ctla-4, in a new zebrafish model with an IBD-like phenotype. Although implicated in IBD, the function of ctla-4 has been hard to study in mice because the KO is lethal. Ctla-4 mutant zebrafish exhibited significant intestinal inflammation and dysbiosis, mirroring the pathology of inflammatory bowel disease (IBD) in mammals, providing a valuable new model to the field of IBD research. This is a key study with convincing evidence: comprehensive transcriptomic analysis, histological examinations, and functional assays all support the findings.

    2. Reviewer #1 (Public review):

      "Unraveling the Role of Ctla-4 in Intestinal Immune Homeostasis: Insights from a novel Zebrafish Model of Inflammatory Bowel Disease" generates a 14bp deletion/early stop codon mutation that is viable in a zebrafish homolog of ctla-4. This mutant exhibits an IBD-like phenotype, including decreased intestinal length, abnormal intestinal folds, decreased goblet cells, abnormal cell junctions between epithelial cells, increased inflammation, and alterations in microbial diversity. Bulk and single-cell RNA-seq show upregulation of immune and inflammatory response genes in this mutant (especially in neutrophils, B cells, and macrophages) and downregulation of genes involved in adhesion and tight junctions in mutant enterocytes. The work suggests that the makeup of immune cells within the intestine is altered in these mutants, potentially due to changes in lymphocyte proliferation. Introduction of recombinant soluble Ctla-4-Ig to mutant zebrafish rescued body weight, histological phenotypes, and gene expression of several pro-inflammatory genes, suggesting a potential future therapeutic route.

      Strengths:

      - Generation of a useful new mutant in zebrafish ctla-4
      - The demonstration of an IBD-like phenotype in this mutant is extremely comprehensive.
      - Demonstrated gene expression differences provide mechanistic insight into how this mutation leads to IBD-like symptoms.
      - Demonstration of rescue with a soluble protein suggests exciting future therapeutic potential
      - The manuscript is mostly well organized and well written.

      Initial Weaknesses were addressed during review.

    3. Reviewer #2 (Public review):

      Summary:

      The authors aimed to elucidate the role of Ctla-4 in maintaining intestinal immune homeostasis by using a novel Ctla-4-deficient zebrafish model. This study addresses the challenge of linking CTLA-4 to inflammatory bowel disease (IBD) due to the early lethality of CTLA-4 knockout mice. Four lines of evidence showed that Ctla-4-deficient zebrafish exhibited hallmarks of IBD in mammals: 1) impaired epithelial integrity and infiltration of inflammatory cells; 2) enrichment of inflammation-related pathways and the imbalance between pro- and anti-inflammatory cytokines; 3) abnormal composition of immune cell populations; and 4) reduced diversity and altered microbiota composition. By employing various molecular and cellular analyses, the authors established ctla-4-deficient zebrafish as a convincing model of human IBD.

      Strengths:

      The characterization of the mutant phenotype is very thorough, from anatomical to histological and molecular levels. The findings effectively establish ctla-4 mutants as a novel zebrafish model for investigating human IBD. Evidence from the histopathological and transcriptome analysis was very strong and supports a severe interruption of immune system homeostasis in the zebrafish intestine. Additional characterization using sCtla-4-Ig further probed the molecular mechanism of the inflammatory response, and provided a potential treatment plan for targeting Ctla-4 in IBD models.

      Weaknesses:

      To probe the molecular mechanism of Ctla-4, the authors used a spectrum of antibodies that target Ctla-4 or its receptors. The phenotype assayed was lymphocyte proliferation, while it was the composition rather than number of immune cells that was observed to be different in the scRNASeq assay. Although sCtla-4 has an effect of alleviating the IBD-like phenotypes, I found this explanation a bit oversimplified.

      Comments on revised version:

      The authors have sufficiently addressed all my concerns and I don't have further suggestions.

    4. Reviewer #3 (Public review):

      Summary:

      The current study, which uses mutant zebrafish to model IBD, is a worthwhile effort. The authors provide extensive evidence, including histopathological observations, gut microflora profiling, and transcriptomic data from intestinal tissue and mucosal cells. The multi-omic study demonstrates enteritis pathology at multiple levels in the zebrafish model.

      Strengths:

      An important immune checkpoint of Treg cells was knocked out in zebrafish, and enteritis was then observed. The model could substitute for the mouse knockout model in investigating the molecular mechanisms of gut disease.

      Weaknesses:

      (1) In Fig. 2I, the purple glycogen signals stained by PAS were left out of the quantitative statistics. The purple-stained area could be quantified with ImageJ.

      (2) The characters in Fig. 3G are too small to read. It is suggested to adjust this figure or move it to the supplement at a larger size.

      (3) The tissue in the IgG control in Fig. 8B appears damaged. It is suggested to present another section here.

      (4) Lines 667 and 743: "16S rRNA sequencing" should be "16S rRNA gene sequencing". Please check this point throughout the text.

    1. eLife Assessment

      This translational study presents a direct cross-species comparison (between mice, rats, and humans) of choice behavior in the same perceptual decision-making task. The study is rare in opening a window on the evolution of decision-making, and the results will be important for many disciplines including behavioral sciences, psychology, neuroscience, and psychiatry. While the strength of the evidence presented is solid, the manuscript would benefit from additional information and analyses to strengthen and clarify its main conclusions.

    2. Reviewer #1 (Public review):

      This work presents data from three species (mice, rats, and humans) performing an evidence accumulation task that has been designed to be as similar as possible between species (and is based on a solid foundation of previous work on decision-making). The tasks are well-designed, and the analyses are solid and clearly presented - showing that there are differences in the overall parameters of the decision-making process between the species. This is valuable to neuroscientists who aim to translate behavioral and neuroscientific findings from rodents to humans and offers a word of caution for the field in readily claiming that behavioral strategies and computations are representative of all mammals. The dataset would be of great interest to the community and may be a source of further modelling of across-species behavior, but unfortunately, neither the data nor the code is currently shared.

      A few other questions remain that make the conclusions of the paper a bit hard to assess:

      (1) The main weakness is that the authors claim that all species rely on evidence accumulation as a strategy, but this is not tested against other models (see e.g. Stine et al. https://elifesciences.org/articles/55365): the fact that the DDM fits rather well does not mean that this is the strategy that each species was carrying out.

      (2) In all main analyses, it is unclear what the effect is of the generative flash rate and how this has been calibrated between species. Only in Figure 6C do we see basic psychometric functions, but these should presumably also feature as a crucial variable dominating the accuracy and RTs (chronometric functions) across species. The very easy trials are useful to constrain the basic sensorimotor differences that may account for RT variability, e.g. perhaps the small body of mice requires them to move a relatively longer distance to trigger the response.

      (3) The GLM-HMM results (that mice are not engaged in all trials) are very important, but they imply that mouse DDM fits may well be more similar to rats and humans if done only on engaged trials. Could it be that the main species differences are driven by different engagement state occupations?

      (4) It would be very helpful if the authors could present a comprehensive overview (perhaps a table) of the factors that may be relevant for explaining the observed species differences. This may include contextual/experimental variables (age range (adolescent humans vs. mice/rats; see https://www.jax.org/news-and-insights/jax-blog/2017/november/when-are-mice-considered-old), reward source, etc.) and also outcomes (e.g. training time required to learn the task, # trials per session and in total).

    3. Reviewer #2 (Public review):

      Summary:

      Chakravarty et al. propose a 'synchronized framework' for studying perceptual decision-making (DM) across species, namely humans, rats, and mice. Although all species shared hallmarks of evidence accumulation, the results highlighted species-specific differences. Humans were the slowest and most accurate, rats optimized the speed-accuracy tradeoff to maximize the reward rate, and mice were the fastest but least accurate. In addition, while humans were better fit by a classic DDM with fixed bounds, rodents were better fit by a DDM with collapsing bounds. While comparing behavioral strategies in evidence accumulation tasks across species is an important and timely question, some of the presented differences across species lack a clear interpretation and could be simply caused by differences in the task design. Important information and analyses about the DDM and the other models used are missing, which lowers the confidence in and enthusiasm about the results.

      Strengths:

      The comparison of behavior across species, including humans and commonly used laboratory species like rats and mice, is a fundamental step in neuroscience to establish more informed links between animal experiments and human cognition. In this work, Chakravarty et al. analyze and model the behavior of three species during the same evidence accumulation task. They draw conclusions about the different strategies used in each case.

      Weaknesses:

      Novelty:

      While quite relevant, some parts of the work presented are more novel than others. That EA drives choice behavior and that these choices can be described with a DDM has been shown before (see e.g. Kane et al. 2023; Brunton et al. 2013; Pinto et al. 2018). The novelty here mostly lies in the comparison of three species in the same task and in fitting the same exact model (close quantitative comparison of behavioral strategies). However, some of the differences lack a clear interpretation. For instance, the values of some of the DDM fitted parameters between the three species are not ordered "as expected" (e.g. non-decision time or DDM BIC). Other comparison results completely lack an explanation (e.g. rats' RT are near optimal while humans and mice are not). The aspect that I found most novel and exciting is the application of HMMs to each of the species. However, this part comes at the end of the paper and has been done without sufficient depth. There is almost no explanation for the results. I would suggest the authors bring this part forward and move back other aspects which are, in my opinion, less novel or interpretable (e.g. results around the optimality of RT).

      Task design:

      Since there is no fixation, the response time (RT) reflects both the evidence integration time plus the motor time (stimuli are played until a response is given). This design makes it hard to compare RTs between species. While humans just had to press a button, rodents had to move their whole bodies from a central port to a side port. When comparing rats and mice, their difference in size relative to port distance could explain different RTs. This could for example explain the large difference in non-decision time (ndt) in Figure 3F between mice and rats. Are the measurements of the rat and the mouse boxes comparable? The authors should explain this difference more openly and discuss its implications when interpreting the results. The Methods should also provide information about the distance between ports for each species. I also strongly recommend including a few videos of rats and mice performing the task to have a sense of the movements involved in the task in each species.

      (1) DDM

      Goodness of fit:

      The authors conclude that the three species use an accumulation of evidence strategy because they can fit a DDM. However, there is little information about the goodness of these fits. They only show the RT distributions for one example subject (too small to distinguish whether the fit of the histograms is good or not). We suggest they make a figure showing in more detail the match of the RT distributions across subjects (e.g. they can compare RT quartiles for data and model for the entire group of subjects). Then they provide BIC which is a measure that depends on the number of trials. Were the number of trials matched across subjects/species? Could the authors provide a measure independent of the number of trials (e.g. cross-validated log-likelihood per trial)? Moreover, is this BIC computed only on the RTs, mouse responses, or both?
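
      The dependence of BIC on trial count noted above can be made concrete with a short sketch. The numbers below are hypothetical and are not taken from the manuscript; `bic` and `log_likelihood_per_trial` are illustrative helper names. Two subjects with identical per-trial fit quality but different numbers of trials end up with very different BIC values, whereas the per-trial log-likelihood is the same. In a cross-validated version, the log-likelihood would be evaluated on held-out trials.

```python
import math


def bic(log_likelihood, n_params, n_trials):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_trials) - 2.0 * log_likelihood


def log_likelihood_per_trial(log_likelihood, n_trials):
    """Normalize a summed log-likelihood by the number of trials."""
    return log_likelihood / n_trials


# Hypothetical values: same fit quality per trial, different trial counts.
per_trial_ll = -0.6  # nats per trial (assumed)
n_params = 7         # assumed number of free model parameters

for label, n_trials in [("subject A", 500), ("subject B", 5000)]:
    total_ll = per_trial_ll * n_trials
    print(
        label,
        "BIC:", round(bic(total_ll, n_params, n_trials), 1),
        "log-likelihood per trial:", log_likelihood_per_trial(total_ll, n_trials),
    )
```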

      Overparameterization:

      The authors chose to include as DDM parameters the variability of the initial offset, the variability in non-decision time, and the variability of the drift rate. Having so many parameters with just one stimulus condition (80:20 ratio of flashes) may lead to unidentifiability problems as recognized previously (e.g. see M. Jones (2021) here osf.io/preprints/psyarxiv/gja3u). Their parameter recovery Supplementary Figure 3 shows that at least two of these variability parameters cannot be recovered. I also couldn't find the values of these parameters for the fitted DDM. So I was wondering to what extent adding these parameters improves the fits and is necessary overall.

      Tachometric curves:

      The authors show increasing tachometric curves (i.e. Accuracy vs RT) and use this finding as proof of accumulation. They fit these curves using a GAAM with little justification or detail (in fact the GAAM seems to over-fit the data a bit). The authors do not say, however, that the other model used, i.e. the DDM, may not reproduce these increasing tachometric curves because "in its basic form", the DDM gives flat tachometric curves. Does the DDM fitted to the individual RT and choice data capture the monotonic increase observed in the tachometric curves?

      Correct vs Error trials:

      Along similar lines, the authors do not test the fitted DDM separately in correct vs error trials, which is a classical distinction that most DDMs can't capture. It would be good to know if: (1) the RTs of correct vs error responses in the data are similar (quantified in panel Figure 2B, because in 2E it is not clear) and (2) the same trend between correct and error RTs is observed in the fitted DDMs.

      Urgency model:

      It is not clear how the urgency model used works. The authors cite Ditterich (2006), but in that paper, the urgency signal was applied to a race model with two decision variables: the urgency signal "accelerated" both DVs equally and sped up the race without favoring one DV versus the other. In a one-dimensional DDM, it is not clear where the urgency is applied. We assume it is applied in the direction of the stimulus, but then it is unclear how the urgency knows about the stimulus, which is what the DDM is trying to estimate in the first place. The authors should explain this model in greater detail and try to resolve this question.

      Despite finding differences between species, the analyses seem mostly exploratory instead of hypothesis-driven. There is little justification for why differences in some DDM parameters across species would be expected.

      (2) GLM and HMM

      The GLM fits show nicely that humans, rats, and mice weigh the total provided evidence differently (Figures 6C-D). This may be because the internal noise in the accumulation of evidence is higher, but it could also simply be because animals do not weigh the evidence that is presented when they are already moving towards the side ports. A parsimonious alternative to the "more noisy" species is simply that they only consider the first part of the stimulus. Extending the GLM to capture the differential weighting of each sequential sample (what is called the psychophysical kernel, PK) should be straightforward and would provide a fairer comparison between species (i.e. perhaps the slopes of the psychometric curves are not that different once evidence is weighted in each species with its corresponding PK).
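
      A psychophysical kernel of the kind suggested above can be estimated with a logistic regression that gives each sequential sample its own weight. The following is a minimal sketch on simulated data, not the authors' GLM: the observer, the number of samples per trial, the true weights, and the plain gradient-ascent fit are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate_trials(true_weights, n_trials, rng):
    """Hypothetical observer: each sample is +1 (right) or -1 (left)."""
    trials = []
    for _ in range(n_trials):
        samples = [rng.choice((-1.0, 1.0)) for _ in true_weights]
        drive = sum(w * s for w, s in zip(true_weights, samples))
        choice = 1 if rng.random() < sigmoid(drive) else 0
        trials.append((samples, choice))
    return trials


def fit_kernel(trials, n_iter=200, lr=2.0):
    """Logistic regression with one weight per sample position (gradient ascent)."""
    n_samples = len(trials[0][0])
    bias, weights = 0.0, [0.0] * n_samples
    for _ in range(n_iter):
        grad_b, grad_w = 0.0, [0.0] * n_samples
        for samples, choice in trials:
            pred = sigmoid(bias + sum(w * s for w, s in zip(weights, samples)))
            err = choice - pred
            grad_b += err
            for k, s in enumerate(samples):
                grad_w[k] += err * s
        bias += lr * grad_b / len(trials)
        weights = [w + lr * g / len(trials) for w, g in zip(weights, grad_w)]
    return bias, weights


# Assumed observer that uses only the first two of four samples.
rng = random.Random(0)
trials = simulate_trials([1.5, 1.5, 0.0, 0.0], n_trials=2000, rng=rng)
bias, kernel = fit_kernel(trials)
print([round(w, 2) for w in kernel])  # early weights should be large, late weights near zero
```

      An observer that ignores late samples would show a flatter psychometric curve against total evidence even without any extra internal noise, which is the alternative explanation raised above.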

      Choice Bias:

      Panel 3G (DDM starting point) shows that both rats and mice are slightly but systematically biased to the Left (x0 < 0.5). Panel 6D "Bias" seems to be showing the absolute value of the GLM bias parameter. It would be nice to (i) show the signed GLM bias parameter; (ii) check whether the biases computed in the DDM and GLM are comparable across species and subjects (from the GLM they look comparable in magnitude across species, whereas in the DDM they weren't: mice had a much bigger |x0|); and (iii) explain (or at least comment on) why animals show a systematic bias to one side.

    4. Reviewer #3 (Public review):

      Summary:

      This study directly compares decision-making strategies between three species, humans, rats, and mice. Based on a new and common behavioral task that is largely shared across species, specific features of evidence accumulation could be quantified and compared between species. The authors argue their work provides a framework to study decision-making across species, which can be studied by the same decision models. The authors report specific features of decision-making strategies, such as humans having a larger decision threshold leading to more accurate responses, and rodents deciding under time pressure.

      Strengths:

      The behavioral task is set up in similar, comparable ways across species, allowing for employing the same decision models and directly comparing specific features of decision behavior. This approach is compelling since it is otherwise challenging to compare behavior between species. Data analysis is solid and does not only quantify features of classic drift-diffusion models, but also additional commonly applied behavior models or features such as win-stay/lose-shift strategies, reward-maximization behavior, and slow, latent changes in behavior strategies. This approach reveals some interesting species differences, which are a starting point to investigate species-specific decision strategies more deeply and could inform a broad set of past and future behavior studies commonly used in cognitive and neuroscience.

      Weaknesses:

      (1) The choice of the stimulus difficulty is unclear, as choosing a single, specific evidence strength (80:20) could limit model fitting performance and interpretation of psychometric curves. This could also limit conclusions about species differences since the perceptual sensitivity seems quite different between species. Thus, the 80:20 lies at different uncertainty levels for the different species, which are known to influence behavioral strategies. This might be addressed by exploiting the distribution of actually delivered flashes, but it remained unclear to me to what degree this is the case. Previous perceptual discrimination studies typically sample multiple evidence levels to differentiate the source of variability in choice behavior.

      (2) The authors argue that their task is novel and that their task provides a framework to investigate perceptual decision-making. However, very similar, and potentially more powerful, perceptual decision-making tasks (e.g., using several evidence strength levels) have been used in humans, non-human primates, rats, mice, and other species. In some instances, analogous behavioral tasks, including studies using the same sensory stimulus, have been used across multiple species. While these may have been published in different papers, they have been conducted in some instances by the same lab and using the same analyses. Further, much of this work is not referenced here. This limits the impact of this work.

      (3) The employed drift-diffusion model has many parameters, which are not discussed in detail. Results in Supplementary Figures 3-5 are not explained or discussed, including the interpretation that model recovery tests fail to recover some of the parameters (e.g., Figures S3E, G). This makes the interpretation of such models more difficult.

      (4) The results regarding potential reward-maximization strategies are compelling and connect perceptual and normative decision models. The results are, however, limited by the different inter-trial intervals and trial initiation times between species, which are shown in Figure S6. It's unclear to me how to interpret, for example, how the long trial initiation times in rats relate to a putative reward-maximizing strategy. This compares to the very low trial initiation times (i.e., very 'efficient') of humans, even though they are 'too accurate' in terms of their sampling time. Reward-maximizing strategies seem difficult to assess with such different trial times and in the absence of experimental manipulation.

    1. eLife Assessment

      In this important study, the authors use computational modeling to explore how rapid learning can be reconciled with the accumulation of stable memories in the olfactory bulb, where adult neurogenesis is prominent. They focus on the "flexibility-stability dilemma" and how it is resolved through local mechanisms within the olfactory bulb. These compelling results present a coherent picture of a neurogenesis-dependent learning process that aligns with diverse experimental observations and may serve as a foundation for further experimental and computational studies.

    2. Reviewer #1 (Public review):

      Summary:

      Sakelaris and Riecke used computational modeling to explore how neurogenesis and sequential integration of new neurons into a network support memory formation and maintenance. They focus on the integration of granule cells in the olfactory bulb, a brain area where adult neurogenesis is prominent. Experimental results published in recent years provide an excellent basis to address the question at hand by biologically constrained models. The study extends previous computational models and provides a coherent picture of how multiple processes may act in concert to enable rapid learning, high stability of memories, and high memory capacity. This computational model generates experimentally testable predictions and is likely to be valuable to understand the roles of neurogenesis and related phenomena in memory. One of the key findings is that important features of the memory system depend on transient properties of adult-born granule cells such as enhanced excitability and apoptosis during specific phases of the development of individual neurons. The model can explain many experimental observations and suggests specific functions for different processes (e.g., importance of apoptosis for continual learning). While this model is obviously a massive simplification of the biological system, it conceptualizes diverse experimental observations into a coherent picture, it generates testable predictions for experiments, and it will likely inspire further modeling and experimental studies. Nonetheless, there are issues that the authors should address.

      Strengths:

      (1) The model can explain diverse experimental observations.

      (2) The model directly represents the biological network.

      Weaknesses:

      As with many other models of biological networks, this model contains major simplifications.

    3. Reviewer #2 (Public review):

      Summary:

      This is an excellent paper that demonstrates Computational Modeling at its best. The authors propose a mechanism to provide flexibility to learn new information while preserving stability in neural networks by combining structural plasticity and synaptic plasticity.

      Strengths:

      An intriguing idea, that is well embedded in experimental data.

      The problem posed is real, the model uses data to be designed and implemented yet adds to the data novel and useful insight. The project proposes a parsimonious explanation for why neurogenesis can be better than classical plasticity and how stability versus flexibility can be solved with this approach.

      Weaknesses:

      No weaknesses were identified by this reviewer.

    4. Reviewer #3 (Public review):

      The manuscript is focused on local bulbar mechanisms to solve the flexibility-stability dilemma in contrast to long-range interactions documented in other systems (hippocampus-cortex). The network performance is assessed in a perceptual learning task: the network is presented with alternating, similar artificial stimuli (defined as enrichment) and the authors assess its ability to discriminate between these stimuli by comparing the mitral cell representations quantified by Fisher discriminant analysis. The authors use enhancement in discriminability between stimuli as a function of the degree of specificity of connectivity in the network to quantify the formation of an odor-specific network structure which as such has memory - they quantify memory as the specificity of that connectivity.

      The focus on neurogenesis, excitability, and synaptic connectivity of abGCs is topical, and the authors systematically built their model, clearly stating their assumptions and setting up the questions and answers. In my opinion, the combination of latent dendritic representations, excitability, and apoptosis in an age-dependent manner is interesting and as the authors point out leads to experimentally testable hypotheses. I have however several concerns with the novelty of the work, the lack of referencing of previous work on granule cells-mitral cell interactions more generally, and the biological plausibility of the model that, in my opinion, should be further addressed to better contextualize the model.

      (1) The authors find that a network with age-dependent synaptic plasticity outperforms one with constant age-independent plasticity and that having more GCs per se is not sufficient to explain this effect. In addition, having an initially higher excitability of GCs leads to increased performance. To what degree is the increased excitability of abGCs conceptually independent of their having higher synaptic plasticity rates / fast synapses?

      (2) The authors do not mention previous theoretical work on the specificity of mitral to granule cell interactions from several groups (Koulakov & Rinberg - Neuron, 2011; Gilra & Bhalla, PLoSOne, 2015; Grabska-Barwińska...Mainen, Pouget, Latham, Nat. Neurosci. 2017; Tootoonian, Schaefer, Latham, PLoS Comput. Biol., 2022), nor work on the relevance of top-down feedback from the olfactory cortex on the abGCs during odor discrimination tasks (Wu & Komiyama, Sci. Adv. 2020), or of top-down regulation from the olfactory cortex on the activity of the mitral/tufted cells in task-engaged mice (Lindeman et al., PLoS Comput. Biol., 2024), or in naïve mice that encounter odorants (in the absence of specific context; Boyd, et al., Cell Rep, 2015; Otazu et al., Neuron 2015, Chae et al., Neuron, 2022). In particular, the presence of rich top-down control of granule cell activity (including that of abGCs) calls into question the plausibility of one of the authors' opening statements about relying solely on local circuit mechanisms to solve the flexibility-stability dilemma. I think the discussion of this work is important in order to put into context the idea of specific interactions between the abGCs and the mitral cells.

      (3) To what degree does specific connectivity reflect a specific stimulus configuration, and is it a good proxy for determining stimulus discriminability and memory capacity in terms of temporal activity patterns (differences in latency/phase with respect to the respiration cycle, etc.), which may account for a substantial fraction of the ability to discriminate between stimuli? The authors mention in the discussion that this is, indeed, an upper bound and that specific connectivity is necessary for different temporal activity patterns, but a further expansion on this topic would help in understanding the limitations of the model.

      (4) Reward or reward prediction error signals are not considered in the model. They however are ubiquitous in nature and likely to be encountered and shape the connectivity and activity patterns of the abGC-mitral cell network. Including a discussion of how the model may be adjusted to incorporate reward/error signals would strengthen the manuscript.

      Specific Comments

      (1) Lines 84-86; 507-509; Eq(3): Sensory input is defined by a basal parameter of MC spontaneous activity (Sspontaneus) and the odor stimulus input (Siodor), but it is not clear from the main text or methods how sensory inputs (glomerular patterns) were modeled.

      (2) Lines 118-122: The perceptual learning task is explained only in the context of the discriminability of similar artificial stimuli using the Fisher discriminant and the "Memory" metric. A detailed description of the logic, methods, and objective of the perceptual learning task, taking into account Comment 1, would help to better understand the model.

      (3) Rapid re-learning of a forgotten odor pair is enabled by sensory-dependent dendritic elaboration of neurons that initially encoded the odors, and the observed re-learning would occur even if neurogenesis was blocked following the first enrichment, even though the initial learning did require neurogenesis. When would this ever occur in nature? The re-learning of an odor period? Why is this highlighted in the study?

    1. eLife Assessment

      This study presents valuable findings related to seasonal brain size plasticity in the Eurasian common shrew (Sorex araneus), which is an excellent model system for these studies. The evidence supporting the authors' claims is convincing. The work will be of interest to biologists working on neuroscience, plasticity, and evolution.

    2. Reviewer #1 (Public review):

      Summary:

      In this paper, Thomas et al. set out to study seasonal brain gene expression changes in the Eurasian common shrew. This mammalian species is unusual in that it does not hibernate or migrate but instead stays active all winter while shrinking and then regrowing its brain and other organs. The authors previously examined gene expression changes in two brain regions and the liver. Here, they added data from the hypothalamus, a brain region involved in the regulation of metabolism and homeostasis. The specific goals were to identify genes and gene groups that change expression with the seasons and to identify genes with unusual expression compared to other mammalian species. The reason for this second goal is that genes that change with the season could be due to plastic gene regulation, where the organism simply reacts to environmental change using processes available to all mammals. Such changes are not necessarily indicative of adaptation in the shrew. However, if the same genes are also expression outliers compared to other species that do not show this overwintering strategy, it is more likely that they reflect adaptive changes that contribute to the shrew's unique traits.

      The authors succeeded in implementing their experimental design and identified significant genes in each of their specific goals. There was an overlap between these gene lists. The authors provide extensive discussion of the genes they found.

      The scope of this paper is quite narrow, as it adds gene expression data for only one additional tissue compared to the authors' previous work in a 2023 preprint. The two papers even use the same animals, which had been collected for that earlier work. As a consequence, the current paper is limited in the results it can present. This is somewhat compensated by an expansive interpretation of the results in the discussion section, but I felt that much of this was too speculative. More importantly, there are several limitations to the design, making it hard to draw stronger conclusions from the data. The main contribution of this work lies in the generated data and the formulation of hypotheses to be tested by future work.

      Strengths:

      The unique biological model system under study is fascinating. The data were collected in a technically sound manner, and the analyses were done well. The paper is overall very clear, well-written, and easy to follow. It does a thorough job of exploring patterns and enrichments in the various gene sets that are identified.

      I specifically applaud the authors for doing a functional follow-up experiment on one of the differentially expressed genes (BCL2L1), even if the results did not support the hypothesis. It is important to report experiments like this and it is terrific to see it done here.

      Comments on revised version:

      This updated version of the paper is improved compared to its initial version. As such, the strengths remain the same as before, with a fascinating model system and an interesting research question. The earlier weaknesses related to overinterpretation of the data have been largely fixed by shortening the paper and adding appropriate caveats throughout. The paper now also includes a significance test for its overlap between gene lists. While this turned out to be negative (i.e., there is not more overlap between lists than expected by chance), reporting this result transparently has strengthened the paper.

    3. Reviewer #2 (Public review):

      Summary:

      Shrews go through winter by shrinking their brain and most organs, then regrow them in the spring. The gene expression changes underlying this unusual brain size plasticity were unknown. Here, the authors looked for potential adaptations underlying this trait by looking at differential expression in the hypothalamus. They found enrichments for DE in genes related to the blood-brain barrier and calcium signaling, and used comparative data to look at gene expression differences that are unique to shrews. This study leverages a fascinating organismal trait to understand plasticity and what might be driving it at the level of gene expression. This manuscript also lays the groundwork for further developing this interesting system.

      Strengths:

      One strength is that the authors used OU models to look for adaptation in gene expression. The authors also added cell culture work to bolster their findings.

      Comments on revised version:

      I think that the authors have made a strong revision. No other comments.

    4. Reviewer #3 (Public review):

      Summary:

      In their study, the authors combine seasonal and comparative transcriptomics to identify candidate genes with plastic, canalized, or lineage-specific (i.e., divergent) expression patterns associated with an unusual overwintering phenomenon (Dehnel's phenomenon - seasonal size plasticity) in the Eurasian shrew. Their focus is on the shrinkage and regrowth of the hypothalamus, a brain region that undergoes significant seasonal size changes in shrews and plays a key role in regulating metabolic homeostasis. Through comparative transcriptomic analysis, they identify genes showing derived (lineage-specific), plastic (seasonally regulated), and canalized (both lineage-specific and plastic) expression patterns. The authors hypothesize that genes involved in pathways such as the blood-brain barrier, metabolic state sensing, and ion-dependent signaling will be enriched among those with notable transcriptomic patterns. They complement their transcriptomic findings with a cell culture-based functional assessment of a candidate gene believed to reduce apoptosis.

      Strengths:

      The study's rationale and its integration of seasonal and comparative transcriptomics are well-articulated and represent an advancement in the field. The transcriptome, known for its dynamic and plastic nature, is also influenced by evolutionary history. The authors effectively demonstrate how multiple signals (evolutionary, constitutive, and plastic) can be extracted, quantified, and interpreted. The chosen phenotype and study system are particularly compelling, as it not only exemplifies an extreme case of Dehnel's phenotype, but the metabolic requirements of the shrew suggest that genes regulating metabolic homeostasis are under strong selection.

      Weaknesses:

      The results of the expression patterns are quite compelling and a number of interesting downstream hypotheses are outlined; however, the interpretation of the role of each gene and pathway identified is speculative which dampens the overall impact of the work. That said, I commend the authors on functionally testing one of the differentially expressed genes. I also commend the inclusion of that negative result.

    5. Author response:

      The following is the authors’ response to the original reviews

      eLife Assessment

      This study presents valuable findings related to seasonal brain size plasticity in the Eurasian common shrew (Sorex araneus), which is an excellent model system for these studies. The evidence supporting the authors' claims is convincing. However, the authors should be careful when applying the term adaptive to the gene expression changes they observe; it would be challenging to demonstrate the differential fitness effects of these gene expression changes. The work will be of interest to biologists working on neuroscience, plasticity, and evolution.

      We appreciate the reviewers’ suggestions and comments. For the phylogenetic ANOVA, we used EVE, which tests for a separate RNA expression optimum specific to the shrew lineage, consistent with expectations for adaptive evolution of gene expression. But, as you noted, while this analysis highlights many candidate genes evolving in a manner consistent with positive selection, further functional validation is required to confirm if and how these genes contribute to Dehnel’s phenomenon. In the discussion, we now emphasize that inferred adaptive expression of these genes is putative and outline that future studies are needed to test the function of proposed adaptations. For example, the cell line validation of the effect of BCL2L1 on apoptosis is a case study that tests the function of a putatively adaptive change in gene expression, and it illuminates this limitation. We have also refined our discussion to focus more on pathway-level analyses rather than on individual genes, and have addressed other issues presented, including clarity of methods and using sex as a covariate in our analyses.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this paper, Thomas et al. set out to study seasonal brain gene expression changes in the Eurasian common shrew. This mammalian species is unusual in that it does not hibernate or migrate but instead stays active all winter while shrinking and then regrowing its brain and other organs. The authors previously examined gene expression changes in two brain regions and the liver. Here, they added data from the hypothalamus, a brain region involved in the regulation of metabolism and homeostasis. The specific goals were to identify genes and gene groups that change expression with the seasons and to identify genes with unusual expression compared to other mammalian species. The reason for this second goal is that genes that change with the season could be due to plastic gene regulation, where the organism simply reacts to environmental change using processes available to all mammals. Such changes are not necessarily indicative of adaptation in the shrew. However, if the same genes are also expression outliers compared to other species that do not show this overwintering strategy, it is more likely that they reflect adaptive changes that contribute to the shrew's unique traits.

      The authors succeeded in implementing their experimental design and identified significant genes in each of their specific goals. There was an overlap between these gene lists. The authors provide extensive discussion of the genes they found.

      The scope of this paper is quite narrow, as it adds gene expression data for only one additional tissue compared to the authors' previous work in a 2023 preprint. The two papers even use the same animals, which had been collected for that earlier work. As a consequence, the current paper is limited in the results it can present. This is somewhat compensated by an expansive interpretation of the results in the discussion section, but I felt that much of this was too speculative. More importantly, there are several limitations to the design, making it hard to draw stronger conclusions from the data. The main contribution of this work lies in the generated data and the formulation of hypotheses to be tested by future work.

      Thank you for your interest in our manuscript and for your insights. We addressed your comments below: we now highlight the limitations of our study design in the discussion and emphasize that, while a second optimum of gene expression in shrews is consistent with adaptive evolution, we recognize that not all sources of variation in gene expression can be fully accounted for. We highlight the putative nature of these results in our revisions, especially in our new limitations section (lines 541-555).

      Strengths:

      The unique biological model system under study is fascinating. The data were collected in a technically sound manner, and the analyses were done well. The paper is overall very clear, well-written, and easy to follow. It does a thorough job of exploring patterns and enrichments in the various gene sets that are identified.

      I specifically applaud the authors for doing a functional follow-up experiment on one of the differentially expressed genes (BCL2L1), even if the results did not support the hypothesis. It is important to report experiments like this and it is terrific to see it done here.

      We are glad to hear that you found our manuscript fascinating and clearly written. While we hoped to see an effect of BCL2L1 on apoptosis as proposed, we agree that reporting null results is valuable when validating evolutionary inferences.

      Weaknesses:

      While the paper successfully identifies differentially expressed seasonal genes, the real question is (as explained by the authors) whether these are evolved adaptations in the shrews or whether they reflect plastic changes that also exist in other species. This question was the motivation for the inter-species analyses in the paper, but in my view, these cannot rigorously address this question. Presumably, the data from the other species were not collected in comparable environments as those experienced by the shrews studied here. Instead, they likely (it is not specified, and might not be knowable for the public data) reflect baseline gene expression. To see why this is problematic, consider this analogy: if we were to compare gene expression in the immune system of an individual undergoing an acute infection to other, uninfected individuals, we would see many, strong expression differences. However, it would not be appropriate to claim that the infected individual has unique features - the relevant physiological changes are simply not triggered in the other individuals. The same applies here: it is hard to draw conclusions from seasonal expression data in the shrews to non-seasonal data in the other species, as shrew outlier genes might still reflect physiological changes that weren't active in the other species.

      There is no solution for this design flaw given the public data available to the authors except for creating matched data in the other species, which is of course not feasible. The authors should acknowledge and discuss this shortcoming in the paper.

      Thank you for taking the time to provide such insightful feedback. As you noted, while shrews experience seasonal size changes, their environments may differ from those of the other species used in this experiment, leading to increased or decreased expression of certain genes and reducing our ability to accurately detect selection across the phylogeny. Although we sought to control for as many sources of variation as possible, such as using only post-pubescent, wild, or non-domesticated individuals when feasible, we recognize that not all sources of variation can be fully accounted for within a practical experiment. We agree that these sources of variation can introduce both false positives and negatives into our results, and we have now highlighted this limitation within our discussion (lines 538-552).

      Related to the point above: in the section "Evolutionary Divergence in Expression" it is not clear which of the shrew samples were used. Was it all of them, or only those from winter, fall, etc? One might expect different results depending on this. E.g., there could be fewer genes with inferred adaptive change when using only summer samples. The authors should specify which samples were included in these analyses, and, if all samples were used, conduct a robustness analysis to see which of their detected genes survive the exclusion of certain time points.

      Thank you for this attention to detail. We used spring adults for this analysis. This decision was made because we used only post-pubescent individuals for all species in the analysis, and this was the only season in which adult shrews were going through Dehnel’s phenomenon. We have now clarified this in both the methods and results (line 247 and line 667).

      In the same section, were there also genes with lower shrew expression? None are mentioned in the text, so did the authors not test for this direction, or did they test and there were no significant hits?

      We did test for decreased shrew expression compared to the rest of the species, but no genes showed significant decreases. We hypothesize that there are two potential reasons for this result: 1) if a gene were to be selected for decreased expression, selection for constitutive expression of the gene across all species may be weak, and thus found in other lineages as well, or 2) decreased or no expression may relax selection on the coding regions, and thus these genes are not pulled out as we identify 1:1 orthologs. This is consistent with results provided from the original methods manuscript. Thank you for pointing out that we did not discuss this information in the text; we now include it in our results (lines 250-251).

      The Discussion is too long and detailed, given that it can ultimately only speculate about what the various expression changes might mean. Many of the specific points made (e.g. about the blood-brain-barrier being more permissive to sensing metabolic state, about cross-organ communication, the paragraphs on single, specific genes) are a stretch based on the available data. Illustrating this point, the one follow-up experiment the authors did (on BCL2L1) did not give the expected result. I really applaud the authors for having done this experiment, which goes beyond typical studies in this space. At the same time, its result highlights the dangers of reading too much into differential expression analyses.

      We agree with your point. While our extensive discussion is useful for testing future hypotheses, some of it may ultimately be too speculative for our readers. To amend this, we have reduced some portions of our discussion and focused more on pathways than individual genes, including removing mechanisms related to HRH2, FAM57B, GPR3, and GABAergic neurons. We hope that this highlights to the reader the speculative nature of many of our results.

      There is no test of whether the five genes observed in both analyses (seasonal change and inter-species) exceed the number expected by chance. When two gene sets are drawn at random, some overlap is expected randomly. The expected overlap can be computed by repeated draws of pairs of random sets of the same size as seen in real data and by noting the overlap between the random pairs. If this random distribution often includes sets of five genes, this weakens the conclusions that can be drawn from the genes observed in the real data.

      Thank you for highlighting this approach; it is greatly needed. After running this test, we found that the observed number of overlapping genes was greater than the expected overlap, but not significantly so. We now show this in our methods (lines 277-278) and results (lines 719-720).
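
      The random-draw overlap test described in the reviewer's comment can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used for the manuscript: the function name `overlap_null`, the background size, and the two set sizes are hypothetical placeholders, and only the observed overlap of five genes comes from the text.

```python
import random

def overlap_null(n_background, size_a, size_b, observed, n_draws=10_000, seed=0):
    """Estimate how often two random gene sets overlap at least as much as observed.

    n_background, size_a, and size_b are illustrative placeholders,
    not values taken from the manuscript.
    """
    rng = random.Random(seed)
    genes = range(n_background)
    overlaps = []
    for _ in range(n_draws):
        # Draw two independent random gene sets of the same sizes as the real sets
        set_a = set(rng.sample(genes, size_a))
        set_b = set(rng.sample(genes, size_b))
        overlaps.append(len(set_a & set_b))
    expected = sum(overlaps) / n_draws
    # Add-one correction so the empirical p-value is never exactly zero
    p_value = (1 + sum(o >= observed for o in overlaps)) / (n_draws + 1)
    return expected, p_value

expected, p_value = overlap_null(n_background=10_000, size_a=200, size_b=150, observed=5)
print(f"expected overlap = {expected:.2f}, empirical p = {p_value:.3f}")
```

      An observed overlap above the expected value with a large empirical p-value corresponds to the outcome reported in the response: more shared genes than expected, but not significantly so.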

      Reviewer #2 (Public review):

      Summary:

      Shrews go through winter by shrinking their brain and most organs, then regrow them in the spring. The gene expression changes underlying this unusual brain size plasticity were unknown. Here, the authors looked for potential adaptations underlying this trait by looking at differential expression in the hypothalamus. They found enrichments for DE in genes related to the blood-brain barrier and calcium signaling, as well as used comparative data to look at gene expression differences that are unique in shrews. This study leverages a fascinating organismal trait to understand plasticity and what might be driving it at the level of gene expression. This manuscript also lays the groundwork for further developing this interesting system.

      We are glad you found our manuscript interesting and thank you for your feedback. We hope that we have addressed all of your concerns as described below.

      Strengths:

      One strength is that the authors used OU models to look for adaptation in gene expression. The authors also added cell culture work to bolster their findings.

      Weaknesses:

      I think that there should be a bit more of an introduction to Dehnel's phenomenon, given how much it is used throughout.

      Thank you for this insight. With a lengthy introduction and discussion, we agree that the importance of Dehnel’s phenomenon may have been overshadowed. We have shortened both sections and emphasized the background on Dehnel’s phenomenon in the first two paragraphs of the introduction, allowing this extraordinary seasonal size plasticity to stand out.

      Reviewer #3 (Public review):

      Summary:

      In their study, the authors combine developmental and comparative transcriptomics to identify candidate genes with plastic, canalized, or lineage-specific (i.e., divergent) expression patterns associated with an unusual overwintering phenomenon (Dehnel's phenomenon - seasonal size plasticity) in the Eurasian shrew. Their focus is on the shrinkage and regrowth of the hypothalamus, a brain region that undergoes significant seasonal size changes in shrews and plays a key role in regulating metabolic homeostasis. Through combined transcriptomic analysis, they identify genes showing derived (lineage-specific), plastic (seasonally regulated), and canalized (both lineage-specific and plastic) expression patterns. The authors hypothesize that genes involved in pathways such as the blood-brain barrier, metabolic state sensing, and ion-dependent signaling will be enriched among those with notable transcriptomic patterns. They complement their transcriptomic findings with a cell culture-based functional assessment of a candidate gene believed to reduce apoptosis.

      Strengths:

      The study's rationale and its integration of developmental and comparative transcriptomics are well-articulated and represent an advancement in the field. The transcriptome, known for its dynamic and plastic nature, is also influenced by evolutionary history. The authors effectively demonstrate how multiple signals (evolutionary, constitutive, and plastic) can be extracted, quantified, and interpreted. The chosen phenotype and study system are particularly compelling, as it not only exemplifies an extreme case of Dehnel's phenotype, but the metabolic requirements of the shrew suggest that genes regulating metabolic homeostasis are under strong selection.

      Weaknesses:

      (1) In a number of places (described in detail below), the motivation for the experimental, analytical, or visualization approach is unclear and may obscure or prevent discoveries.

      Thank you for finding our research and manuscript compelling, and for the valuable feedback that will drastically improve our manuscript. We hope that we have alleviated your concerns by following your instructions below.

      (2) Temporal Expression - Figure 1 and Supplemental Figure 2 and associated text:

      - It is unclear whether quantitative criteria were used to distinguish "developmental shift" clusters from "season shift" clusters. A visual inspection of Supplemental Figure 2 suggests that some clusters (e.g., clusters 2, 8, and to a lesser extent 12) show seasonal variation, not just developmental differences between stages 1 and 2. While clustering helps to visualize expression patterns, it may not be the most appropriate filter in this case, particularly since all "season shift" clusters are later combined in KEGG pathway and GO analyses (Figure 1B).

      - The authors do not indicate whether they perform cluster-specific GO or KEGG pathway enrichment analyses. The current analysis picks up relevant pathways for hypothalamic control of homeostasis, which is a useful validation, but this approach might not fully address the study's key hypotheses.

      Thank you for this valuable feedback. We did not want to include clusters we deemed to be related to development, as these should not be attributed to changes associated with Dehnel’s phenomenon. We did this through qualitative, visual inspection, which we realize can differ between parties (i.e., clusters 2, 8, and 12 appeared to be seasonal). Qualitatively, we were looking for extreme divergence between Stage 1 and Stage 5 individuals: if expression were related to season and not development, then the averages of these stages within a cluster should be relatively similar. We have now quantified this as large differences in z-score (abs(summer juvenile - summer adult) > 1.25) without meaningful interseason variations determined by a second local maximum (abs(autumn - winter) < 0.5 and abs(winter - summer) < 0.5), and added it to both our methods (lines 699-702) and results (line 192).
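
      The thresholds quoted above can be expressed as a small filter. This is a hedged sketch rather than the authors' implementation: the function name, the stage keys, and the example z-scores are hypothetical, and it assumes that "summer" in the winter-summer comparison refers to summer adults, which the response does not specify.

```python
def is_developmental_shift(z, dev_threshold=1.25, season_threshold=0.5):
    """Flag a cluster whose mean z-scores differ by age but not by season.

    `z` maps stage names to the cluster's mean z-score. The stage keys, and the
    use of summer adults in the winter-summer comparison, are assumptions.
    """
    large_age_gap = abs(z["summer_juvenile"] - z["summer_adult"]) > dev_threshold
    flat_autumn_winter = abs(z["autumn"] - z["winter"]) < season_threshold
    flat_winter_summer = abs(z["winter"] - z["summer_adult"]) < season_threshold
    return large_age_gap and flat_autumn_winter and flat_winter_summer

# Made-up cluster means, for illustration only
developmental = {"summer_juvenile": -1.4, "autumn": 0.3, "winter": 0.4, "summer_adult": 0.5}
seasonal = {"summer_juvenile": 0.8, "autumn": -0.2, "winter": -1.1, "summer_adult": 0.9}

print(is_developmental_shift(developmental))
print(is_developmental_shift(seasonal))
```

      Under these assumptions, the first example cluster would be excluded as developmental, while the second would be retained for the seasonal analyses.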

      Regarding the combination of clusters for pathway enrichment compared to individual pathways, we agree that combining clusters may be more informative for overall homeostasis, compared to individual clusters which may inform us on processes directly related to Dehnel’s phenomenon. Initially, we were hesitant to conduct this analysis, as clusters contain small gene sets, reducing the ability to detect pathway enrichments. We have now included this analysis, which is reported in our methods (lines 703-704), results (lines 203-204), and a new supplemental table.

      (3) Differential expression between shrinkage (stage 2) and regrowth (stage 4) and cell culture targets

      - The rationale for selecting BCL2L1 for cell culture experiments should be clarified. While it is part of the apoptosis pathway, several other apoptosis-related genes were identified in the differential gene expression (DGE) analysis, some showing stronger differential expression or shrew-specific branch shifts. Why was BCL2L1 prioritized over these other candidates?

      We agree that our rationale for validating BCL2L1 function in neural cell lines was not clearly explained in the manuscript. We selected BCL2L1 because it is the furthest downstream gene in the apoptotic pathway, thus making it the most directly involved gene in programmed cell death, whereas upstream genes could influence additional genes or alternative processes. We have clarified this choice in the revised methods section (lines 748-750).

      - The authors mention maintaining (or at least attempting to maintain) a 1:1 sex ratio for the comparative analysis, but it is unclear if this was also done for the S. araneus analysis. If not, why? If so, was sex included as a covariate (e.g., a random effect) in the differential expression analysis? Sex-specific expression elevates within-group variation and could impact the discovery of differentially expressed genes.

      Regarding the use of sex as a covariate, we acknowledge the concerns raised. In our evolutionary analyses, we maintained a balanced sex ratio within species when possible. EVE models handle the effect of sex on gene expression as intraspecific variation. In shrews, however, we used males exclusively, as females were only found among juvenile individuals. Including those juvenile females would have introduced age effects, with perhaps a larger effect on our results. For the seasonal data, we have now included sex as a covariate in differential expression analyses. However, our design is imbalanced in relation to sex, which we have now discussed in our methods (lines 713-714) and discussion limitations (lines 544-548).

      (4) Discussion: The term "adaptive" is used frequently and liberally throughout the discussion. The interpretation of seasonal changes in gene expression as indicators of adaptive evolution should be done cautiously as such changes do not necessarily imply causal or adaptive associations.

      Thank you for this insight. We have reviewed our discussion and clarified that adaptations are putative (e.g., lines 146, 285, and 332), and highlighted this in our limitations section.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) I would recommend always spelling out "Dehnel's phenomenon" or even replacing this term (after crediting the DP term) with the more informative "seasonal size plasticity". Every time I saw "DP", I had to remind myself what this referred to. If the authors choose not to do so, please use the acronym consistently (e.g. line 186 has it spelled out).

      We have replaced the acronym DP with either the full term or the more informative “seasonal size plasticity” throughout the text.

      (2) Line 202: "DEG" has not been defined. Simply add to the line before.

      Thank you for this attention to detail. We have added this to the line above (210).

      (3) Please add a reference for the "AnAge" tool that was used to determine if samples were pubescent.

      Thank you for identifying this oversight. We have now cited the proper paper in line 634.

      (4) In the BCL2L1 section in the results, add a callout to Figure 2D.

      We have now added a callout to Figure 2D within the results (line 234).

      Reviewer #2 (Recommendations for the authors):

      (1) Line 122: is associated? These adaptations?

      Thank you for identifying that we were missing the words “associated with” here. We have fixed this in the revision.

      (2) The first paragraph of the Results should be moved to the methods, except maybe the number of orthologs.

      Thank you for this insight. We have removed this portion from the results section.

      (3) Why a Bonferroni correction on line 188? That seems too strict.

      We agree the Bonferroni correction is strict. Results obtained with less strict methods for controlling the false discovery rate are also not significant after correction. These corrections can be found within the data; however, we only report the Bonferroni correction.
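
      To illustrate the difference in strictness discussed here, the sketch below compares Bonferroni-adjusted p-values with Benjamini-Hochberg adjusted values. It is an illustration only: the response does not name which false discovery rate methods were tried, so Benjamini-Hochberg is an assumption, and the p-values are made up.

```python
def bonferroni(p_values):
    """Family-wise error control: multiply each p-value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def benjamini_hochberg(p_values):
    """False discovery rate control via Benjamini-Hochberg adjusted p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

p_values = [0.001, 0.008, 0.039, 0.041, 0.20]  # made-up values for illustration
bonf = bonferroni(p_values)
bh = benjamini_hochberg(p_values)
for p, b, q in zip(p_values, bonf, bh):
    print(f"p = {p:.3f}  Bonferroni = {b:.3f}  BH = {q:.3f}")
```

      Benjamini-Hochberg adjusted values are never larger than the Bonferroni ones, so a result that is not significant under Benjamini-Hochberg cannot be significant under Bonferroni at the same threshold.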

      (4) Line 427: "is a novel candidate gene for several neurological disorders" needs some references. I see them a couple of sentences later, but that's quite a sentence with no references at the end.

      We have added the proper citations for this sentence (line 524).

      Reviewer #3 (Recommendations for the authors):

      (1) Temporal Expression - Figure 1 and Supplemental Figure 2 and associated text (lines 176-193):

      - The authors report the total number of genes meeting inclusion criteria (>0.5-fold change between any two stages and 2 samples >10 normalized reads), but it would be more informative to also provide the number of genes within each temporal cluster. This would offer a clearer understanding of how gene expression patterns are distributed over time.

      Unfortunately, this information is difficult to depict on our figure and would use too much space in the text. We have thus added a description of the range of genes in a new supplemental table depicting this information.

      - It is unclear whether quantitative criteria were used to distinguish "developmental shift" clusters from "season shift" clusters. A visual inspection of Supplemental Figure 2 suggests that some clusters (e.g., clusters 2, 8, and to a lesser extent 12) show seasonal variation, not just developmental differences between stages 1 and 2. While clustering helps to visualize expression patterns, it may not be the most appropriate filter in this case, particularly since all "season shift" clusters are later combined in KEGG pathway and GO analyses (Fig. 1B). Using a differential gene expression criterion might be more suitable. For example, do excluded genes show significant log-fold differences between late-stage comparisons?

      As previously mentioned, we have now quantified seasonal shifts as large differences in z-score (abs(summer juveniles - summer adults) > 1.25) without meaningful interseason variations determined by a second local maximum (abs(autumn - winter) < 0.5 and abs(winter - summer) < 0.5), and added it to our methods (lines 699-702). We then follow this up with differential expression analyses as described in Figure 2.

      - Did the authors perform cluster-specific GO or KEGG pathway enrichment analyses instead of focusing on the combined set of genes across the season shift clusters? While I understand that the small number of genes in each cluster may be limiting, if pathways emerge from cluster-specific analysis, they could provide more detailed insights into the functional significance of these temporal expression patterns. The current analysis picks up relevant pathways for hypothalamic control of homeostasis, which is a useful validation, but this approach might not fully address the study's key hypotheses. Additionally, no corrections for multiple hypothesis testing were applied, as noted in the results. A more refined gene set (e.g., using differential expression criteria, described above) could be more appropriate for these analyses.

      We have now included cluster-specific KEGG enrichments as previously described.

      (2) Differential expression between shrinkage (stage 2) and regrowth (stage 4) and cell culture targets - Figure 2 and lines 195-227:

      - The rationale for selecting BCL2L1 for cell culture experiments should be clarified. While it is part of the apoptosis pathway, several other apoptosis-related genes were identified in the differential gene expression (DGE) analysis, some showing stronger differential expression or shrew-specific branch shifts. Why was BCL2L1 prioritized over these other candidates?

      We have now included the reasoning for further validation of BCL2L1 as described above.

      - The relevance of the "higher degree" differentially expressed genes needs more explanation. Although this group of genes is highlighted in the results, they are not featured in any subsequent analyses, leaving their importance unclear.

      Thank you for this insight. We have removed this from the methods as it is not relevant to subsequent analyses or conclusions.

      - The authors mention maintaining (or at least attempting to maintain) a 1:1 sex ratio for the comparative analysis (Line 525), but it is unclear if this was also done for the S. araneus analysis. If so, was sex included as a covariate (e.g., a random effect) in the differential expression analysis?

      We have now incorporated information on sex as described above.

      (3) Discussion:

      The term "adaptive" is used frequently and liberally throughout the discussion, but the authors should be cautious in interpreting seasonal changes in gene expression as indicators of adaptive evolution. Such changes do not necessarily imply causal or adaptive associations, and this distinction should be clearly stated when discussing the results.

      Thank you for this feedback; we agree with your conclusion. While a second expression optimum in the shrew lineage is indicative of adaptive expression, we cannot fully determine whether these changes are caused by genetic or environmental factors, despite careful attention to experimental design. We have highlighted this as a limitation in the discussion.

      (4) Minor Editorial Comment:

      Line 105: "... maintenance of an energy budgets..." delete "an"

      We have removed this grammatical error.

    1. eLife Assessment

      The manuscript by de La Forest Divonne et al. offers an important and detailed exploration of the immune cells in the oyster Crassostrea gigas, by correlating distinct hemocyte morphotypes with specific single-cell transcriptional profiles. The evidence supporting the conclusion is convincing, deriving from the comprehensive dataset that not only captures unicellular diversity but also associates these cells with distinct immune roles, making it an invaluable resource for the broader research community.

    2. Reviewer #1 (Public review):

      Summary

      In this manuscript, De La Forest Divonne et al. build a repertory of hemocytes from adult Pacific oysters combining scRNAseq data with cytologic and biochemical analyses. Three categories of hemocytes were described previously in this species (i.e., blasts, hyalinocytes, and granulocytes). Based on scRNAseq data, the authors identified 7 hemocyte clusters presenting distinct transcriptional signatures. Using KEGG pathway enrichment and RBGOA, the authors determined the main molecular features of the clusters. In parallel, using cytologic markers, the authors classified 7 populations of hemocytes (i.e., ML, H, BBL, ABL, SGC, BGC, and VC) presenting distinct sizes, nucleus sizes, acidophilic/basophilic, presence of pseudopods, cytoplasm/nucleus ratio and presence of granules. Then, the authors compared the phenotypic features with potential transcriptional signatures seen in the scRNAseq. The hemocytes were separated in a density gradient to enrich for specific subpopulations. The cell composition of each cell fraction was determined using cytologic markers and the cell fractions were analysed by quantitative PCR targeting major cluster markers (two per cluster). With this approach, the authors could assign cluster 7 to VC, cluster 2 to H, and cluster 3 to SGC. The other clusters did not show a clear association with this experimental approach. Using phagocytic assays, ROS, and copper monitoring, the authors showed that ML and SGC are phagocytic, ML produces ROS, and SGC and BGC accumulate copper. Then with the density gradient/qPCR approach, the authors identified the populations expressing anti-microbial peptides (ABL, BBL, and H). Finally, the authors used Monocle to predict differentiation trajectories for each subgroup of hemocytes using cluster 4 as the progenitor subpopulation.

      The manuscript provides a comprehensive characterisation of the diversity of circulating immune cells found in Pacific oysters.

      Strengths

      The combination of scRNAseq, cytologic markers and gradient based hemocyte sorting offers an integrative view of the immune cell diversity.

      Hemocytes represent a very plastic cell population that has key roles in homeostatic and challenged conditions. Grasping the molecular features of these cells at the single-cell level will help understand their biology.

      This type of study may help elucidate the diversification of immune cells in comparative studies and evolutionary immunology.

      Weaknesses

      Several figures show inconsistency leading to erroneous conclusions and some conclusions are poorly supported. Moreover, the manuscript remains highly descriptive with limited comparison with the available literature.

      Comments on revisions:

      The authors replied to most comments.

    1. eLife Assessment

      This important study shows how the relative importance of inter-species interactions in microbiomes can be inferred from empirical species abundance data. The methods based on statistical physics of disordered systems are convincing and rigorous, and allow for distinguishing healthy and non-healthy human gut microbiomes via differences in their inter-species interaction patterns. This work should be of broad interest to researchers in microbial ecology and theoretical biophysics.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors develop a novel method to infer ecologically-informative parameters across healthy and diseased states of the gut microbiota, although the method is generalizable to other datasets for species abundances. The authors leverage techniques from theoretical physics of disordered systems to infer different parameters (mean and standard deviation for the strength of bacterial interspecies interactions, a bacterial immigration rate, and the strength of demographic noise) that describe the statistics of microbiota samples from two groups, one for healthy subjects and another for subjects with chronic inflammation syndromes. To do this, the authors simulate communities with a modified version of the Generalized Lotka-Volterra model and randomly-generated interactions, and then use a moment-matching algorithm to find sets of parameters that better reproduce the data for species abundances. They find that these parameters are different for the healthy and diseased microbiota groups. The results suggest, for example, that bacterial interaction strengths, relative to noise and immigration, are more dominant for microbiota dynamics in diseased states than in healthy states.

      We think that this manuscript brings an important contribution that will be of interest in the areas of statistical physics, (microbiota) ecology, and (biological) data science. The evidence of their results is solid and the work improves the state-of-the-art in terms of methods. There are a few weaknesses that, in our opinion, the authors could address to further improve the work.

      Strengths:

      (1) Using a fairly generic ecological model, the method can identify the change in the relative importance of different ecological forces (distribution of interspecies interactions, demographic noise, and immigration) in different sample groups. The authors focus on the case of the human gut microbiota, showing that the data are consistent with a higher influence of species interactions (relative to demographic noise and immigration) in a disease microbiota state than in healthy ones.

      (2) The method is novel, original, and it improves the state-of-the-art methodology for the inference of ecologically relevant parameters. The analysis provides solid evidence for the conclusions.

      Weaknesses:

      In the way it is written, this work might be mostly read by physicists. We believe that, with some rewriting, the authors could better highlight the ecological implications of the results and make the method more accessible to a broader audience.

    3. Reviewer #2 (Public review):

      Summary:

      This valuable work aims to infer, from microbiome data, microbial species interaction patterns associated with healthy and unhealthy human gut microbiomes. Using solid techniques from statistical physics, the authors propose that healthy and unhealthy microbiome interaction patterns substantially differ. Unhealthy microbiomes are closer to instability and single-strain dominance; whereas healthy microbiomes showcase near-neutral dynamics, mostly driven by demographic noise and immigration.

      Strengths:

      A well-written article, relatively easy to follow and transparent despite the high degree of technicality of the underlying theory. The authors provide a powerful inferring procedure, which bypasses the issue of having only compositional data.

      Weaknesses:

      (1) This sentence in the introduction seems key to me: "Focusing on single species properties as species abundance distribution (SAD), fail to characterise altered states of microbiome." Yet it is not explained what is meant by 'fail', and thus what the proposed approach 'solves'.

      (2) Lack of validation, following arbitrary modelling choices made (symmetry of interactions, weak-interaction limit, uniform carrying capacity).

      Inconsistent interpretation of instability. Here, instability is associated with the transition to the marginal phase, which becomes chaotic when interaction symmetry is broken. But as the authors acknowledge, the weak interaction limit does not reproduce fat-tailed abundance distributions found in data. On the other hand, strong interaction regimes, where chaos prevails, tend to do so (Mallmin et al, PNAS 2024). Thus, the nature of the instability towards which unhealthy microbiomes approach is unclear.

      (3) Three technical points about the methodology and interpretation.

      a) How can order parameters h and q0 be inferred, if in the compositional data they are fixed by definition?

      b) How is it possible that weaker interaction variance is associated with approach to instability, when the opposite is usually true?

      c) Having an idea of how the empirical data compare to the theoretical fits would be valuable.

      Implications:

      As the authors say, this is a proof of concept. They point at limits and ways to go forward, in particular pointing at ways in which species abundance distributions could be better reproduced by the predicted dynamical models. One implication that is missing, in my opinion, is the interpretability of the results, and what this work achieves that was missing from other approaches (see weaknesses section above): what do we learn from the fact that changes in microbial interactions characterise healthy from unhealthy microbiota? For instance, what does this mean for medical research?

    4. Reviewer #3 (Public review):

      Summary:

      I found the manuscript to be well-written. I have a few questions regarding the model, though the bulk of my comments are requests to provide definitions and additional clarity. There are concepts and approaches used in this manuscript that are clear boons for understanding the ecology of microbiomes but are rarely considered by researchers approaching the manuscript from a traditional biology background. The authors have clearly considered this in their writing of S1 and S2, so addressing these comments should be straightforward. The methods section is particularly informative and well-written, with sufficient explanations of each step of the derivation that should be informative to researchers in the microbial life sciences who are not well-versed with physics-inspired approaches to ecology dynamics.

      Strengths:

      The modeling efforts of this study primarily rely on a disordered form of the generalized Lotka-Volterra (gLV) model. This model can be appropriate for investigating certain systems, and the authors are clear about when and how more mechanistic models (i.e., consumer-resource) can lead to gLV. Phenomenological models such as this have been found to be highly useful for investigating the ecology of microbiomes, so this modeling choice seems justified, and the limitations are laid out.

      Weaknesses:

      The authors use metagenomic data of diseased and healthy patients that were first processed in Pasqualini et al. (2024). The use of metagenomic data leads me to a question regarding the role of sampling effort (i.e., read counts) in shaping model parameters such as $h$. This parameter is equal to the average of 1/# species across samples because the data are compositional in nature. My understanding is that $h$ was calculated using total abundances (i.e., read counts). The number of observed species is strongly influenced by sampling effort, so it would be useful if the number of reads were plotted against the number of species for healthy and diseased subjects.
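
      The check requested here can be sketched in a few lines. This is a minimal illustration of the reviewer's description only, in which $h$ is taken as the average of 1/(number of observed species) across samples; the function names and the count table are hypothetical, and the authors' actual pipeline may define the species pool differently.

```python
def richness_and_depth(counts):
    """Return per-sample read depth and number of observed species.

    `counts` is a list of samples, each a list of read counts per species.
    Every sample is assumed to contain at least one observed species.
    """
    depth = [sum(row) for row in counts]
    richness = [sum(1 for c in row if c > 0) for row in counts]
    return depth, richness

def mean_inverse_richness(counts):
    """Average of 1/S across samples, following the reviewer's description of h."""
    _, richness = richness_and_depth(counts)
    return sum(1.0 / s for s in richness) / len(richness)

# Made-up read counts (rows are samples, columns are species)
counts = [
    [120, 30, 0, 50],
    [10, 0, 0, 5],
    [300, 80, 20, 100],
]
depth, richness = richness_and_depth(counts)
h = mean_inverse_richness(counts)
print(list(zip(depth, richness)), round(h, 4))
```

      Plotting `depth` against `richness` separately for healthy and diseased subjects would show whether differences in sampling effort alone could shift the estimate of $h$ between groups.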

      However, the role of sampling effort can depend on the type of data, and my instinct about the role that sampling effort plays in species detection is primarily based on 16S data. The dependency between these two variables may be less severe for the authors' metagenomic pipeline. This potential discrepancy raises a broader issue regarding the investigation of microbial macroecological patterns and the inference of ecological parameters. Often microbial macroecology researchers rely on 16S rRNA amplicon data because that type of data is abundant and comparatively low-cost. Some in microbiology and bioinformatics are increasingly pushing researchers to choose metagenomics over 16S. Sometimes this choice is valid (discovery of new MAGs, investigating allele frequency changes within species, etc.); sometimes it is driven by the false equivalence "more data = better". The outcome, though, is that we have a body of more-or-less established microbial macroecological patterns which rest on 16S data and are now slowly incorporating results from metagenomics. To my knowledge, there has not been a systematic evaluation of the macroecological patterns that do and do not vary by one's choice in 16S vs. metagenomics. Several of the authors in this manuscript have previously compared the MAD shape for 16S and metagenomic datasets in Pasqualini et al. (2024), but moving forward, a more comprehensive study seems necessary.

      References

      Pasqualini, Jacopo, et al. "Emergent ecological patterns and modelling of gut microbiomes in health and in disease." PLOS Computational Biology 20.9 (2024): e1012482.

    5. Author response:

      Reviewer #1:

      Strengths:

      (1) Using a fairly generic ecological model, the method can identify the change in the relative importance of different ecological forces (distribution of interspecies interactions, demographic noise, and immigration) in different sample groups. The authors focus on the case of the human gut microbiota, showing that the data are consistent with a higher influence of species interactions (relative to demographic noise and immigration) in a disease microbiota state than in healthy ones.

      (2) The method is novel, original, and it improves the state-of-the-art methodology for the inference of ecologically relevant parameters. The analysis provides solid evidence for the conclusions.

      Weaknesses:

      In the way it is written, this work might be mostly read by physicists. We believe that, with some rewriting, the authors could better highlight the ecological implications of the results and make the method more accessible to a broader audience.

      We thank the reviewer for their positive and constructive feedback. We particularly appreciate the recognition of the novelty and robustness of our method, as well as the insight that it sheds light on the shifting ecological forces between healthy and diseased microbiomes. In response to the concern about the manuscript’s accessibility, we aim to revise key sections – including the Introduction, Results, and Discussion – to more clearly articulate the ecological relevance of our theoretical findings. We would like to emphasize that our approach offers a novel perspective for analyzing individual species' abundances, as well as for understanding interaction patterns and stability at the community level. By placing our results within a broader context accessible to readers from diverse backgrounds, we aim for the revised version to appeal to a wider audience, including ecologists and microbiome scientists, while preserving the rigor of our underlying statistical physics framework.

      Reviewer #2:

      Strengths:

      A well-written article, relatively easy to follow and transparent despite the high degree of technicality of the underlying theory. The authors provide a powerful inferring procedure, which bypasses the issue of having only compositional data. 

      Weaknesses:

      (1) This sentence in the introduction seems key to me: "Focusing on single species properties as species abundance distribution (SAD), it fails to characterise altered states of microbiome." Yet it is not explained what is meant by 'fail', and thus what the proposed approach 'solves'.

      (2) Lack of validation, following arbitrary modelling choices made (symmetry of interactions, weak-interaction limit, uniform carrying capacity). Inconsistent interpretation of instability. Here, instability is associated with the transition to the marginal phase, which becomes chaotic when interaction symmetry is broken. But as the authors acknowledge, the weak interaction limit does not reproduce fat-tailed abundance distributions found in data. On the other hand, strong interaction regimes, where chaos prevails, tend to do so (Mallmin et al, PNAS 2024). Thus, the nature of the instability that unhealthy microbiomes approach is unclear.

      (3) Three technical points about the methodology and interpretation. a) How can the order parameters h and q0 be inferred, if in the compositional data they are fixed by definition? b) How is it possible that weaker interaction variance is associated with an approach to instability, when the opposite is usually true? c) Having an idea of how the empirical data compare to the theoretical fits would be valuable.

      Implications: As the authors say, this is a proof of concept. They point at limits and ways to go forward, in particular pointing at ways in which species abundance distributions could be better reproduced by the predicted dynamical models. One implication that is missing, in my opinion, is the interpretability of the results, and what this work achieves that was missing from other approaches (see weaknesses section above): what do we learn from the fact that changes in microbial interactions distinguish healthy from unhealthy microbiota? For instance, what does this mean for medical research?

      We greatly appreciate the reviewer’s thoughtful analysis highlighting both the strengths and areas of ambiguity in our work.

      (1) To clarify the sentence on the limitations of species abundance distributions (SADs), we aim to explain in the revised version that while SADs summarize the relative abundance of individual species, they fail to capture the species-species correlations that we have shown (Seppi et al., Biomolecules 2023) to be more sensitive to the health state of the host. Our method thus focuses on the interaction statistics among species, providing insights into the underlying dynamics and stability of the microbiomes and their differences between healthy and unhealthy hosts.

      (2) Regarding model assumptions, we acknowledge that the weak interaction regime and symmetry hypotheses simplify the analysis and may not capture all empirical richness, such as fat-tailed distributions of species abundance. However, we interpret instability not as a path to chaos per se, but as a transition toward a multi-attractor phase, where each microbiome reaches a different fixed point. This is consistent with prior empirical findings invoking the “Anna Karenina principle”, where healthy microbiomes resemble one another, but disease states tend to deviate from this picture (see Pasqualini et al., PLOS Comp. Bio. 2024). We consider our framework as a starting point and agree that further extensions incorporating strong interaction regimes (as suggested by Mallmin et al., PNAS 2024) or relaxing other model assumptions could reveal even richer dynamical patterns. The computational pipeline we present can, in fact, be easily generalized to include different population dynamics models.

      On the technical questions: (a) While compositional data constrain relative abundances, we can still estimate diversity-dependent parameters (h and q0) using alpha-diversity statistics across samples, which show meaningful variation; (b) The counter-intuitive instability that the reviewer pointed out arises from the interplay between demographic stochasticity and quenched disorder. It is the combined contribution of these two factors in phase space – not either one alone – that drives the transition. For clarity, see Figure 1 in Altieri et al., Phys. Rev. Lett. 2021; (c) We plan to include plots that compare empirical data to theoretical model fits. This will help visualize how well the model captures observed microbial community properties and how, even with larger species interaction heterogeneity (σ) and larger demographic noise (T), healthy communities are more stable (i.e., more distant from the critical line), as measured by the replicon eigenvalue.

      Finally, regarding interpretability and implications: by showing that ecological interaction networks – not just species identities – differ between healthy and unhealthy states, our work suggests a conceptual shift. This could inform medical strategies aimed at restoring community-level stability rather than targeting individual microbes. In the revised Discussion section, we will elaborate on this point to better highlight its practical implications and outline potential directions for future research.

      Reviewer #3:

      Strengths:

      The modeling efforts of this study primarily rely on a disordered form of the generalized Lotka-Volterra (gLV) model. This model can be appropriate for investigating certain systems, and the authors are clear about when and how more mechanistic models (i.e., consumer-resource) can lead to gLV. Phenomenological models such as this have been found to be highly useful for investigating the ecology of microbiomes, so this modeling choice seems justified, and the limitations are laid out. 

      Weaknesses:

      The authors use metagenomic data of diseased and healthy patients that were first processed in Pasqualini et al. (2024). The use of metagenomic data leads me to a question regarding the role of sampling effort (i.e., read counts) in shaping model parameters such as h. This parameter is equal to the average of 1/# species across samples because the data are compositional in nature. My understanding is that it was calculated using total abundances (i.e., read counts). The number of observed species is strongly influenced by sampling effort, so it would be useful if the number of reads were plotted against the number of species for healthy and diseased subjects. However, the role of sampling effort can depend on the type of data, and my instinct about the role that sampling effort plays in species detection is primarily based on 16S data. The dependency between these two variables may be less severe for the authors' metagenomic pipeline. This potential discrepancy raises a broader issue regarding the investigation of microbial macroecological patterns and the inference of ecological parameters. Often microbial macroecology researchers rely on 16S rRNA amplicon data because that type of data is abundant and comparatively low-cost. Some in microbiology and bioinformatics are increasingly pushing researchers to choose metagenomics over 16S. Sometimes this choice is valid (discovery of new MAGs, investigating allele frequency changes within species, etc.); sometimes it is driven by the false equivalence "more data = better". The outcome, though, is that we have a body of more-or-less established microbial macroecological patterns which rest on 16S data and are now slowly incorporating results from metagenomics. To my knowledge, there has not been a systematic evaluation of the macroecological patterns that do and do not vary by one's choice in 16S vs. metagenomics. Several of the authors in this manuscript have previously compared the MAD shape for 16S and metagenomic datasets in Pasqualini et al., but moving forward, a more comprehensive study seems necessary.

      We thank the reviewer for this insightful and nuanced comment, which particularly highlights the broader methodological context of our data sources. Indeed, metagenomic sequencing introduces different biases with respect to 16S data. First, we would like to emphasize that we estimated the order parameters from the data by using relative abundances. Second, while the concern regarding the influence of sequencing depth and species diversity on the estimation of the order parameters is valid, we refer to a previous publication by some of the authors (Pasqualini et al., 2024; see Figure 4, panels g and h). There, we pointed out that the observed outcome is weakly influenced by sequencing depth in our dataset, while the main impact on the order parameters estimate comes from the species diversity of the two groups. In the same publication, we showed that other well-known patterns (species abundance distribution, mean abundance distribution) are also observed. Also, to mitigate the effect of the number of samples and sequencing depth, we estimated the order parameters by a bootstrap procedure (90% of samples for healthy and diseased groups, 5000 resamples), which resulted in the error bars in Figure 2.
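
      As an illustration only, the resampling procedure mentioned above can be sketched as follows. This is a minimal sketch under stated assumptions, not the pipeline used in the manuscript: it takes h to be the mean of 1/(number of species) across samples, as the reviewer describes, and it assumes that each resample draws 90% of the samples without replacement, which the response does not specify. The function names (`h_estimate`, `subsample_bootstrap`) and the species counts are hypothetical.

```python
import random
import statistics


def h_estimate(richness):
    """Mean of 1/S over samples, where S is the number of observed species."""
    return statistics.fmean(1.0 / s for s in richness)


def subsample_bootstrap(richness, fraction=0.9, n_resamples=5000, seed=0):
    """Return (mean, standard deviation) of h over repeated subsamples.

    Assumption: each resample draws `fraction` of the samples WITHOUT
    replacement; the source does not say which resampling scheme was used.
    """
    rng = random.Random(seed)
    k = max(1, round(fraction * len(richness)))
    estimates = [h_estimate(rng.sample(richness, k)) for _ in range(n_resamples)]
    return statistics.fmean(estimates), statistics.stdev(estimates)


# Hypothetical species counts per sample, for illustration only.
healthy = [120, 135, 110, 142, 128, 131, 117, 125, 139, 122]
diseased = [80, 95, 60, 102, 75, 88, 70, 91, 66, 84]

for label, group in (("healthy", healthy), ("diseased", diseased)):
    mean_h, sd_h = subsample_bootstrap(group)
    print(f"{label}: h = {mean_h:.5f} +/- {sd_h:.5f}")
```

      The spread of the resampled estimates plays the role of an error bar. Applying the same recipe to any other per-sample statistic would require only a different estimator in place of `h_estimate`.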

      We also fully agree with the broader call for a systematic comparison of macroecological patterns derived from 16S and metagenomic data. While some of us have already begun exploring this direction (e.g., Pasqualini et al., 2024), the reviewer’s comment highlights its significance and motivates us to pursue a more comprehensive, integrative analysis across data types. While we found qualitative agreement of these patterns with previous publications (e.g., Grilli, Nature Comm. 2020), we will acknowledge this as an important future direction in the Discussion section.

      References

      (1) Seppi, M., Pasqualini, J., Facchin, S., Savarino, E.V. and Suweis, S., 2023. Emergent functional organization of gut microbiomes in health and diseases. Biomolecules, 14(1), p.5.

      (2) Pasqualini, J., Facchin, S., Rinaldo, A., Maritan, A., Savarino, E. and Suweis, S., 2024. Emergent ecological patterns and modelling of gut microbiomes in health and in disease. PLOS Computational Biology, 20(9), p.e1012482.

      (3) Mallmin, E., Traulsen, A. and De Monte, S., 2024. Chaotic turnover of rare and abundant species in a strongly interacting model community. Proceedings of the National Academy of Sciences, 121(11), p.e2312822121.

      (4) Altieri, A., Roy, F., Cammarota, C. and Biroli, G., 2021. Properties of equilibria and glassy phases of the random Lotka-Volterra model with demographic noise. Physical Review Letters, 126(25), p.258301.

      (5) Grilli, J., 2020. Macroecological laws describe variation and diversity in microbial communities. Nature Communications, 11(1), p.4743.

    1. eLife Assessment

      This study makes an important contribution to the molecular mechanisms of neural circuit formation. The data convincingly show that the transcription factor Sp1 regulates ephrin-mediated axon guidance in the spinal cord. Although the authors show that Sp1 and its co-activators p300 and CBP are required to induce ephrin expression, additional discussion and/or experiments are needed to support the claims that Sp1 regulates cis-binding of Epha receptors, or that Sp1 controls ephrin expression in relevant motor neuron populations. The study will be of broad interest to developmental neurobiologists.

    2. Reviewer #1 (Public review):

      The manuscript by Liao et al investigates the mechanisms that induce ephrin expression in spinal cord lateral motor column (LMC) neurons to facilitate axon guidance into the dorsal and ventral limb. The authors show that Sp1 and its co-activators p300 and CBP are required to induce ephrin expression to modulate the responsiveness of motor neurons to external ephrin cues. The study is well done and convincingly demonstrates the role of Sp1 in motor neuron axon guidance.

      Further discussion and clarification of some results would further improve the study.

      (1) The mechanism that the authors propose (Figure 7) and is also supported by their data is that Sp1 induces ephrinA5 in LMCm and ephrinB2 in LMCl to attenuate inappropriate responses to external ephrins in the limb. Therefore, deletion of Sp1 should result in mistargeting of LMCl and LMCm axons, as shown in the mouse data, but no overt changes in the number of axons in the ventral and dorsal limb. From the mouse backfills, it seems that an equal number of LMCm/LMCl project into the wrong side of the limb. However, the chick data show an increase of axons projecting into the ventral limb in the Sp1 knockout. Is this also true in the mouse? The authors state that medial and lateral LMC neurons differ in their reliance on Sp1 function but that is not supported by the mouse backfill data (27% vs 32% motor neurons mistargeted). Also, the model presented in Figure 7 does not explain how Sp1 overexpression leads to axon guidance defects.

      (2) The authors do not directly show changes in ephrin expression in motor neurons, either in chick or mouse, after Sp1 knockout, which is the basis of their model. The experiment in Figure 4G seems to be Sp1 overexpression rather than knockdown (as mentioned in the results) and NSC-34 cells may not be relevant to motor neurons in vivo. NSC-34 experiments are also not described in the methods.

      (3) There is no information about how the RNA-sequencing experiment was done (which neurons were isolated, how, at what age, how many replicates, etc) so it is hard to interpret the resulting data.

      (4) It is unclear why the authors chose to use a Syn1-cre driver rather than a motor neuron restricted cre driver. Since this is a broad neuronal cre driver, the behavioral defects shown in Figure 7 may not be solely due to Sp1 deletion in motor neurons. Are there other relevant neuronal populations that express Sp1 that are targeted by this cre-mediated deletion?

    3. Reviewer #2 (Public review):

      Summary:

      This study shows that transcription factor Sp1 is required for correct ventral vs. dorsal targeting of limb-innervating LMC motor neurons using mouse and chick as model systems. In a wild-type embryo, lateral LMC axons specifically target dorsal muscles while medial LMC axons target ventral muscles. The authors convincingly show that this specificity is lost when Sp1 is knocked down or knocked out - axons of both lateral and medial LMC motor neurons project to both dorsal and ventral muscles in mutant conditions. The authors then conduct RNA-seq and ChIP experiments to show that Sp1 loss of function disrupts Ephrin-Epha receptor signaling pathway genes. These molecules are known to provide attractive or repulsive cues to guide LMC axons to their targets. The authors show that attraction/repulsion properties of medial and lateral LMC axons to specific Ephrin/Epha molecules are in fact disrupted in Sp1 mutants using ex vivo explant studies. Finally, the authors show that behaviors like coordinated movement and grip strength are also affected in Sp1 mutant mice. This study convincingly shows that Sp1 is important for correct circuit wiring of LMC neurons, and moves the field forward by elucidating a new level of transcriptional regulation required in this process. However, the claim made by the authors that the mode of Sp1-mediated regulation is through cis-attenuation of Epha activity is not well supported. These and additional strengths and weaknesses in approach and in data interpretation are discussed below.

      Strengths:

      (1) The study convincingly shows that wildtype levels of Sp1 are necessary for LMC axon targeting specificity. The combination of the following approaches is a strength:

      a) Both loss of function and gain of function experiments are performed for Sp1 and show complementary effects on the axon targeting phenotype.

      b) Retrograde labeling of LMC neurons from dorsal and ventral muscles shows that Sp1 mutants clearly lose the specificity of LMC axon targeting.

      c) The authors also use explant experiments to show that both loss of Sp1 and gain of Sp1 show clear changes in attraction and repulsion to specific ephrin and epha receptor molecules.

      d) The Sp1 loss and gain of function experiments are well controlled to show that the changes in axon wiring observed are not due to cell death, cell fate switches, or due to unequal numbers of medial and lateral LMC neurons being labeled in the experiments.

      (2) It is also convincing that Sp1 requires cofactors p300 and CBP for its function. In the absence of these cofactors, the gain of function phenotypes of Sp1 are subdued.

      Weaknesses:

      (1) The robustness of RNAseq and ChIP experiments is difficult to judge as methods are not described. For example, it is unclear if RNAseq is performed on purified motor neurons or on whole spinal cords. This is an important consideration as Sp1 is a broadly expressed protein.

      (2) The authors state that expression of Ephrin A5 and Ephrin B2 is reduced based on RNAseq data, however, it is not shown that this reduction occurs specifically in LMC neurons.

      (3) The authors show Sp1 ChIP peaks at the Ephrin B2 promoter, but nothing is mentioned about peaks at Ephrin A5 or other types of signaling molecules like Sema7a, which are also differentially expressed in Sp1 mutants. There is also no mention of the correlation between changes in gene expression seen in RNAseq data and the binding profile of Sp1 seen in ChIP data, which could help establish the robustness of these datasets.

      (4) The authors conclude that Sp1 functions by activating Ephrin A5 in medial LMC and Ephrin B2 in lateral LMC. The argument, as I understand it, is that this activation leads to cis attenuation of their respective Epha receptors and therefore targeting the correct muscle. Though none of the data presented go against this hypothesis, this hypothesis is also not fully supported. Specifically:

      a) It would be important to know that modulation of Sp1 expression leads to changes in EphrinA5 and B2 in LMC lateral/medial neurons.

      b) It would also be important to show that none of the other changes caused by Sp1 are responsible for axon mistargeting by performing rescue experiments with Ephrin A5 and Ephrin B2.

      c) To make the most convincing case, experiments showing increased or decreased cis-binding of Ephrin molecules with Epha receptors would be necessary. This study would still be compelling without this last experiment, but the language in the abstract would need to be modulated.

      (5) All behavior experiments are done in a pan-neuronal knockout of Sp1. As Sp1 is broadly expressed in neurons, a statement describing whether and why the authors think the phenotypes arise from Sp1's function in LMC motor neurons would be helpful. Experimentally, rescue experiments in which Sp1 is restored in LMC neurons or motor neurons would also make this claim more convincing.

    4. Reviewer #3 (Public review):

      Summary:

      This is a compelling study on the role of Sp1 in motor axon trajectory selection, demonstrating that Sp1 is both necessary and sufficient for correct axon guidance in the limb. Sp1 regulates ephrin ligand expression to fine-tune Eph/ephrin signaling in the lateral motor column (LMC) neurons.

      Strengths:

      The study integrates multiple approaches, including in ovo electroporation in chick embryos, conditional knockout mouse models, transcriptomic analyses, and functional assays such as stripe assays and behavioral testing, to provide robust evidence for Sp1's role in axon guidance mechanisms. The manuscript is well-written and scientifically rigorous, and the findings are of broad interest to the developmental neuroscience community.

      Weaknesses:

      Some aspects of the manuscript could be improved to enhance clarity, ensure logical flow, and strengthen the impact of the findings.

    5. Author response:

      Reviewer 1:

      (1) Clarification of axon mistargeting patterns and model interpretation

      We will clarify the apparent discrepancy between chick and mouse axon mistargeting data. Specifically, we will expand the explanation in the main text and Figure 7 legend and/or revise the model in Figure 7 to better reflect observed phenotypes and clarify how Sp1 overexpression contributes to mistargeting.

      (2) Evidence for Sp1-dependent ephrin expression

      We agree that demonstrating ephrin expression changes in motor neurons is essential. We will:

      Conduct in situ hybridization and/or immunostaining for ephrins in control and Sp1 mutant spinal cords from both chick and mouse embryos.

      Clarify and expand the methodological details of the NSC-34 cell experiments shown in Figure 4G.

      (3) RNA-seq experiment details

      We will revise the Methods section to provide additional experimental details.

      (4) Use of Syn1-cre

      We acknowledge concerns about the broad expression of Syn1-cre. To address this:

      We will clarify our rationale for using Syn1-cre and describe its expression pattern in the spinal cord.

      We are evaluating the feasibility of additional experiments using a motor neuron-specific Cre driver to confirm cell-type specificity.

      We will include a new paragraph in the Discussion addressing potential contributions from other neuronal populations.

      Reviewer 2:

      (1) & (2) Clarification and localization of RNA-seq data

      We will expand the Methods section to provide greater detail on the RNA-seq approach. In addition, we will validate ephrin downregulation in LMC neurons using in situ hybridization and/or immunostaining.

      (3) Integration of ChIP and RNA-seq data

      We will:

      Report additional ChIP peaks for ephrinA5 and other differentially expressed genes such as Sema7a.

      Add a summary figure that integrates ChIP and RNA-seq results to strengthen the link between Sp1 binding and transcriptional regulation.

      (4) Clarification of the cis-attenuation model

      We recognize that our data do not yet directly demonstrate Sp1’s role in cis-attenuation. To address this:

      We will revise the abstract and main text to frame Sp1's role in cis-attenuation as a hypothesis.

      We are exploring the feasibility of ephrinA5 and B2 rescue experiments in Sp1-deficient embryos to test specificity.

      (5) Behavioral phenotypes and cell-type specificity

      We will clarify that behavioral phenotypes may result from combined effects across neuron populations due to Syn1-cre expression. To address this:

      We are planning rescue experiments with Sp1 expression in chick embryos to test for rescue of axon misrouting.

      We will include a new paragraph in the Discussion to highlight this limitation and discuss alternative interpretations.

      Reviewer 3:

      We appreciate your positive evaluation and support for the rigor of our study.

      In response to your suggestions:

      We are revising the manuscript to improve clarity and flow, particularly the transitions between datasets.

      We will update Figure 7 and the associated text to more clearly convey the working model and avoid overinterpretation.

      We thank all reviewers for their constructive feedback and are committed to addressing each point thoroughly. All revisions will be clearly marked in the resubmitted manuscript.

    1. eLife Assessment

      This study offers valuable insights into the role of miR-283 in ventral-lateral neurons (LNvs) and its impact on senescence, cardiac function, and aging in the Drosophila melanogaster model. However, the evidence supporting some of the conclusions remains incomplete, and further mechanistic studies are needed to clarify how miR-283 affects normal aging and influences exercise adaptations. Nonetheless, the work can be of interest to cell biologists studying miRNA biology, aging, and age-related diseases.

    2. Reviewer #1 (Public review):

      In this study, Li et al. investigated the role of miR-283 in regulating cardiac aging and its potential contribution to age-related bradyarrhythmia. Using Drosophila as a model, the authors demonstrated that systemic overexpression or knockdown of miR-283 induced age-associated bradycardia. Notably, the study found that miR-283 knockdown in ventral-lateral neurons (LNvs), rather than in the heart, was sufficient to induce bradyarrhythmia, an effect the authors linked to the upregulation of miR-283 expression in both the brain and heart. The study also explored the beneficial impact of exercise on cardiac aging, showing that endurance training mitigated bradyarrhythmia, correlating with reduced miR-283 accumulation in the brain and myocardium.

      The conclusions of this paper are mostly well supported by data; however, some concerns arise from the unexpected finding that bradyarrhythmia was triggered by miR-283 knockdown in LNvs rather than in the heart, suggesting a non-cell-autonomous mechanism. A more precise mechanistic explanation linking miR-283 dysregulation in LNvs to cardiac dysfunction would strengthen the study's conclusions. While the authors propose cwo as a potential target of miR-283, no functional experiments were conducted to confirm its role in mediating miR-283's effects. Additionally, it remains unclear whether reduced miR-283 levels in LNvs lead to accelerated aging rather than a cardiac-specific effect. Likewise, the potential influence of miR-283 on the circadian clock and its broader impact on aging warrant further investigation.

      Major Comments:

      (1) A significant concern arises from the unexpected outcome observed in miR-283 knockdown in LNvs, which suggests a non-cell-autonomous mechanism. Elucidating the mechanisms by which miR-283 deficiency leads to the observed phenotypes would provide a more comprehensive understanding of the study's implications.

      (2) The authors propose cwo as a potential target of miR-283; however, no functional experiments were conducted to confirm its role in mediating miR-283's effects. Similarly, direct evidence demonstrating that cwo is a bona fide target of miR-283 in LNvs should be provided.

      (3) It remains unclear whether miR-283 knockdown in LNvs results in accelerated aging rather than a cardiac-specific effect. This hypothesis is supported by observations that pdf>miR-283SP animals exhibit systemic premature senescence (elevated SA-β-gal activity in both the heart and brain), cardiac dysfunction, impaired climbing ability, and reduced lifespan.

      (4) The finding that reduced miR-283 levels in LNvs lead to accelerated aging raises an important, yet unexplored, question: does miR-283 influence the circadian clock, thereby broadly affecting aging?

      Two aspects of this question should be addressed:

      (a) Is the circadian rhythm disrupted in miR-283 knockdown experiments?

      (b) Do circadian rhythm defects impact aging?

      (5) The authors state that miR-283 knockdown in LNvs led to bradyarrhythmia, which was mainly caused by miR-283 upregulation in the whole brain and heart. However, it is unclear which experiments support this conclusion. Could the authors clarify this point?

      (6) Given that miR-283 expression varies with age, could the upregulation of miR-283 in both the brain and heart be a consequence of accelerated aging rather than a specific effect of miR-283 knockdown in LNvs?

      (7) While the beneficial effects of exercise on cardiac function appear clear, the claim that this effect is mediated through miR-283 function in LNvs seems premature. The data suggest that exercise-induced improvement occurs in both wild-type and miR-283-SP animals, raising the possibility that exercise acts through a miR-283-independent mechanism.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript presents findings that indicate a role in controlling Drosophila heart rate for a conserved miRNA (miR-283 in flies). Further, the manuscript localizes the relevant tissue for the function of this miRNA to a subset of neurons that are heavily involved in circadian regulation, thus presenting an interesting mechanistic link between the circadian system and heart rate. Either ubiquitous knockout or ubiquitous overexpression negatively impacts several aspects of heart performance, with a pronounced effect on heart rate. Interestingly, knockdowns in the heart itself are innocuous, but knockdown in LNvs neurons recapitulates the effect on heart rate. The authors use bioinformatics to identify the clockwork orange (cwo) gene as a potential target and validate that cwo expression is reduced when miR-283 is knocked down in LNvs neurons in vivo and also validate that cwo is regulated by miR-283 in cell culture luciferase assays. Exercise shows a modest ability to restore normal cwo expression and a trend toward an effect on survival, but shows a much stronger rescue of the heart rate phenotype.

      Strengths:

      Evidence is strong for the effect of miR-283 in pdf-positive neurons on the control of heart rate and for cwo as a downstream effector of miR-283.

      Work to identify specific targets of miR-283 is well-done and successfully identified a key downstream regulator in cwo.

      The potential mechanism using miR-283 to link circadian neurons to heart rate regulation is novel and exciting.

      Weaknesses:

      The evidence that this is related to normal aging is rather weak, and the effect of exercise on the observed parameters is small and not necessarily working through the miR-283/cwo mechanism.

      The authors seem to be conflating two hypotheses in their interpretations. Is miR-283 working through circadian mechanisms or age-related mechanisms? While it is true that aging tends to reduce heart rate, I don't think that means that any intervention that reduces heart rate is causing "senescence". Similarly, reduced survival in miR-283 knockdown flies does not prove that miR-283 promotes healthy aging per se, just that miR-283 is required for health regardless of age.

      Survival reduction is quite modest which does not necessarily support the idea that the bradycardia is causing major health issues or premature senescence for the flies. The interpretation of the longevity experiments throughout the manuscript seems overstated.

      The study would benefit greatly from a direct test of the authors' proposed pathway for exercise to improve bradycardia.

      The statement in the discussion "inducing endurance exercise of anti gravity climbing in flies with miR-283 knockdown in LNvs can improve bradyarrhythmic features by decreasing brain miR-283 expression" is not fully supported by data in the paper. There is an association there, but it cannot be said to be the full cause (or even required) without doing more experiments.

      The summary figure includes both data-supported mechanistic relationships and mechanisms that are inferred or assumed.

    1. eLife Assessment

      This study presents useful findings on the role of AFD thermosensory neurons in locomotory behaviours. The study appears solid with respect to parsing out the non-thermosensory role of AFD and also brings to light the role of AFD and AIB (linked through electrical synapses) in tactile-dependent locomotory modulation.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Rosero and Bai examined how the well-known thermosensory neuron in C. elegans, AFD, regulates context-dependent locomotory behavior based on tactile experience. They show that AFD regulates this locomotory behavior using discrete cGMP signalling molecules and independently of its dendritic sensory endings. The authors also show that AFD's connection to one of the hub interneurons, AIB, through gap junctions/electrical synapses, is necessary and sufficient for this context-dependent locomotion modulation.

      Strengths:

      This is an interesting paper showcasing how a sensory neuron in C. elegans can employ a distinct set of molecular strategies and different physical parts to regulate a completely distinct set of behaviors, which had not previously been shown to be regulated by AFD. The experiments were well performed and the results are clear. However, there are some questions about the mechanism of this regulation. This reviewer thinks that the authors should address these concerns before the final published version of this manuscript.

      Weaknesses:

      (1) The authors argued about the role of prior exposure to different physical contexts which might be responsible for the difference in their locomotory behaviour. However, the worms in the binary chamber (with both non-uniformly sized and spaced pillars) experienced both sets of pillars for one hour prior to the assay and they were also free to move between two sets of environments during the assay. So, this is not completely a switch between two different types of tactile barriers (or not completely restricted to prior experience), but rather a difference between experiencing a more complex environment vs a simple uniform environment. They should rephrase their findings. To strictly argue about the prior experience, the authors need to somehow restrict the worms from entering the uniform assay zone during the 1hr training period.

      (2) The authors here argued that the sensory endings of AFD are not required for this novel role of AFD in context-dependent locomotion modulation. However, gcy-18 has been shown to be exclusively localized to the ciliated sensory endings of AFD, and even misexpression of GCY-18 in other sensory neurons leads to localization in sensory endings (Nguyen et al., 2014 and Takeishi et al., 2016). They should check whether gcy-18 or tax-2 gets mislocalized in kcc-3 or tax-1 mutants.

      (3) MEC-10 was shown to be required for physical space preference through its action in FLP and not the TRNs (PMID: 28349862). Since FLP is involved in harsh touch sensation while TRNs are involved in gentle touch sensation, which are the neuron types responsible for tactile sensation in the assay arena? Does mec-10 rescue in TRNs rescue the phenotype in the current paper?

      (4) The authors mention that the most direct link between TRNs and AFD is through AIB, but as far as I understand, there are no reports to suggest synapses between TRNs and AIB. However, FLP and AIB are connected through both chemical and electrical synapses, which would make more sense as per their mec-10 data. (The authors mention the FLP-AIB-AFD circuit in their discussion but discuss TRNs as the sensory modality.) A mec-10 rescue experiment in TRNs would clarify this ambiguity.

      (5) Does expressing inx-7 or inx-10 in AFD and AIB under cell-specific promoters rescue the behaviour?

      (6) How is the function of the guanylyl cyclase gcy-18 related to the electrical synapse activity between AFD and AIB? Is AFD downstream or upstream of AIB in this context?

    3. Reviewer #2 (Public review):

      Summary:

      The goal of the study was to uncover the mechanisms mediating tactile-context-dependent locomotion modulation in C. elegans, which represents an interesting model of behavioral plasticity. Starting from a candidate genetic screen focusing on guanylate cyclase (GCY) mutants, the authors identified the AFD-specific gcy-18 gene as essential for tactile-context-dependent locomotion modulation. AFD is primarily characterized as a thermo-sensory neuron. However, key thermosensory transduction genes and the sensory ending structure of AFD were shown here to be dispensable for tactile-context locomotion modulation. AFD actuates tactile-context locomotion modulation via the cell-autonomous actions of GCY-18 and the CNG-3 cyclic nucleotide-gated channel, and via AFD's connection with AIB interneurons through electrical synapses. This represents a potentially relevant synaptic connection linking AFD to the mechanosensory-behavior circuit.

      Strengths:

      (1) The fact that AFD mediates tactile-context locomotion modulation is new, rather surprising, and interesting.

      (2) The authors have combined a very clever microfluidic-based behavioral assay with a large set of genetic manipulations to dissect the molecular and cellular pathways involved. Rescue experiments with single-copy transgenes are very convincing.

      (3) The study is very clearly written, and figures are nicely illustrated with diagrams that effectively convey the authors' interpretation.

      Weaknesses:

      (1) Whereas GCY-18 in AFD and the AFD-AIB synaptic connection clearly play a role in tactile-context locomotion modulation, whether and how they actually modulate the mechanosensory circuit and/or locomotion circuit remains unclear. The possibility of non-synaptic communication linking mechanosensory neurons and AFD (in either direction) was not explored. Thus, in the end, we have not learned much about what GCY-18 and the AFD-AIB module are doing to actuate tactile context-dependent locomotion modulation.

      (2) The authors only focused on speed readout, and we don't know if the many behavioral parameters that are modulated by tactile context are also under the control of AFD-mediated modulation.

      (3) The AFD-AIB gap junction reconstruction experiment was conducted in an innexin double mutant background, in which the whole nervous system's functioning might be severely impaired, and its results should be interpreted with this limitation in mind.

    4. Reviewer #3 (Public review):

      Summary:

      Rosero and Bai report an unconventional role of AFD neurons in mediating tactile-dependent locomotion modulation, independent of their well-established thermosensory function. They partially elucidate the signaling mechanisms underlying this AFD-dependent behavioral modulation. The regulation does not require the sensory dendritic endings of AFD but rather the AFD neurons themselves. This process involves a distinct set of cGMP signaling proteins and CNG channel subunits separate from those involved in thermosensation or thermotaxis. Furthermore, the authors demonstrate that AIB interneurons connect AFD to mechanosensory circuits through electrical synapses. They conclude that, beyond its primary function in thermosensation, AFD contributes to context-dependent neuroplasticity and behavioral modulation via broader circuit connectivity.

      While the discovery of multifunctionality in AFD is not entirely unexpected, given the limited number of neurons in C. elegans (302 in total), the molecular and cellular mechanisms underlying this AFD-dependent behavioral modulation, as revealed in this study, provide valuable insights into the field.

      Strengths:

      (1) The authors uncover a novel role of AFD neurons in mediating tactile-dependent locomotion modulation, distinct from their well-established thermosensory function.

      (2) They provide partial insights into the signaling mechanisms underlying this AFD-dependent behavioral modulation.

      (3) The neural behavior assays utilizing two types of microfluidic chambers (uniform and binary chambers) are innovative and well-designed.

      (4) By comparing AFD's role in locomotion modulation to its thermosensory function throughout the study, the authors present strong evidence supporting these as two independent functions of AFD.

      (5) The finding that AFD contributes to context-dependent behavioral modulation is significant, further reinforcing the growing evidence that individual neurons can serve multiple functions through broader circuit connectivity.

      Weaknesses:

      (1) Limited Behavioral Assays: The study relies solely on neural behavior assays conducted using two types of microfluidic chambers (uniform and binary chambers) to assess context-dependent locomotion modulation. No additional behavioral assays were performed. To strengthen the conclusions, the authors should validate their findings using an independent method, at the very least by testing AFD-ablated animals and gcy-18 mutants with a second behavioral approach.

      (2) Clarity in Behavioral Assay Methodology: The methodology for conducting the behavioral assays is unclear. It appears that worms were free to move between the exploration and assay zones, with no control over the duration each worm spent in either zone. This lack of regulation may introduce variability in tactile experience across individuals, potentially affecting the reproducibility and quantitativeness of the method. The authors should clarify whether and how they accounted for this variability.

      (3) Potential Developmental and Behavioral Confounds in Mutant Analysis: Several neuronal mutant strains were used in this study, yet the effects of these mutations on development and general behavior (e.g., movement ability) were not discussed. Although young adult worms were used for behavioral assays, were they at similar biological ages? To rule out confounding factors, locomotion assays assessing movement ability should be conducted (see reference PMID 25561524).

      (4) Definition and Baseline Measurements for Locomotion Categories: The finding that tax-4 and kcc-3 contribute to basal locomotion but not to context-dependent locomotion modulation is intriguing. The authors argue that distinct mechanisms regulate these two processes; however, the study does not clearly define the concepts of "basal locomotion" and "context-dependent locomotion," nor does it provide baseline measurements. A clear definition and baseline data are needed to support this conclusion.

    1. eLife Assessment

      This important study provides a description of how single-neuron firing rates in the human medial temporal lobe and frontal cortex are modulated by theta-burst stimulation of the basolateral amygdala. The results are supported by solid evidence obtained from a rigorous task design and analysis of an incredibly rare dataset. The results may help guide future studies incorporating amygdala stimulation to improve patient health. Additional analyses could have been performed, and additional experimental details included, to address open questions related to mechanistic effects of the stimulation protocol on single unit properties and memory-related behavior.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Campbell et al. assess how intracranial theta-burst stimulation (TBS) applied to the basolateral amygdala in 23 epilepsy patients affects neuronal spiking in the medial temporal lobe and prefrontal cortex during a visual recognition memory task.

      Strengths:

      This is an incredibly rare dataset; collecting single-unit spiking data from behaving humans during active intracranial stimulation is a Herculean task, with immense potential for translational studies of how stimulation may be applied to modulate biological mechanisms of memory. The authors utilize careful, high-quality methodology throughout (e.g. task design, spike recording and sorting, statistical analysis), providing high confidence in the validity of their findings.

      Weaknesses:

      (1) This is an exploratory study that doesn't explore quite enough. Critically, the authors make a point of mentioning that neuronal firing properties vary across cell types, but only use baseline firing rate as a proxy metric for cell type. This leaves several important explorations on the table, not limited to the following:

      a) Do waveform shape features, which can also be informative of cell type, predict the effect of stimulation?

      b) Is the autocorrelation of spike timing, which can be informative about temporal dynamics, altered by stimulation? This is especially interesting if theta-burst stimulation either entrains theta-rhythmic spiking or is more modulatory of endogenously theta-modulated units.

      c) The authors reference the relevance of spike-field synchrony (30-55 Hz) in animal work, but ignore it here. Does spike-field synchrony (comparing the image presentation to post-stimulation) change in this frequency range? This does not seem beyond the scope of investigation here.

      d) How does multi-unit activity respond to stimulation? At this somewhat low count of neurons (total n=156 included) it would be valuable to provide input on multi-unit responses to stimulation as well.

      e) Several intracranial studies have implicated proximity to white matter in determining the effects of stimulation on LFPs; do the authors see an effect of white matter proximity here?

      (2) It is a little confusing to interpret stimulation-induced modulation of neuronal spiking in the absence of stimulation-induced change in behavior. How do the authors' findings tell us anything about the neural mechanisms of stimulation-modulated memory if memory isn't altered? In line with point #1, I would suggest a deeper dive into behavior (e.g. reaction time? Or focus on individual sessions that do change in Figure 4A?) to make a stronger statement connecting the neural results to behavioral relevance.

      (3) It is not clear to me why the assessment of firing rates after image onset and after stim offset is limited to one second - this choice should be more theoretically justified, particularly for regions that spike as sparsely as these.

      (4) This work coincides with another example of human intracranial stimulation investigating the effect on firing rates (doi: https://doi.org/10.1101/2024.11.28.625915). Given how incredibly rare this type of work is, I think the authors should discuss how their work converges with this work (or doesn't).

      (5) What information does the pseudo-population analysis add? It's not totally clear to me.
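
      The spike-timing autocorrelation raised in point (1b) can be sketched as follows. This is a minimal illustration, not the authors' analysis: the function name, the 20 ms bin width, the 0.5 s maximum lag, and the synthetic spike train are all assumptions chosen for the example.

```python
def autocorrelogram(spike_times_s, bin_s=0.02, max_lag_s=0.5):
    """Histogram of positive lags between all spike pairs of one unit.

    spike_times_s: spike times in seconds (hypothetical input format).
    Returns a list of counts, one per lag bin of width bin_s.
    """
    n_bins = int(round(max_lag_s / bin_s))
    counts = [0] * n_bins
    times = sorted(spike_times_s)
    for i, t in enumerate(times):
        for u in times[i + 1:]:
            lag = u - t
            if lag >= max_lag_s:
                break  # times are sorted, so later spikes are even further away
            # Clamp to guard against floating-point edge cases at the last bin.
            counts[min(int(lag / bin_s), n_bins - 1)] += 1
    return counts


# Synthetic, perfectly rhythmic unit firing every 125 ms (8 Hz, theta range).
rhythmic_unit = [k * 0.125 for k in range(40)]
acg = autocorrelogram(rhythmic_unit)
peak_bin = max(range(len(acg)), key=acg.__getitem__)
print("peak lag bin starts at", round(peak_bin * 0.02, 3), "s")
```

      A theta-rhythmic unit shows autocorrelogram peaks at multiples of its period. Comparing such histograms for spikes before and after stimulation would be one way to ask whether stimulation changes rhythmicity.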

    3. Reviewer #2 (Public review):

      Summary:

      This study presents a valuable characterization of the effects of intracranial theta-burst stimulation of the basolateral amygdala on single-unit spiking activity in several areas of the human brain associated with memory processing. It is written clearly and concisely, allowing readers to fully understand the analysis used.

      The authors used a visual recognition memory task previously employed by their group to characterize the effects of basolateral amygdala stimulation upon memory consolidation (Inman et al, 2018). This current report is an interesting analysis to complement the results reported in the 2018 paper.

      Strengths:

      Rare combination of human neurophysiology and behavior: The type of experiment performed in the manuscript, which combines neurophysiological data, behavior, and a deep brain stimulation (DBS) intervention, is incredibly rare and takes many years to accomplish, with tight collaboration between clinical and research teams. Our understanding of spiking dynamics of human neurons is very limited, and this report is an important piece in the puzzle that allows DBS to be used in future interventions that will benefit patients' health.

      Multiple brain areas included: It's important to note that the report analyzes brain areas with which the amygdala has extensive connections (Fig. 1A): hippocampus, OFC, amygdala, and ACC. It seems that neurons in all these areas were modulated by the stimulation, except the ACC, in which firing rates were so low that only a handful of neurons were included in the analysis. This is an important demonstration that low-amplitude stimulation (even when reduced to 0.5 mA) can travel far and wide across the human brain.

      The experiment is cleverly designed to tease apart responses due to visual stimuli (image presentation) and electrical stimulation. The authors suggest that the units modulated by stimulation are largely distinct from those responsive to image offset during trials without stimulation. The subpopulation that responds strongly also tends to have a higher baseline firing rate. It's important to add that the chosen modulation index is more likely to be significant in neurons with higher firing rates.
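
      The last point, that a modulation index reaches significance more easily in high-rate neurons, can be illustrated with a minimal sketch. It assumes an index of the form (post - pre) / (post + pre) and Poisson spiking, neither of which is confirmed by this review; the trial count, the one-second window, and the function names are hypothetical.

```python
import math


def modulation_index(pre_count, post_count):
    """Assumed index: (post - pre) / (post + pre); 0.0 if no spikes at all."""
    total = pre_count + post_count
    return 0.0 if total == 0 else (post_count - pre_count) / total


def poisson_z(pre_count, post_count):
    """Approximate z-score for a difference between two Poisson counts.

    Given the total count, the split between windows is binomial with
    p = 0.5 under the null, so the difference has variance equal to the total.
    """
    total = pre_count + post_count
    return 0.0 if total == 0 else (post_count - pre_count) / math.sqrt(total)


def expected_counts(baseline_hz, fractional_change, window_s=1.0, n_trials=40):
    """Expected summed spike counts before and after stimulation (hypothetical design)."""
    pre = baseline_hz * window_s * n_trials
    post = pre * (1.0 + fractional_change)
    return pre, post


# Same 50% rate increase applied to units with different baseline rates.
for rate_hz in (0.5, 2.0, 8.0):
    pre, post = expected_counts(rate_hz, 0.5)
    print(f"{rate_hz:>4} Hz  MI={modulation_index(pre, post):.3f}  z={poisson_z(pre, post):.2f}")
```

      Under these assumptions the index is 0.2 at every baseline rate, while the z-score grows with the square root of the spike count, so only the higher-rate units would cross a conventional threshold of 1.96.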

      Weaknesses:

      Readers would benefit from more detail on the locations chosen for stimulation, in light of previous studies that found that effects differ with proximity to white matter (for example, PMID 32446925, Mohan et al., Brain Stimul. 2020, and PMID 33279717, Mankin et al., Brain Stimul. 2021).

    1. eLife Assessment

      This valuable paper provides refined gene expression datasets for 52 neuron classes in C. elegans using a new method that takes advantage of the complementary strengths of bulk sequencing of flow-sorted cells and single-cell sequencing. In general, support for the paper's findings is convincing. However, more rigorous consideration of some of the method's statistical assumptions and validation of the predicted gene sets would improve the work.

    2. Reviewer #1 (Public review):

      This is an interesting manuscript aimed at improving the transcriptome characterization of 52 C. elegans neuron classes. Previous single-cell RNA seq studies already uncovered transcriptomes for these, but the data are incomplete, with a bias against genes with lower expression levels. Here, the authors use cell-specific reporter combinations to FACS purify neurons and bulk RNA sequencing to obtain better sequencing depth. This reveals more rare transcripts, as well as non-coding RNAs, pseudogenes, etc. The authors develop computational approaches to combine the bulk and scRNA transcriptome results to obtain more definitive gene lists for the neurons examined.

      To ultimately understand features of any cell, from morphology to function, an understanding of the full complement of the genes it expresses is a pre-requisite. This paper gets us a step closer to this goal, assembling a current "definitive list" of genes for a large proportion of C. elegans neurons. The computational approaches used to generate the list are based on reasonable assumptions, the data appear to have been treated appropriately statistically, and the conclusions are generally warranted. I have a few issues that the authors may choose to address:

      (1) As part of getting rid of cross-contamination in the bulk data, the authors model the scRNA data, extrapolate it to the bulk data and subtract out "contaminant" cell types. One wonders, however, given that low expressed genes are not represented in the scRNA data, whether the assignment of a gene to one or another cell type can really be made definitive. Indeed, it's possible that a gene is expressed at low levels in one cell, and high levels in another, and would therefore be considered a contaminant. The result would be to throw out genes that actually are expressed in a given cell type. The definitive list would therefore be a conservative estimate, and not necessarily the correct estimate.

      (2) It would be quite useful to have tested some genes with lower expression levels using in vivo gene-fusion reporters to assess whether the expression assignments hold up as predicted. i.e. provide another avenue of experimentation, non-computational, to confirm that the decontamination algorithm works.

      (3) In many cases, each cell class would be composed of at least 2 if not more neurons. Is it possible that differences between members of a single class would be missed by applying the cleanup algorithms? Such transcripts would be represented only in a fraction of the cells isolated by scRNAseq, and might then be considered not real.

      (4) I didn't quite catch whether the precise staging of animals was matched between the bulk and scRNAseq datasets. Importantly, there are many genes whose expression is highly stage-specific or age-specific so even slight temporal differences might yield different sets of gene expression.

      (5) To what extent does FACS sorting affect gene expression? Can the authors provide some controls?

    3. Reviewer #2 (Public review):

      Summary:

      This study from the CeNGEN consortium addresses several limitations of single-cell RNA (scRNA) and bulk RNA sequencing in C. elegans with a focus on cells in the nervous system. scRNA datasets can give very specific expression profiles, but detecting rare and non-polyA transcripts is difficult. In contrast, bulk RNA sequencing on isolated cells can be sequenced to high depth to identify rare and non-polyA transcripts but frequently suffers from RNA contamination from other cell types. In this study, the authors generate a comprehensive set of bulk RNA datasets from 53 individual neurons isolated by fluorescence-activated cell sorting (FACS). The authors combine these datasets with a previously published scRNA dataset (Taylor et al., 2021) to develop a novel method, called LittleBites, to estimate and subtract contamination from the bulk RNA data. The authors validate the method by comparing detected transcripts against gold-standard datasets on neuron-specific and non-neuronal transcripts. The authors generate an "integrated" list of protein-coding expression profiles for the 53 neuron sub-types, with fewer but higher confidence genes compared to expression profiles based only on scRNA. Also, the authors identify putative novel pan-neuronal and cell-type specific non-coding RNAs based on the bulk RNA data. LittleBites should be generally useful for extracting higher confidence data from bulk RNA-seq data in organisms where extensive scRNA datasets are available. The additional confidence in neuron-specific expression and non-coding RNA expands the already great utility of the neuronal expression reference atlas generated by the CeNGEN consortium.

      Strengths:

      The study generates and analyzes a very comprehensive set of bulk RNA datasets from individual fluorescently tagged transgenic strains. These datasets are technically challenging to generate and significantly expand our knowledge of gene expression, particularly in cells that were poorly represented in the initial scRNA-seq datasets. Additionally, all transgenic strains are made available as a resource from the Caenorhabditis Genetics Center (CGC).

      The study uses the authors' extensive experience with neuronal expression to benchmark their method for reducing contamination utilizing a set of gold-standard validated neuronal and non-neuronal genes. These gold-standard genes will be helpful for benchmarking any C. elegans gene expression study.

      Weaknesses:

      The bulk RNA-seq data collected by the authors has high levels of contamination and, in some cases, is based on very few cells. The methodology to remove contamination partly makes up for this shortcoming, but the high background levels of contaminating RNA in the FACS-isolated neurons limit the confidence in cell-specific transcripts.

      The study does not experimentally validate any of the refined gene expression predictions, which was one of the main strengths of the initial CeNGEN publication (Taylor et al., 2021). No validation experiments (e.g., fluorescence reporters or single molecule FISH) were performed for protein-coding or non-coding genes, which makes it difficult for the reader to assess how much gene predictions are improved, other than for the gold standard set, which may have specific characteristics (e.g., bias toward high expression as they were primarily identified in fluorescence reporter experiments).

      The study notes that bulk RNA-seq data, in contrast to scRNA-seq data, can be used to identify which isoforms are expressed in a given cell. However, no analysis or genome browser tracks were supplied in the study to take advantage of this important information. For the community, isoform-specific expression could guide the design of cell-specific expression constructs or for predictive modeling of gene expression based on machine learning.

    4. Reviewer #3 (Public review):

      The manuscript by Barrett et al. "Integrating bulk and single cell RNA-seq refines transcriptomic profiles of individual C. elegans neurons" presents a comprehensive approach to integrating bulk RNA-seq and single-cell RNA-seq (scRNA-seq) data to refine transcriptomic profiles of individual C. elegans neurons. The study addresses the limitations of scRNA-seq, such as the under-detection of lowly expressed and non-polyadenylated transcripts, by leveraging the sensitivity of bulk RNA-seq. The authors deploy a computational method, LittleBites, to remove non-neuronal contamination in bulk RNA-seq, that aims to enhance specificity while preserving the sensitivity advantage of bulk sequencing. Using this approach, the authors identify lowly expressed genes and non-coding RNAs (ncRNAs), many of which were previously undetected in scRNA-seq data.

      Overall, the study provides high-resolution gene expression data for 53 neuron classes, covering a wide range of functional modalities and neurotransmitter usage. The integrated dataset and computational tools are made publicly available, enabling community-driven testing of the robustness and reproducibility of the study. Nevertheless, while the study represents a relevant contribution to the field, certain aspects of the work require further refinement to ensure the robustness and rigor necessary for peer-reviewed publication. Below, I outline the areas where improvements are needed to strengthen the overall impact and reliability of the findings.

      (1) The study relies on thresholding to determine whether a gene is expressed or not. While this is a common practice, the choice of threshold is not thoroughly justified. In particular, the choice of two uniform cutoffs across protein-encoding RNAs and of one distinct threshold for non-coding RNAs is somewhat arbitrary and has several limitations. This reviewer recommends that the authors attempt to use adaptive thresholding methods that define gene expression thresholds on a per-gene basis. Some of these methods include GiniClust2, Brennecke's variance modeling, HVG in Seurat, BASiCS, and/or MAST Hurdle model for dropout correction.

      (2) Most importantly, the study lacks independent experimental validation (e.g., qPCR, smFISH, or in situ hybridization) to confirm the expression of newly detected lowly expressed genes and non-coding RNAs. This is particularly important for validating novel neuronal non-coding RNAs, which are primarily inferred from computational approaches.

      (3) The novel biology is somewhat limited. One potential area of exploration would be to look at cell-type specific alternative splicing events.

      (4) The integration method disproportionately benefits neuron types with limited representation in scRNA-seq, meaning well-sampled neuron types may not show significant improvement. The authors should quantify the impact of this bias on the final dataset.

      (5) The authors employ a logit transformation to model single-cell proportions into count space, but they need to clarify its assumptions and potential pitfalls (e.g., how it handles rare cell types).

      (6) The LittleBites approach is highly dependent on the accuracy of existing single-cell references. If the scRNA-seq dataset is incomplete or contains classification biases, this could propagate errors into the bulk RNA-seq data. The authors may want to discuss potential limitations and sensitivity to errors in the single-cell dataset, and it is critical to define minimum quality parameters (e.g. via modeling) for the scRNAseq dataset used as reference.

      (7) Also very important, the LittleBites method could benefit from a more intuitive explanation and schematic to improve accessibility for non-computational readers. A supplementary step-by-step breakdown of the subtraction process would be useful.

      (8) In the same vein, the ROC curves and AUROC comparisons should have clearer annotations to make results more interpretable for readers unfamiliar with these metrics.

      (9) Finally, after the correlation-based decontamination of the 4,440 'unexpressed' genes, how many were ultimately discarded as non-neuronal?

      a) Among these non-neuronal genes, how many were actually known neuronal genes or components of neuronal pathways (e.g., genes involved in serotonin synthesis, synaptic function, or axon guidance)?

      b) Conversely, among the "unexpressed" genes classified as neuronal, how many were likely not neuron-specific (e.g., housekeeping genes) or even clearly non-neuronal (e.g., myosin or other muscle-specific markers)?

      (10) To increase transparency and allow readers to probe false positives and false negatives, I suggest the inclusion of:

      a) The full list of all 4,440 'unexpressed' genes and their classification at each refinement step. In that list flag the subsets of genes potentially misclassified, including:

        - Neuronal genes wrongly discarded as non-neuronal.

        - Non-neuronal genes wrongly retained as neuronal.

      b) Add a certainty or likelihood ranking that quantifies confidence in each classification decision, helping readers validate neuronal vs. non-neuronal RNA assignments.

      This addition would enhance transparency, reproducibility, and community engagement, ensuring that key neuronal genes are not erroneously discarded while minimizing false positives from contaminant-derived transcripts.
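
      For readers unfamiliar with the AUROC metric mentioned in point (8), a minimal sketch may help. It is a generic illustration, not the authors' benchmarking code: the function name, the expression scores, and the gold-standard labels below are invented for the example.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (label 1)
    scores higher than a randomly chosen negative (label 0); ties count half.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical expression scores for six genes in one neuron class, with
# hypothetical gold-standard labels (1 = known expressed, 0 = known absent).
scores = [5.1, 3.2, 3.0, 0.4, 0.1, 0.0]
labels = [1, 1, 0, 1, 0, 0]
print(f"AUROC = {auroc(scores, labels):.3f}")
```

      A value of 1.0 means every gold-standard expressed gene outranks every gold-standard absent gene, and 0.5 means the scores carry no ranking information. Annotating the ROC figures in these terms would address the interpretability concern.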

    1. eLife Assessment

      This study presents valuable findings on the role of specific dopamine neurons for aversive learning and modulation of innate behavior in Drosophila larvae. The authors present solid evidence backed up by detailed behavioral quantification and rigorous testing. Their data confirms previous findings and will be of interest to the learning and memory community.

    2. Reviewer #1 (Public review):

      Summary:

      The authors investigate the role of different specific dopaminergic neurons in the mushroom body of Drosophila larvae for learning and innate behavior. All the tested neurons are thought to be involved in punishment learning. The authors discover that artificial activation of single DANs in training leads to safety learning, but not punishment learning. Furthermore, activation of single DANs can lead to changes in locomotion behavior, which can affect light preference. The authors provide a deeper understanding of the functional diversity of single dopamine neurons; however, it is unclear how translatable these findings are to learning experiments with real punishment stimuli.

      Strengths:

      The authors attempt to disentangle what kind of memories are formed by activating different dopamine neurons (safety learning versus punishment learning) and whether the US must be present at test for recall. They do indeed find differences, and the results will be of interest to the learning and memory community.

      Interestingly, optogenetic activation of a single DAN during training leads to safety memory, but not punishment memory. Furthermore, DAN activation also affects innate locomotion, and the authors can show that optogenetic activation of different DANs affects locomotion differently.

      Weaknesses:<br /> All experiments in the manuscript use optogenetic activation of DANs, so it is not clear what kind of memories are formed. Several stimuli can be used as punishment, such as electric shock, salt, bitter taste, and light, and it is not clear what kind of memory the authors investigate here. The findings could be discussed in the context of what DANs respond to. Furthermore, studies in adults and larvae showed that most DANs can code for both valences, e.g., aversive DANs can be activated by punishment and inhibited by reward. Thus, safety learning might be a result of a decrease in activity in DANs during odor presentation. The authors also do not discuss possible feedback loops from MBONs to DANs across compartments. Could such connections allow for safety learning in larvae?

      The authors show that artificial activation with different light intensities can form different memories and that increasing the light intensity sometimes leads to no memories. Also, using different optogenetic tools reveals different results. This again raises the question of how applicable the results will be for learning with real stimuli. Is there a natural stimulus that only induces safety learning, but no punishment learning?<br /> The authors provide a detailed behavioral analysis of locomotion behavior; however, the detailed analysis seems unnecessary for that dataset. Modulation of speed and bending rate has been described before with simpler methods (specifically for MBONs). The revealed locomotion phenotypes probably affect larval locomotion during memory recall with light activation, thus the authors should show that larvae are potentially able to move during light-on memory tests.

    3. Reviewer #2 (Public review):

      Summary:

      This study provides valuable context for ongoing research on the role of dopamine in memory and locomotion. DANs have been a fascinating area of study due to their complexity, and this work dissects specific DANs, exploring their roles in different memory-related behaviors while offering some explanations. The discussion provided by the authors effectively situates the study in the broader field of learning, memory, DAN circuitry and behavioral computation in insect brains. The study achieves what it sets out to and it does so unequivocally. The experiments were elegantly designed, leaving little room for doubt in the study's claims. However, the study lacks context regarding the molecular pathways underlying these results. While it strengthens current knowledge by providing robust evidence, it does little to explore the molecular mechanisms behind these effects.

      Strengths:

      (1) Experiment design is one of the strengths of this study. The experiments are thorough and cover the length and breadth of the core findings of the study. Although a lot of work has already been done in studying the role of dopamine in memory and locomotion, the dissection of the functions of distinct DANs in larvae has been done meticulously with well-structured experiments.<br /> (2) This study fits quite nicely into the puzzle of memory, especially in the context of Dopamine. Previous studies in *Drosophila* adults have shown the opposing roles of DANs in locomotion depending on the context of DAN activation. This study drives that point home for larvae, providing conclusive evidence in that regard.<br /> (3) The use of clear figures and simple language is one of the strengths of this paper. The figures are comprehensive, complete and manage to narrate the story by themselves. The flow of information is smooth. The simple and effective language used maintains scientific rigor while remaining accessible to those new to the field. A pleasant read.

      Weaknesses:<br /> (1) The authors have done a great job at structuring the figures. But some main figures would benefit from including the controls instead of placing them in supplementary.<br /> (2) The paper would benefit from a deeper discussion regarding molecular mechanisms underlying their results. It would be interesting to see what the authors think about different Dopamine receptors and how they relate to the findings of this paper.<br /> (3) Throughout the paper, the authors have been clear and comprehensive, but in some cases, further explanation of their choices was missing. For example, the choice to compare bending and tail velocity over other parameters within the same clusters is unclear.

    4. Reviewer #3 (Public review):

      Summary

      Across species, dopamine release carries out seemingly diverse functions, like reinforcing memories and regulating locomotion and flight. However, whether distinct dopaminergic neurons (DANs) are allocated for each function is not clear. In this study, Toshima et al. have used the numerically simple organization of the Drosophila larval brain to answer this question. They use optogenetic activation to systematically stimulate a small set of DANs, individually and collectively, and study the effect on diverse functions such as memory formation, retrieval, and locomotion. They find that singly or collectively, DL1 DANs can induce punishment and/or safety memory formation and retrieval. DANs can even gate the expression of memory. Finally, the same DANs also modulate locomotion in the larvae. The authors speculate that dopaminergic neurons in other species may also share such overlapping functions. Their findings are nicely summarised in Figure 9.

      Strengths

      The study comprehensively activates the neurons in the DL1 cluster in a systematic manner. Individual and collective stimulation of the DL1 DANs has been conducted to assess the induction and gating of aversive punishment memory, safety memory, and acute locomotion.

      Specific adult Drosophila DANs are known to induce dual behaviors and functions. The same MP1/γ1pedc DANs are recognized for gating appetitive memory expression and representing aversive teaching signals downstream of sensory stimuli such as electric shocks, bitter tastes, and heat. Neurons in the PPL1 cluster regulate adult flight and food-seeking behavior. The authors deserve credit for conducting an organized examination of dopaminergic neuron functions in larvae, which makes their findings more comparable and facilitates the proposal of a holistic model.

      They have provided substantial evidence for their findings and frequently presented replicated behavioral data sets. They have been transparent about results that were difficult to explain. Additionally, they have provided an impressive body of supporting data to strengthen their main findings.

      Weaknesses

      The larvae exhibit directed locomotory action to express punishment or safety memory. If the larvae did not move, we would not be able to assess memory function. Hence, functional activation of DANs could result in one action, which seems like two different functions of memory expression and locomotion. It can also be argued that activation of DANs represents a teaching signal to the KCs, and then eventually, downstream of the MBONs, it results in locomotion modulation. Hence, the seeming functional diversity could be a function of different downstream neuronal pathways and not molecular context-dependent diversity inside dopaminergic neurons. The authors should address this possibility or point out the fallacy in the above argument.

      The finding that activation of TH-GAL4 conveys aversive valence and R58E02-GAL4 conveys appetitive valence seems redundant (Figure 6). I understand they say this in the context of locomotion. However, they may not have mentioned similar findings in adults. In adults, artificial activation of DANs covered by the same GAL4 lines acts as aversive and appetitive teaching signals for memory formation. These references should be cited appropriately in the results and discussion if not currently included.

      The evidence for the role of dopamine (Figure 7) can be bolstered by using other available RNAi lines against TH. A valium20 vector-based shRNA line is recommended. The current evidence is based mainly on non-specific pharmacological intervention with 3IY.

    1. eLife Assessment

      This fundamental work significantly enhances our understanding of how structural variants influence human phenotypes. The conclusion is convincingly supported by rigorous analyses of long-read sequencing data. If the raw data are made publicly available, these high-quality datasets and findings will further advance our knowledge of genetic variation in the human population.

    2. Reviewer #1 (Public review):

      Summary:

      The authors sequenced 888 individuals from the 1000 Genomes Project using the Oxford Nanopore long-read sequencing method to achieve highly sensitive, genome-wide detection of structural variants (SVs) at the population level. They conducted solid benchmarking of SV calling and systematically characterized the identified SVs. While short-read sequencing methods, including those used in the 1000 Genomes Project, have been widely applied, they exhibit high accuracy in detecting single nucleotide variants (SNVs) and small insertions and deletions but have limited sensitivity for SV detection. This study significantly enhances SV detection capabilities, establishing it as a valuable resource for human genetic research. Furthermore, the authors constructed an SV imputation panel using the generated data and imputed SVs in 488,130 individuals from the UK Biobank. They then conducted a proof-of-principle genome-wide association study (GWAS) analysis based on the imputed SVs and selected traits within the UK Biobank. Their findings demonstrate that incorporating SV-GWAS analysis provides additional insights beyond conventional GWAS frameworks focusing on SNVs, particularly in improving fine mapping.

      Strengths:

      The authors constructed a high-sensitivity reference panel of genome-wide SVs at the population level, addressing a critical gap in the field of human genetics. This resource is expected to significantly advance research in human genetics. They demonstrated the imputation of SVs in individuals from the UK Biobank using this panel and conducted a proof-of-concept SV-based GWAS. Their findings highlight a novel and effective strategy for integrating SVs into GWAS, which will facilitate the analysis of human genetic data from the UK Biobank and other datasets. Their conclusions are supported by comprehensive analyses.

      Weaknesses:

      (1) Although the authors employ state-of-the-art analytical approaches for the identification of SVs, the overall accuracy remains suboptimal, as indicated by an F1 score of 74.0%, particularly in tandem repeat regions. To enhance accuracy, it would be beneficial to explore alternative SV detection methods or develop novel approaches. Given the value of the reference panel and the fact that improved SV accuracy would lead to more precise SV imputation and GWAS results, investing effort in methodological refinement is highly encouraged.

      (2) From the Methods section, it appears that the authors employed Beagle for both the "leave-one-out" imputation and the UK Biobank imputation. It would be better to explicitly clarify this in the Results section and provide a detailed description of the corresponding procedures and parameters in the Methods section for both analyses, as this represents a key aspect of the study. Additionally, Beagle is not specifically designed for SV imputation, and the imputation quality of SVs is generally lower than that of SNVs. Exploring strategies to improve SV imputation, such as developing a novel method with reference panel data, may enhance performance. It is also important to assess how this reduced imputation quality may influence GWAS results. For instance, it would be useful to examine whether associated SVs exhibit higher imputation quality and whether SVs with lower quality are less likely to achieve significant association signals. In addition, the lower imputation quality observed for INV, DUP, and BND variants (Figure 3) may be due to their greater lengths (Figure 2). It would be worth investigating the relationship between SV length and imputation quality.

      (3) All examples presented in the manuscript focus on SVs that overlap with genes. It may also be valuable to investigate SVs that do not overlap with genes but intersect with enhancer regions. SVs can contribute to disease by altering regulatory elements, such as enhancers, which play a crucial role in gene expression. Including such analyses would further demonstrate the utility of SV-GWAS and provide deeper insights into the functional impact of SVs.

      (4) The data availability link currently provides only a VCF file ("sniffles2_joint_sv_calls.vcf.gz") containing the identified SVs. It would be beneficial for the authors to make all raw sequencing data (FASTQ files) and key processed datasets (such as alignment results and merged SV and SNV files) available. Providing these resources would enable other researchers to develop improved SV detection and imputation methods or conduct further genetic analyses. Furthermore, establishing a dedicated website for data access, along with a genome browser for SV visualization, could significantly enhance the impact and accessibility of the study. Additionally, all code, particularly the SV imputation pipeline accompanied by a detailed tutorial, should be deposited in a public repository such as GitHub. This would support researchers in imputing SVs and conducting SV-GWAS on their own datasets.
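      For the length-versus-quality check suggested in point (2) above, a minimal sketch is given below. It assumes that each imputed SV comes with a length in bp and a per-variant imputation quality score between 0 and 1. The bin edges, the function name, and the toy records are hypothetical illustrations and are not taken from the manuscript.

```python
from bisect import bisect_right
from statistics import median

# Hypothetical length bins in bp (lower edges inclusive); not from the manuscript.
BIN_EDGES = [50, 500, 5_000, 50_000]
BIN_LABELS = ["50-499", "500-4,999", "5,000-49,999", ">=50,000"]


def median_quality_by_length(records):
    """Group (sv_length_bp, imputation_quality) pairs by length bin
    and return the median quality for every non-empty bin."""
    grouped = {label: [] for label in BIN_LABELS}
    for length_bp, quality in records:
        if length_bp < BIN_EDGES[0]:
            continue  # shorter than the 50 bp SV threshold, skip
        # Index of the last bin edge that is <= length_bp
        idx = bisect_right(BIN_EDGES, length_bp) - 1
        grouped[BIN_LABELS[idx]].append(quality)
    return {label: median(vals) for label, vals in grouped.items() if vals}


if __name__ == "__main__":
    # Made-up example values, for illustration only
    toy_records = [(60, 0.95), (300, 0.90), (1200, 0.80), (80000, 0.55), (120000, 0.45)]
    for label, value in median_quality_by_length(toy_records).items():
        print(label, round(value, 3))
```

      If median quality falls steadily across the bins, that would be consistent with the reviewer's suggestion that length contributes to the lower quality of INV, DUP, and BND variants. A per-type breakdown would still be needed to separate length from variant class.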

    3. Reviewer #2 (Public review):

      Summary:

      The authors aimed to develop a novel and efficient method for SV detection, utilizing data from the 1000 Genomes Project (1KGP) for modeling and calibration. This method was subsequently validated using UK population data and applied to identify structural variants associated with specific disease phenotypes.

      Strengths:

      Third-generation single-molecule sequencing data offers several advantages over traditional high-throughput sequencing methods, particularly due to its long-read lengths, which provide valuable insights into significant forms of genomic variation. The authors have developed an efficient method for detecting structural variations and optimizing the utilization of genomic data. We hope that this method will continue to be refined, enabling researchers to more effectively leverage long-read data, high-throughput data, or even a synergistic combination of both.

      Weaknesses:

      Although this research contributes to our ability to more effectively utilize long-length and high-throughput data, there are some key issues that need to be addressed in terms of analyzing the specific results as well as writing the article.

    4. Reviewer #3 (Public review):

      Summary:

      This study successfully identified genetic loci associated with various traits by generating large-scale long-read sequencing data from a diverse set of samples. This study is significant because it not only produces large-scale long-read genome sequencing data but also demonstrates its application in actual genetics research. Given its potential utility in various fields, this study is expected to make a valuable contribution to the academic community and to this journal. However, there are several critical aspects that could be improved. Below are specific comments for consideration.

      Strengths:

      Producing high-quality, large-scale variant datasets and imputation datasets

      Weaknesses:

      (1) Data availability

      Currently, it appears that only the Genomic Lens SV Panel is available on the webpage described in the Data Availability section. It is unclear whether the authors intend to release the raw sequencing data. Since the study utilized samples from the 1000 Genomes Project, there should be no restriction on making the data publicly accessible. Given this, would the authors consider making the raw sequencing reads publicly available? If so, NCBI SRA or EBI ENA would be the most appropriate repositories for data deposition. I strongly encourage the authors to consider public data release.

      Additionally, accessing the Genomic Lens SV Panel data does not seem straightforward. The manuscript should provide a more detailed description of how researchers can access and utilize these data. In my opinion, the best approach would be to upload the variant data (VCF files) to a public database such as the European Variation Archive (EVA) hosted by EBI.

      I strongly request that the authors publicly deposit the variant data. At a minimum:

      a) The joint genotype data for all 888 samples from the 1000 Genomes Project must be publicly available.<br /> b) For the UK Biobank samples, at least allele frequency data should be disclosed.

      Since eLife has a well-established data-sharing policy, compliance with these guidelines is essential for publication in this journal.

      (2) Long-read sequencing data quality

      While the manuscript presents N50 read length and mean or median read base quality for each sample in a table, it would be highly beneficial to visualize these data in figures as well. A violin plot or similar visualization summarizing these distributions would significantly improve data presentation.

      Notably, the base quality of ONT long-read sequencing data appears lower than expected. This may be attributed to the use of pore version 9.4.1, but the unexpectedly low base quality still warrants attention. It would be helpful to include a small figure within Figure 2 to illustrate this point. A visual representation of read length distribution and base quality distribution would strengthen the manuscript.

      (3) Variant detection precision, recall, and F1 score

      This study focuses on insertions and deletions (indels) ≥50 bp, but it remains unclear how well variants <50 bp are detected. I am particularly interested in the precision, recall, and F1 score for variants between 5-49 bp.

      Because ONT base quality is relatively low, single-base variants are challenging to analyze, but variants ≥5 bp should still be detectable: read accuracy is still approximately 90%, which makes analysis feasible. Given that Sniffles supports the detection of variants as small as 1 bp, I strongly encourage the authors to conduct an additional analysis.

      A simple two-category classification (e.g., 5-49 bp and ≥50 bp) should suffice. Additionally, a comparative analysis with HiFi and short-read sequencing data would be highly valuable. If possible, I strongly recommend that all detected variants ≥5 bp be made publicly available as VCF files.
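      The two-category evaluation requested above could be tabulated along the following lines. This is only a sketch: it assumes each call has already been compared against a truth set and labelled as a true positive, false positive, or false negative. The function names and the toy counts are hypothetical.

```python
from collections import Counter


def precision_recall_f1(tp, fp, fn):
    """Standard definitions; each metric is 0.0 when its denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def size_class(length_bp):
    """Assign a variant to one of the two size classes, or None if < 5 bp."""
    if length_bp >= 50:
        return ">=50 bp"
    if length_bp >= 5:
        return "5-49 bp"
    return None


def metrics_by_size_class(calls):
    """calls: iterable of (length_bp, outcome), outcome in {"TP", "FP", "FN"}."""
    counts = Counter()
    for length_bp, outcome in calls:
        label = size_class(length_bp)
        if label is not None:
            counts[(label, outcome)] += 1
    return {
        label: precision_recall_f1(
            counts[(label, "TP")], counts[(label, "FP")], counts[(label, "FN")]
        )
        for label in ("5-49 bp", ">=50 bp")
    }


if __name__ == "__main__":
    # Made-up labelled calls, for illustration only
    toy_calls = (
        [(10, "TP")] * 8 + [(20, "FP")] * 2 + [(30, "FN")] * 2
        + [(100, "TP")] * 3 + [(200, "FP")] * 1 + [(300, "FN")] * 3
    )
    for label, (p, r, f1) in metrics_by_size_class(toy_calls).items():
        print(f"{label}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

      The same table could be produced separately for ONT, HiFi, and short-read call sets to support the comparison the reviewer asks for.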

      (4) Assembly-based methods

      Given the low read accuracy and low sequencing depth in this dataset, it is understandable that genome assembly is challenging. However, the latest high-quality human genome datasets, such as those produced by the Human Pangenome Reference Consortium (HPRC), demonstrate that assembly-based approaches provide significant advantages, particularly for resolving complex and long structural variants.

      Since HPRC data also utilize 1000 Genomes Project samples, it would be highly informative to compare the accuracy of ONT sequencing in this study with HPRC's assembly-based genome data. The recent publication on 47 HPRC samples provides a valuable reference for such a comparison. Given its relevance, the authors should consider providing a comparative analysis with HPRC data.

      References:

      (1) A draft human pangenome reference<br /> https://www.nature.com/articles/s41586-023-05896-x

      (2) The Human Pangenome Project: a global resource to map genomic diversity<br /> https://www.nature.com/articles/s41586-022-04601-8

      (3) A pangenome reference of 36 Chinese populations<br /> https://www.nature.com/articles/s41586-023-06173-7

      (4) Long-read sequencing of 3,622 Icelanders provides insight into the role of structural variants in human diseases and other traits<br /> https://www.nature.com/articles/s41588-021-00865-4

      (5) Increased mutation and gene conversion within human segmental duplications<br /> https://www.nature.com/articles/s41586-023-05895-y

      (6) Structural polymorphism and diversity of human segmental duplications<br /> https://www.nature.com/articles/s41588-024-02051-8

      (7) Highly accurate Korean draft genomes reveal structural variation highlighting human telomere evolution<br /> https://academic.oup.com/nar/article/53/1/gkae1294/7945385

    1. eLife Assessment

      Pannexin (Panx) channels are a family of poorly understood large-pore channels that mediate the release of substrates like ATP from cells, yet the physiological stimuli that activate these channels remain poorly understood. The study by Henze et al. describes an elegant approach wherein activity-guided fractionation of mouse liver led to the discovery that lysophospholipids (LPCs) activate Panx1 and Panx2 channels expressed in cells or reconstituted into liposomes. The authors provide compelling evidence that LPC-mediated activation of Panx1 is involved in joint pain and that Panx1 channels are required for the established effects of LPC on inflammasome activation in monocytes, suggesting that Panx channels play a role in inflammatory pathways. Overall, this important study reports a previously unanticipated mechanism wherein LPCs directly activate Panx channels. The work will be of interest to scientists investigating phospholipids, Panx channels, purinergic signalling and inflammation.

      [Editors' note: this paper was reviewed and curated by Biophysics Colab]

    2. Joint Public Review:

      Pannexin (Panx) hemichannels are a family of heptameric membrane proteins that form pores in the plasma membrane through which ions and relatively large organic molecules can permeate. ATP release through Panx channels during the process of apoptosis is one established biological role of these proteins in the immune system, but they are widely expressed in many cells throughout the body, including the nervous system, and likely play many interesting and important roles that are yet to be defined. Although several structures have now been solved of different Panx subtypes from different species, their biophysical mechanisms remain poorly understood, including what physiological signals control their activation. Electrophysiological measurements of ionic currents flowing in response to Panx channel activation have shown that some subtypes can be activated by strong membrane depolarization or caspase cleavage of the C-terminus. Here, Henze and colleagues set out to identify endogenous activators of Panx channels, focusing on the Panx1 and Panx2 subtypes, by fractionating mouse liver extracts and screening for activation of Panx channels expressed in mammalian cells using whole-cell patch clamp recordings. The authors present a comprehensive examination with robust methodologies and supporting data that demonstrate that lysophospholipids (LPCs) directly activate Panx1 and Panx2 channels. These methodologies include channel mutagenesis, electrophysiology, ATP release and fluorescence assays, and molecular modelling. Mouse liver extracts were initially used to identify LPC activators, but the authors go on to individually evaluate many different types of LPCs to determine those that are more specific for Panx channel activation. Importantly, the enzymes that endogenously regulate the production of these LPCs were also assessed along with other by-products that were shown not to promote pannexin channel activation. 
In addition, the authors used synovial fluid from canine patients, which is enriched in LPCs, to highlight the importance of the findings in pathology. Overall, we think this is likely to be an important study because it provides strong evidence that LPCs can function as activators of Panx1 and Panx2 channels, linking two established mediators of inflammatory responses and opening an entirely new area for exploring the biological roles of Panx channels. This study provides an excellent foundation for future studies and importantly provides clinical relevance.

      [Editors' note: this paper has been through two rounds of review and revisions, available here: https://sciety.org/articles/activity/10.1101/2023.10.23.563601]

    3. Author response:

      (This author response relates to the first round of peer review by Biophysics Colab. Reviews and responses to both rounds of review are available here: https://sciety.org/articles/activity/10.1101/2023.10.23.563601.)

      General Assessment:

      Pannexin (Panx) hemichannels are a family of heptameric membrane proteins that form pores in the plasma membrane through which ions and relatively large organic molecules can permeate. ATP release through Panx channels during the process of apoptosis is one established biological role of these proteins in the immune system, but they are widely expressed in many cells throughout the body, including the nervous system, and likely play many interesting and important roles that are yet to be defined. Although several structures have now been solved of different Panx subtypes from different species, their biophysical mechanisms remain poorly understood, including what physiological signals control their activation. Electrophysiological measurements of ionic currents flowing in response to Panx channel activation have shown that some subtypes can be activated by strong membrane depolarization or caspase cleavage of the C-terminus. Here, Henze and colleagues set out to identify endogenous activators of Panx channels, focusing on the Panx1 and Panx2 subtypes, by fractionating mouse liver extracts and screening for activation of Panx channels expressed in mammalian cells using whole-cell patch clamp recordings. The authors present a comprehensive examination with robust methodologies and supporting data that demonstrate that lysophospholipids (LPCs) directly activate Panx1 and Panx2 channels. These methodologies include channel mutagenesis, electrophysiology, ATP release and fluorescence assays, molecular modelling, and cryogenic electron microscopy (cryo-EM). Mouse liver extracts were initially used to identify LPC activators, but the authors go on to individually evaluate many different types of LPCs to determine those that are more specific for Panx channel activation. Importantly, the enzymes that endogenously regulate the production of these LPCs were also assessed along with other by-products that were shown not to promote pannexin channel activation. 
In addition, the authors used synovial fluid from canine patients, which is enriched in LPCs, to highlight the importance of the findings in pathology. Overall, we think this is likely to be a landmark study because it provides strong evidence that LPCs can function as activators of Panx1 and Panx2 channels, linking two established mediators of inflammatory responses and opening an entirely new area for exploring the biological roles of Panx channels. Although the mechanism of LPC activation of Panx channels remains unresolved, this study provides an excellent foundation for future studies and importantly provides clinical relevance.

      We thank the reviewers for their time and effort in reviewing our manuscript. Based on their valuable comments and suggestions, we have made substantial revisions. The updated manuscript now includes two new experiments supporting that lysophospholipid-triggered channel activation promotes the release of signaling molecules critical for immune response and demonstrates that this novel class of agonist activates the inflammasome in human macrophages through endogenously expressed Panx1. To better highlight the significance of our findings, we have excluded the cryo-EM panel from this manuscript. We believe these changes address the main concerns raised by the reviewers and enhance the overall clarity and impact of our findings. Below, we provide a point-by-point response to each of the reviewers’ comments.

      Recommendations:

      (1) The authors present a tremendous amount of data using different approaches, cells and assays along with a written presentation that is quite abbreviated, which may make comprehension challenging for some readers. We would encourage the authors to expand the written presentation to more fully describe the experiments that were done and how the data were analysed so that the 2 key conclusions can be more fully appreciated by readers. A lot of data is also presented in supplemental figures that could be brought into the main figures and more thoroughly presented and discussed.

      We appreciate and agree with the reviewers’ observation. Our initial manuscript may have been challenging to follow due to our use of both wild-type and GS-tagged versions of Panx1 from human and frog origins, combined with different fluorescence techniques across cell types. In this revision, we used only human wild-type Panx1 expressed in HEK293S GnTI- cells, except for activity-guided fractionation experiments, where we used GS-tagged Panx1 expressed in HEK293 cells (Fig. 1). For functional reconstitution studies, we employed YO-PRO-1 uptake assays, as optimizing the Venus-based assay was challenging. We have clarified these exceptions in the main text. We think these adjustments simplify the narrative and ensure an appropriate balance between main and supplemental figures.

      (2) It would also be useful to present data on the ion selectivity of Panx channels activated by LPC. How does this compare to data obtained when the channel is activated by depolarization? If the two stimuli activate related open states then the ion selectivity may be quite similar, but perhaps not if the two stimuli activate different open states. The authors' earlier work in eLife shows interesting shifts in reversal potentials (Vrev) when substituting external chloride with gluconate but not when substituting external sodium with N-methyl-D-glucamine, and these changed with mutations within the external pore of Panx channels. Related measurements comparing channels activated by LPC with membrane depolarization would be valuable for assessing whether similar or distinct open states are activated by LPC and voltage. It would be ideal to make Vrev measurements using a fixed step depolarization to open the channel and then various steps to more negative voltages to measure tail currents in pinpointing Vrev (a so-called instantaneous IV).

      We fully agree with the reviewer on the importance of ion selectivity experiments. However, comparing the properties of LPC-activated channels with those activated by membrane depolarization presented technical challenges, as LPC appears to stimulate Panx1 in synergy with voltage. Prolonged LPC exposure destabilizes patches, complicating G-V curve acquisition and kinetic analyses. While such experiments could provide mechanistic insights, we think they are beyond the scope of the current study.

      (3) Data is presented for expression of Panx channels in different cell types (HEK vs HEKS GnTI-) and different constructs (Panx1 vs Panx1-GS vs other engineered constructs). The authors have tried to be clear about what was done in each experiment, but it can be challenging for the reader to keep everything straight. The labelling in Fig 1E helps a lot, and we encourage the authors to use that approach systematically throughout. It would also help to clearly identify the cell type and channel construct whenever showing traces, like those in Fig 1D. Doing this systematically throughout all the figures would also make it clear where a control is missing. For example, if labelling for the type of cell was included in Fig 1D it would be immediately clear that a GnTI- vector alone control for WT Panx1 is missing as the vector control shown is for HEK cells and formally that is only a control for Panx2 and 3. Can the authors explain why LPC activates Panx1 overexpressed in HEK293 GnTI- cells but not in HEK293 cells? Is this purely a function of expression levels? If so, it would be good to provide that supporting information.

      As mentioned above, we believe our revised version is more straightforward to digest. We have improved labeling and provided explanations where necessary to clarify the manuscript. While Panx1 expression levels are indeed higher in GnTI- than in HEK293 cells, we are uncertain whether the absence of detectable currents in HEK293 cells is solely due to expression levels. Some post-translational modifications that inhibit Panx1, such as lysine acetylation, may also impact activity. Future studies are needed to explore these mechanisms further.

      (4) The mVenus quenching experiments are somewhat confusing in the way data are presented. In Fig 2B the y axis is labelled fluorescence (%) but when the channel is closed at time = 0 the value of fluorescence is 0 rather than 100 %, and as the channel opens when LPC is added the values grow towards 100 instead of towards 0 as iodide permeates and quenches. It would be helpful if these types of data could be presented more intuitively. Also, how was the initial rate that is plotted in Fig 2C calculated? It would be helpful to show how this is done in a figure panel somewhere. Why was the initial rate expressed as a percent maximum, what is the maximum and why are the values so low? Why is the effect of CBX so weak in these quenching experiments with Panx1 compared to other assays? This assay is used in a lot of experiments, so anything that could be done to bolster confidence in what it reports on would be valuable to readers. Bringing in as many control experiments that have been done, including any that are already published, would be helpful.

      We modified the Y-axis in Figure 2 to “Quench (%)” for clarity. The data reflect fluorescence reduction over time, starting from LPC addition, normalized to the maximal decrease observed after Triton X-100 addition (3 minutes), enabling consistent quenching value comparisons. Although the quenching value appears small, normalization against complete cell solubilization provides reproducible comparisons. We do not fully understand why CBX effects vary in Venus quenching experiments, but we speculate that its steroid-like pentacyclic structure may influence the lysophospholipid agonistic effects. As noted in prior studies (DOI: 10.1085/jgp.201511505; DOI: 10.7554/eLife.54670), CBX likely acts as an allosteric modulator rather than a simple pore blocker, potentially contributing to these variations.
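      The normalization described in this response can be illustrated with a minimal sketch. It assumes that the first reading of a trace is taken at LPC addition and that a single reading after detergent solubilization defines the maximal decrease; the function name, argument names, and example readings are hypothetical and are not taken from the manuscript.

```python
def quench_percent(trace, f_triton):
    """Convert a raw fluorescence trace to Quench (%).

    trace    -- fluorescence readings; trace[0] is assumed to be the
                reading at agonist addition (hypothetical convention)
    f_triton -- reading after detergent solubilization, taken here as
                the maximal possible decrease
    """
    f0 = trace[0]
    max_drop = f0 - f_triton
    if max_drop <= 0:
        raise ValueError("detergent reading must be below the starting fluorescence")
    # Each point is the drop from the start, as a percentage of the maximal drop.
    return [100.0 * (f0 - f) / max_drop for f in trace]


# Hypothetical readings in arbitrary units, not data from the study.
example = quench_percent([1000.0, 950.0, 900.0], f_triton=500.0)
print(example)
```

      Under this convention a closed channel gives values near 0 and the curve rises as iodide enters and quenches the fluorophore, which is the direction of the relabelled axis.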

      (5) Could the authors provide more information to help rationalize how Yo-Pro-1, which has a charge of +2, can permeate what are thought to be anion-favouring Panx channels? We appreciate that the biophysical properties of Panx channels remain mysterious, but it would help to hear a bit more about the authors' thinking. It might also help to cite other papers that have measured Yo-Pro-1 uptake through Panx channels. Was the Strep-tagged construct of Panx1 expressed in GnTI- cells and shown to be functional using electrophysiology?

      Our recent study suggests that the electrostatic landscape along the permeation pathway may influence its ion selectivity (DOI: 10.1101/2024.06.13.598903). However, we have not yet fully elucidated how Panx1 permeates both anions and cations. Based on our findings, ion selectivity may vary with activation stimulus intensity and duration. Cation permeation through Panx1 is often demonstrated with YO-PRO-1, which measures uptake over minutes, unlike electrophysiological measurements conducted over milliseconds to seconds. We referenced two representative studies employing YO-PRO-1 to assess Panx1 activity. Whole-cell current measurements from a similar construct with an intracellular loop insertion indicate that our STREP-tagged construct likely retains functional capacity.

      (6) In Fig 5 panel C, data is presented as the ratio of LPC induced current at -60 mV to that measured at +110 mV in the absence of LPC. What is the rationale for analysing the data this way? It would be helpful to also plot the two values separately for all of the constructs presented so the reader can see whether any of the mutants disproportionately alter LPC induced current relative to depolarization activated current. Also, for all currents shown in the figures, the authors should include a dashed coloured line at zero current, both for the LPC activated currents and the voltage steps.

      We used the ratio of LPC-induced current to the current measured at +110 mV to determine whether any of the mutants disproportionately affect LPC-induced current relative to depolarization-activated current. Since the mutants that did not respond to LPC also exhibited smaller voltage-stimulated currents than those that did respond, we reasoned that using this ratio would better capture the information the reviewer is suggesting to gauge. Showing the zero current level may be helpful if the goal was to compare basal currents, which in our experience vary significantly from patch to patch. However, since we are comparing LPC- and voltage-induced currents within the same patch, we believe that including basal current measurements would not add useful information to our study.
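      As a worked illustration of the ratio discussed in this response, the sketch below computes it for two made-up patches. It assumes the LPC-induced current is obtained by subtracting the pre-LPC current at -60 mV, which the response does not state; the function name and all current values are hypothetical.

```python
def lpc_to_voltage_ratio(i_lpc_minus60, i_baseline_minus60, i_voltage_plus110):
    """Ratio of LPC-induced current at -60 mV to the current at +110 mV without LPC.

    All currents are in the same unit (e.g. pA). Baseline subtraction at
    -60 mV is an assumption made for this sketch.
    """
    if i_voltage_plus110 == 0:
        raise ValueError("voltage-activated current must be non-zero")
    lpc_induced = abs(i_lpc_minus60 - i_baseline_minus60)
    return lpc_induced / abs(i_voltage_plus110)


# Hypothetical values: a responsive construct and a weakly responsive one.
responsive = lpc_to_voltage_ratio(-250.0, -50.0, 400.0)
weak = lpc_to_voltage_ratio(-60.0, -50.0, 100.0)
print(responsive, weak)
```

      In this example the second construct has a smaller voltage-activated current, yet its ratio is still lower, which is the kind of disproportionate loss of the LPC response that the ratio is meant to expose.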

      Given that new experiments were included to further highlight the significance of the discovery of Panx1 agonists, we opted to separate structure-based mechanistic studies from this manuscript and removed this experiment along with the docking and cryo-EM studies.

      (7) The fragmented NTD density shown in Fig S8 panel A may resemble either lipid density or the average density of both NTD and lipid. For example, Class7 and Class8 in Fig.S8 panel D displayed split densities, which may resemble a phosphate head group and two tails of lipid. A protomer mask may not be the ideal approach to separate different classes of NTD because as shown in Fig S8 panel D, most high-resolution features are located on TM1-4, suggesting that the classification was focused on TM1-4. A more suitable approach would involve using a smaller mask including NTD, TM1, and the neighbouring TM2 region to separate different NTD classes.

      We agree with the reviewer and attempted 3D classification using multiple smaller masks including the suggested region. However, the maps remained poorly defined, and we were unable to confidently assign the NTD.

      (8) The authors don’t discuss whether the LPC-bound structures display changes in the external part of the pore, which is the anion-selective filter and the narrower part of the pore. If there are no conformational changes there, then the present structures cannot explain permeability to large molecules like ATP. In this context, a plot for the pore dimension will be helpful to see differences along the pore between their different structures. It would also be clearer if the authors overlaid maps of protomers to illustrate differences at the NTD and the "selectivity filter."

      Both maps show that the narrowest constriction, formed by W74, has a diameter of approximately 9 Å. Previous steered molecular dynamics simulations suggest that ATP can permeate through such a constriction, implying an ion selection mechanism distinct from a simple steric barrier.

      (9) The time between the addition of LPC to the nanodisc-reconstituted protein and grid preparation is not mentioned. Dynamic diffusion of LPC could result in equal probabilities for the bound and unbound forms. This raises the possibility of finding the Primed state in the LPC-bound state as well. Additionally, can the authors rationalize how LPC might reach the pore region when the channel is in the closed state before the application of LPC?

      We appreciate the reviewer’s insight. We incubated LPC and nanodisc-reconstituted protein for 30 minutes, speculating that LPC approaches the pore similarly to other lipids in prior structures. In separate studies, we are optimizing conditions to capture more defined conformations.

      (10) In the cryo-EM map of the “resting” state (EMDB-21150), a part of the density was interpreted as NTD flipped to the intracellular side. This density, however, is poorly defined, and not connected to the S1 helix, raising concerns about whether this density corresponds to the NTD as seen in the “resting” state structure (PDB-ID: 6VD7). In addition, some residues in the C-terminus (after K333 in frog PANX1) are missing from the atomic model. Some of these residues are predicted by AlphaFold2 to form a short alpha helix and are shown to form a short alpha helix in some published PANX1 structures. Interestingly, in both the AF2 model and 6WBF, this short alpha helix is located approximately in the weak density that the authors suggest represents the “flipped” NTD. We encourage the authors to be cautious in interpreting this part as the “flipped” NTD without further validation or justification.

      We agree that the density corresponding to the extended NTD in the cytoplasm is relatively weak. In our recent study, we compared two Panx1 structures with or without the mentioned C-terminal helix and found evidence suggesting the likelihood of NTD extension (DOI: 10.1101/2024.06.13.598903). Nevertheless, to prevent potential confusion, we have removed the cryo-EM panel from this manuscript.

      (11) Since the authors did not observe densities of bound LPC in the cryo-EM map, it is important to acknowledge in the text the inherent limitations of using docking and mutagenesis methods to locate where LPC binds.

      Thank you for the suggestion. We have removed this section to avoid potential confusion.

      Optional suggestions:

      (1) The authors used MeOH to extract mouse liver for reversed-phase chromatography. Was the study designed to focus on hydrophobic compounds that likely bind to the TMD? Panx1 has both an ECD and an ICD of substantial size that could interact with water-soluble compounds. Also, the use of whole-cell recordings to screen fractions would not be likely to identify polar compounds that interact with the cytoplasmic part of the TMD. It would be useful for the authors to comment on these aspects of their screen and provide their rationale for fractionating liver rather than other tissues.

      We have added a rationale in line 90, stating: “The soluble fractions were excluded from this study, as the most polar fraction induced strong channel activities in the absence of exogenously expressed pannexins.” Additionally, we have included a figure to support this rationale (Fig. S1A).

      (2) The authors show that LPCs reversibly increase inward currents at a holding voltage of -60 mV (not always specified in legends) in cells expressing Panx1 and 2, and then show families of currents activated by depolarizing voltage steps in the absence of LPC, without asking what happens when you depolarize the membrane after LPC activation. If LPCs can be applied for long enough without disrupting recordings, it would be valuable to obtain both I-V relations and G-V relations before and after LPC activation of Panx channels. Does LPC disproportionately increase current at some voltages compared to others? Is the outward rectification reduced by LPC? Does Vrev remain unchanged (see point above)? It's hard to predict what would be observed, but almost any outcome from these experiments would suggest additional experiments to explore the extent to which the open states activated by LPC and depolarization are similar or distinct.

      Unfortunately, in our hands, the prolonged application of lysolipids at concentrations necessary to achieve significant currents tends to destabilize the patch. This makes it challenging to obtain G-V curves or perform the previously mentioned kinetic analyses. We believe this destabilization may be due to lysolipids’ surfactant-like qualities, which can disrupt the giga seal. Additionally, prolonged exposure seems to cause channel desensitization, which could be another confounding factor.

      (3) From the results presented, the authors cannot rule out that mutagenesis-induced insensitivity of Panx channels to LPCs results from allosteric perturbations in the channels rather than direct binding/gating by LPCs. In Fig 5 panel A-C, the authors introduced double mutants on TM1 and TM2 to interfere with LPC binding; however, the double mutants may also disrupt the interaction network formed within NTD, TM1, and TM2. This disruption could potentially rearrange the conformation of NTD, favouring the resting closed state. Three double Asn mutants, which abolished LPC-induced current, also exhibited lower currents through voltage activation in Fig. S5, raising the possibility that the mutant channels fail to activate in response to LPC due to an increased energy barrier. One way to gain further insight would be to mutate residues in NTD that interact with those substituted by the three double Asn mutants and to measure currents from both voltage activation and LPC activation. Such results might help to elucidate whether the three double Asn mutants interfere with LPC binding. It would also be important to show that the voltage-activated currents in Fig. S5 are sensitive to CBX.

      Thank you for the comment, with which we agree. Our initial intention was to use the mutagenesis studies to experimentally support the docking study. Due to uncertainties associated with the presented cryo-EM maps, we have decided to remove this study from the current manuscript. We will consider the proposed experiments in a future study.

      (4) Could the authors elaborate on how LPC opens Panx1 by altering the conformation of the NTDs in an uncoordinated manner, going from the “primed” state to the “active” state? In the “primed” state, the NTDs seem to be ordered by forming interactions with the TMD, thus resulting in the largest (possible?) pore size around the NTDs. In contrast, in the “active” state, the authors suggest that the NTDs are fragmented as a result of uncoordinated rearrangement, which conceivably will lead to a reduction in pore size around the NTDs (won't it?). It is therefore not intuitive to understand why a conformation with a smaller pore size represents an “active” state.

      We believe the uncoordinated arrangement of NTDs is dynamic, allowing for potential variations in pore size during the activated conformation. Alternatively, NTD movement may be coupled with conformational changes in TM1 and the extracellular domain, which in turn could alter the electrostatic properties of the permeation pathway. We believe a functional study exploring this mechanism would be more appropriately presented as a separate study.

      (5) Can the authors provide a positive control for these negative results presented in Fig S1B and C?

      The positive results are presented in Fig. 1D and E.

      (6) Raw images in Fig S6 and Fig S7 should contain units of measurement.

      Thank you for pointing this out.

      (7) It may be beneficial to show the superposition between primed state and activated state in both protomer and overall structure. In addition, superposition between primed state and PDB 7F8J.

      We attempted to superimpose the cryo-EM maps; however, visually highlighting the differences in figure format proved challenging. Higher-resolution maps would allow for model building, which would more effectively convey these distinctions.

      (8) Including particles number in each class in Fig S8 panel C and D would help in evaluating the quality of classification.

      Noted.

      (9) A table for cryo-EM statistics should be included.

      Thanks, noted.

      (10) n values are often provided as a range within legends but it would be better to provide individual values for each dataset. In many figures you can see most of the data points, which is great, but it would be easy to add n values to the plots themselves, perhaps in parentheses above the data points.

      While we agree that transparency is essential, adding n-values to each graph would make some figures less clear and potentially harder to interpret in this case. We believe that the dot plots, n-value range, and statistical analysis provide adequate support for our claims.

      (11) The way caspase activation of Panx channels is presented in the introduction could be viewed as dismissive or inflammatory for those who have studied that mechanism. We think the caspase activation literature is quite convincing and there is no need to be dismissive when pointing out that there are good reasons to believe that other mechanisms of activation likely exist. We encourage you to revise the introduction accordingly.

      Thank you for this comment. Although we intended to support the caspase activation mechanism in our introduction, we understand that the reviewer’s interpretation indicates a need for clarification. We hope the revised introduction removes any perception of dismissiveness.

      (12) Why is the patient data in Fig 4F normalized differently than everything else? Once the above issues with mVenus quenching data are clarified, it would be good to be systematic and use the same approach here.

      For Fig. 4F, we used a distinct normalization method to account for substantial day-to-day variation in experiments involving body fluids. Notably, we did not apply this normalization to other experimental panels due to their considerably lower day-to-day variation.

      (13) What was the rationale for using the structure from ref 35 in the docking task?

      The docking task utilized the human orthologue with a flipped-up NTD. We believe that this flipped-up conformation is likely the active form that responds to lysolipids. As our functional experiments primarily use the human orthologue for biological relevance, this structure choice is consistent. Our docking data shows that LPC does not dock at this site when using a construct with the downward-flipped NTD.

      (14) Perhaps better to refer to double Asn ‘substitutions’ rather than as ‘mutations’ because that makes one think they are Asn in the wt protein.

      Done.

      (15) From Fig S1, we gather that Panx2 is much larger than Panx1 and 3. If that is the case, it's worth noting that to readers somewhere.

      We have added the molecular weight of each subtype in the figure legend.

      (16) Please provide holding voltages and zero current levels in all figures presenting currents.

      We provided holding voltages. However, the zero current levels vary among the examples presented, making direct comparisons difficult. Since we are comparing currents with and without LPC, we believe that indicating zero current levels is unnecessary for this study.

      (17) While the authors successfully establish lysophospholipid-gating of Panx1 and Panx2, Panx3 appears unaffected. It may be advisable to be more specific in the title of the article.

      We are uncertain whether Panx3 is unaffected by lysophospholipids, as we have not observed activation of this subtype under any tested conditions.

    1. eLife Assessment

      This descriptive study used multiparameter spectral flow cytometry and clustering analysis of a subset of CD4 T cells, termed circulating T follicular helper (cTfh), responding to Plasmodium falciparum antigens, PfSEA-1A and PfGARP. The results from this comprehensive study provide valuable information regarding differences in cTfh response profiles between children and adults living in malaria-endemic Kenya and thus may be useful for improving the choice of antigen candidates for malaria vaccines. However, the analysis and interpretation of antigen-specific CD4 cTfh responses remain incomplete.

    2. Reviewer #1 (Public review):

      Summary:

      This study aims to understand the malaria antigen-specific cTfh profile of children and adults living in a malaria-holoendemic area. PBMC samples from children and adults were left unstimulated or stimulated with PfSEA-1A or PfGARP in vitro for 6h and analysed with a cTfh-focused panel. Unsupervised clustering and analysis of cTfh cells was performed. The main conclusions are: A) the children cohort has more diverse (cTfh1/2/17) recall responses compared to adults (mainly cTfh17) and, B) PfGARP stimulates better cTfh17 responses in adults and is thus a promising vaccine candidate.

      Strengths:

      This study is, in general, well-designed and with excellent data analysis. The use of unsupervised clustering is a nice attempt to understand the heterogeneity of cTfh cells.

      Weaknesses:

      The authors have provided additional data in Supplementary Figures 14-16. However, I remain concerned about whether cTfh cells are truly responding to antigen stimulation. In Supplementary Figure 15A-F, the IFNg responses appear as expected, SEB elicits the strongest response, as it stimulates bulk T cells, and the staining is promising, showing a clear distinction between IFNg+ and IFNg- populations. However, in Supplementary Figure 15I-N, the IL-21 secretion assay is concerning. The FACS plots make it difficult to distinguish IL-21+ from IL-21- cells, raising concerns about the validity of this analysis. Additionally, in panel J, the responses to PfSEA-1A or PfGARP appear even greater than those to SEB stimulation. In PBMCs, only a small percentage of T cells should be specific to a particular antigen. How can the positive control (SEB) produce a weaker response than stimulation with a specific antigen? This suggests that the IL-21 secretion assay may not have worked, making the authors' interpretation unreliable.

      I also have similar concerns about the IL-4 secretion in Supplementary Figure 16. First, the FACS plot shows cells that appear double-positive for IL-21 and IL-4, which suggests the staining may be due to autofluorescence rather than true cytokine signals. Also, in B-C the response to SEB stimulation is generally weaker than the response to a single antigen, further questioning the reliability of the IL-4 assay. In summary, I am not convinced that the in vitro antigen stimulation assay worked as intended. Consequently, the manuscript's claims regarding PfSEA-1A- and PfGARP-specific cTfh responses are not sufficiently supported by the presented data.

    3. Reviewer #3 (Public review):

      Summary:

      The goal of this study was to carry out an in-depth granular and unbiased phenotyping of peripheral blood circulating Tfh specific to two malaria vaccine candidates, PfSEA-1A and PfGARP, and correlate these with age (children vs adults) and protection from malaria (antibody titers against Plasmodium antigens). The authors further attempted to identify any specific differences in the Tfh responses to these two distinct malaria antigens.

      Strengths:

      The authors had access to peripheral blood samples from children and adults living in a malaria-endemic region of Kenya. The authors studied these samples using in vitro restimulation in the presence of specific malaria antigens. Authors generated a very rich data set from these valuable samples using cutting-edge spectral flow cytometry and a 21-plex panel that included a variety of surface markers, cytokines and transcription factors.

      Update following first revision (R1) of the manuscript:

      The authors have made a great effort to comprehensively address comments raised by the reviewers. In particular, clearly showing expression of ICOS and Bcl6 on CXCR5+ cells greatly strengthens the case for defining these cells as Tfh-like circulatory lymphocytes (cTfh).

      Weaknesses:

      Update following first revision (R1) of the manuscript:

      Unfortunately, my main concern remains. As it stands, the study is not really on antigen-specific T cells, but rather on the overall CD4 T cell compartment plus or minus antigenic stimulation. Although the authors used an in vitro restimulation strategy with malaria antigens, they do not focus on cells de-novo expressing activation markers as a result of restimulation, nor do they use tetramers to detect antigen-specific T cells. Moreover, their data show that the number of CXCR5+ CD4 T cells de-novo expressing activation markers and/or cytokines as a result of their in vitro restimulation is negligible, even when using a prototypic superantigen (SEB).

      Thus, no antigen-specific CXCR5+ CD4 T cells could be analysed with the data that the authors provide in this manuscript.

    4. Reviewer #4 (Public review):

      Summary:

      This manuscript is a descriptive study of circulating T follicular helper (cTfh) responses to PfSEA-1A or PfGARP (targets of new antimalaria vaccine candidates) in PBMCs from a convenience sample of children (7 yrs of age) and adults living in malaria-holoendemic Kenya using multiparameter flow cytometry and clustering analysis. This cell type promotes B cell production of long-lived antimalarial antibodies to provide protection against malaria. They find that children had a wider cTfh cytokine and TF profile cellular response in comparison to adults, who responded to both antigens but had a narrower response profile.

      Strengths:

      Carefully done study, very detailed, nice summary model at the end of the paper. The revision provides requested clarification on a number of issues, including CD40L expression which was not differentially expressed between groups. They add additional data into the supplemental files, including IL4 and IL21 data by presenting the cytoplots.

      Weaknesses:

      To know the significance of these cTfh cells for long-term protection against malaria requires functional and transfer experiments in animal models, which is outside the scope of this work.

    1. eLife Assessment

      Understanding bacterial growth mechanisms can potentially uncover novel drug targets that are crucial for maintaining cellular viability, particularly in bacterial pathogens. In this important study, Kapoor et al. investigate the role of Wag31 in lipid and peptidoglycan biosynthesis in mycobacteria. A detailed analysis of Wag31 domain architecture revealed a role in membrane tethering. More specifically, the N-terminal and C-terminal domains appeared to have distinct functional roles. The data presented are solid and support the conclusions made. This study will be of broad interest to microbiologists and molecular biologists.

    2. Reviewer #1 (Public review):

      This is a comprehensive study that sheds light on how Wag31 functions and localises in mycobacterial cells. A clear link to interactions with CL is shown using microscopy in combination with fluorescent fusion constructs and lipid-specific dyes. Furthermore, studies using mutant versions of Wag31 shed light on the functionalities of each domain in the protein. My concerns/suggestions for the manuscript are minor:

      (1) Ln 130. A better clarification/discussion is required here. It is clear that both depletion and overexpression have an effect on levels of various lipids, but subsequent descriptions show that they affect different classes of lipids.

      (2) The pulldown assay results are interesting, but the links are tentative.

      (3) The authors may perhaps like to rephrase claims of effects on lipid homeostasis, as my understanding is that lipid localisation rather than catabolism/breakdown is affected.

      In response to the above reviews the authors have made the required changes in the revised manuscript.

    3. Reviewer #2 (Public review):

      Summary:

      Kapoor et al. investigated the role of the mycobacterial protein Wag31 in lipid and peptidoglycan synthesis and sought to delineate the role of the N- and C-terminal domains of Wag31. They demonstrated that modulating Wag31 levels influences lipid homeostasis in M. smegmatis and cardiolipin (CL) localisation in cells. Wag31 was found to preferentially bind CL-containing liposomes, and deleting the N-terminus of the protein significantly decreased this interaction. Novel interactions between Wag31 and proteins involved in lipid metabolism and cell wall synthesis were identified, suggesting that Wag31 recruits proteins to the intracellular membrane domain by direct interaction.

      Strengths:

      (1) The importance of Wag31 in maintaining lipid homeostasis is supported by several lines of evidence.

      (2) The interaction between Wag31 and cardiolipin, and the role of the N-terminus in this interaction, was convincingly demonstrated.

      Weakness:

      (1) Interactome analysis with truncated versions of the proteins could not be performed in M. smegmatis due to protein instability.

    4. Reviewer #3 (Public review):

      Summary:

      This manuscript describes the characterization of mycobacterial cytoskeleton protein Wag31, examining its role in orchestrating protein-lipid and protein-protein interactions essential for mycobacterial survival. The most significant finding is that Wag31, which directs polar elongation and maintains the intracellular membrane domain, was revealed to have membrane tethering capabilities.

      Strengths:

      The authors provided a detailed analysis of Wag31 domain architecture, revealing distinct functional roles: the N-terminal domain facilitates lipid binding and membrane tethering, while the C-terminal domain mediates protein-protein interactions. Overall, this study offers a robust and new understanding of Wag31 function.

      Weaknesses:

      The authors did not address some of the comments. The following concerns should be addressed.

      • As far as I can tell, authors did not address my prior comments on Line 270, which is Line 280 in the revised manuscript: the N-terminal region is important for lipid homeostasis, but the statement in Line 270, "the maintenance of lipid homeostasis by Wag31 is a consequence of its tethering activity" requires additional proof. Please indicate the page and line numbers in the revised manuscript so that I can identify the specific changes the authors made.

      • Since this pull-down assay was conducted by mixing E. coli lysate expressing Wag31 and Msm lysate expressing Wag31 interactors like MurG, it is possible that the interactions are not direct. The authors acknowledge that this is a valid point, and indicated that they "will describe this caveat in the revised manuscript". I have difficulty finding where this revision was made. Please indicate the page and line numbers.

    5. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      This is a comprehensive study that sheds light on how Wag31 functions and localises in mycobacterial cells. A clear link to interactions with CL is shown using microscopy in combination with fluorescent fusion constructs and lipid-specific dyes. Furthermore, studies using mutant versions of Wag31 shed light on the functionalities of each domain in the protein. My concerns/suggestions for the manuscript are minor:

      (1) Ln 130. A better clarification/discussion is required here. It is clear that both depletion and overexpression have an effect on levels of various lipids, but subsequent descriptions show that they affect different classes of lipids.

      We thank the reviewer for the comment. We have added a clarification on this point in the discussion of the revised manuscript. The lipid classes affected by depletion of Wag31 differ from those affected by its overexpression. Wag31 is an adaptor protein that interacts with proteins of the ACCase complex (Meniche et al., 2014; Xu et al., 2014), which synthesize fatty acid precursors, and regulates their activity (Habibi Arejan et al., 2022).

      The varied response on lipid homeostasis could be attributed to a change in the stoichiometry of these interactions of Wag31. While Wag31 depletion would prevent such interactions from occurring and might affect lipid synthesis that directly depends on Wag31-protein partner interactions, its overexpression would lead to promiscuous interactions and a change in the stoichiometry of native interactions that would ultimately modulate lipid synthesis pathways.

      (2) The pulldown assays results are interesting, but links are tentative.

      We thank the reviewer for the comment. The interactome of Wag31 was identified through the immunoprecipitation of FLAG-Wag31 complemented at an integrative locus in Wag31 mutant background to avoid overexpression artifacts. We used Msm::gfp expressing an integrative copy (at L5 locus) of FLAG-GFP as a control to subtract non-specific interactions. The experiment was performed in biological triplicates, and interactors that appeared in all replicates but not in the control were selected for further analysis. Although we identified more than 100 interactors of Wag31, we analyzed only the top 25 hits, with a PSM cut-off ≥18 and unique peptides ≥5. Additionally, two of Wag31's established interactors, AccD5 and Rne, were among the top five hits, thus validating our data.
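
      The selection logic described in this response can be sketched as a small filter. This is only an illustration under stated assumptions: the two cut-offs are read as lower bounds, ranking by PSM count is our own guess at how the "top 25" were ordered, and the `Hit` record, the `select_interactors` function and all example values are hypothetical rather than taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one protein identified by MS/MS."""
    protein: str
    psm: int               # peptide-spectrum matches
    unique_peptides: int
    replicates_seen: int   # FLAG-Wag31 replicates in which the protein appeared
    in_gfp_control: bool   # also detected in the FLAG-GFP control


def select_interactors(hits, n_replicates=3, min_psm=18, min_unique=5, top_n=25):
    """Keep hits found in every replicate, absent from the control, and above both cut-offs."""
    kept = [
        h for h in hits
        if h.replicates_seen == n_replicates
        and not h.in_gfp_control
        and h.psm >= min_psm
        and h.unique_peptides >= min_unique
    ]
    # Assumption: candidates are ranked by PSM count before taking the top N.
    kept.sort(key=lambda h: h.psm, reverse=True)
    return kept[:top_n]


# Made-up values for illustration only; these are not data from the study.
example_hits = [
    Hit("hit_A", 120, 30, 3, False),
    Hit("hit_B", 95, 22, 3, False),
    Hit("hit_C", 40, 9, 2, False),   # missing from one replicate
    Hit("hit_D", 60, 12, 3, True),   # also present in the GFP control
    Hit("hit_E", 10, 3, 3, False),   # below both cut-offs
]
selected = [h.protein for h in select_interactors(example_hits)]
print(selected)
```

      Writing the filter out this way makes explicit that control subtraction and replicate consistency remove non-specific binders; neither step distinguishes direct from indirect interactions.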

      As mentioned in line 139 of the previous version of the manuscript, we agree that the interactions can either be direct or through a third partner. The fact that we obtained known interactors of Wag31 makes us believe these interactions are genuine. Moreover, for validation, we performed pulldown experiments by mixing E. coli lysates expressing His-Wag31 full-length or truncated protein with M. smegmatis lysates expressing FLAG-tagged interacting proteins. The wash conditions used for these pull-down assays were quite stringent: the wash buffer contained 1% Triton X-100, which eliminates all non-specific and indirect interactions. However, we agree that we cannot conclusively state that the interactions are direct without purifying the proteins and performing the experiment. As mentioned above, this caveat was stated in the previous version of the manuscript.

      (3) The authors may perhaps like to rephrase claims of effects on lipid homeostasis, as my understanding is that lipid localisation rather than catabolism/breakdown is affected.

      We thank the reviewer for the comment. In this manuscript, we are trying to convey that Wag31 is a spatiotemporal regulator of lipid metabolism. It is a peripheral protein that is hooked to the membrane via Cardiolipin and forms a scaffold at the poles, which helps localize several enzymes involved in lipid metabolism.

      Homeostasis is the process by which an organism maintains a steady state of balance and stability in response to changes. Depletion of Wag31 not only results in delocalisation of lipids in intracellular lipid inclusions but also leads to changes in the levels of various lipid classes. Advancement in the field of spatial biology underscores the importance of native localization of various biological molecules crucial for maintaining a steady state of the cell. Hence, we have used the word “homeostasis” to describe both of the changes observed in lipid metabolism.

      Reviewer #2 (Public review):

      Summary:

      Kapoor et al. investigated the role of the mycobacterial protein Wag31 in lipid and peptidoglycan synthesis and sought to delineate the role of the N- and C-terminal domains of Wag31. They demonstrated that modulating Wag31 levels influences lipid homeostasis in M. smegmatis and cardiolipin (CL) localisation in cells. Wag31 was found to preferentially bind CL-containing liposomes, and deleting the N-terminus of the protein significantly decreased this interaction. Novel interactions between Wag31 and proteins involved in lipid metabolism and cell wall synthesis were identified, suggesting that Wag31 recruits proteins to the intracellular membrane domain by direct interaction.

      Strengths:

      (1) The importance of Wag31 in maintaining lipid homeostasis is supported by several lines of evidence. (2) The interaction between Wag31 and cardiolipin, and the role of the N-terminus in this interaction was convincingly demonstrated.

      Weaknesses:

      (1) MS experiments provide some evidence for novel protein-protein interactions. However, the pulldown experiments lack a valid negative control.

      We thank the reviewer for the comment. We have included two non-interactors of Wag31, i.e. MmpL4 and MmpS5, which were not identified in our interactome database, as negative controls in the experiment. As shown in Figure S3, we performed His pull-down experiments with both of them independently twice, each time with a positive control (a known interactor of Wag31, Msm2092). Fig. S3b revised shows E. coli lysate expressing His-Wag31, which was incubated with Msm lysates expressing either FLAG-tagged MmpL4, MmpS5 or Msm2092 (revised Fig. S3c). The mixed lysates were pulled down with Cobalt beads that bind to the His-tagged protein and analysed by Western blot, probing with anti-FLAG antibody (revised Fig. S3d). The data presented confirm that the interactions validated through the pull-down assay were indeed specific.

      (2) The role of the N-terminus in the protein-protein interaction has not been ruled out.

      We thank the reviewer for the comment. Wag31<sub>Msm</sub> is a 272 amino acids long protein. The N-terminal of Wag31, which houses the DivIVA-domain, comprises the first 60 amino acids. Previously, we attempted to express the N-terminal (60 aa long) and the C-terminal (212 aa long) truncated proteins in various mycobacterial shuttle vectors to perform MS/MS experiments. Despite numerous efforts, neither expressed with the N/C-terminal FLAG tag or no tag in episomal or integrative vectors due to instability of the protein. Eventually, we successfully expressed the C-terminal Wag31 with an N- and C-terminal hexa-His tag. However, this expression was not sufficient or stable enough for us to perform Ni<sup>2+</sup>-affinity pull-down experiments for mass spectrometry. N-terminal of Wag31 could not be expressed in M. smegmatis even with N- and C-terminal hexa-His tags.

      To rule out the role of the N-terminal in mediating protein-protein interactions, we cloned the N-terminal of Wag31 that comprises the DivIVA-domain in pET28b vector (Fig. 7a revised). Subsequently, the truncated protein, hereafter called Wag31<sub>∆C</sub>, flanked by 6X His tags at both the termini, was expressed in E. coli and mixed with Msm lysates expressing interactors of Wag31 (Fig. 7b-c revised). Earlier experiments with Wag31<sub>∆1-60</sub> or Wag31<sub>∆N</sub> (in the revised manuscript) were performed with MurG, SepIVA, Msm2092 and AccA3 (Fig. 7e-g). Thus, we used the same set of interactors to test our hypothesis. Briefly, His-Wag31<sub>∆C</sub> was mixed with Msm lysates expressing either FLAG-MurG, -SepIVA, -Msm2092 or -AccA3 and pull down experiments were performed as described previously. FLAG-MmpS5, a non-interactor of Wag31, was used as a negative control. As shown in Fig. 7d revised, His-Wag31 could bind to all the four interactors whereas His-Wag31<sub>∆C</sub> couldn’t, strengthening the conclusion that interactions of Wag31 with other proteins are mediated by its C-terminal. However, we can’t ignore the possibility of other interactors binding to the N-terminal of Wag31. Unfortunately, due to poor expression/instability of Wag31<sub>∆C</sub> in mycobacterial shuttle vectors, we are unable to perform a global interactome analysis of Wag31<sub>∆C</sub>.

      Reviewer #3 (Public review):

      Summary:

      This manuscript describes the characterization of mycobacterial cytoskeleton protein Wag31, examining its role in orchestrating protein-lipid and protein-protein interactions essential for mycobacterial survival. The most significant finding is that Wag31, which directs polar elongation and maintains the intracellular membrane domain, was revealed to have membrane tethering capabilities.

      Strengths:

      The authors provided a detailed analysis of Wag31 domain architecture, revealing distinct functional roles: the N-terminal domain facilitates lipid binding and membrane tethering, while the C-terminal domain mediates protein-protein interactions. Overall, this study offers a robust and new understanding of Wag31 function.

      Weaknesses:

      The following major concerns should be addressed.

      • Authors use 10-N-Nonyl-acridine orange (NAO) as a marker for cardiolipin localization. However, given that NAO is known to bind to various anionic phospholipids, how do the authors know that what they are seeing is specifically visualizing cardiolipin and not a different anionic phospholipid? For example, phosphatidylinositol is another abundant anionic phospholipid in mycobacterial plasma membrane.

      We thank the reviewer for the comment. Despite its promiscuous binding to other anionic phospholipids, 10-N-Nonyl-acridine orange is widely used to stain Cardiolipin and determine its localisation in bacterial cells and mitochondria of eukaryotes (Garcia Fernandez et al., 2004; Mileykovskaya & Dowhan, 2000; Renner & Weibel, 2011). This is because it has a stronger affinity for Cardiolipin than for other anionic phospholipids, with an affinity constant of 2 × 10<sup>6</sup> M<sup>−1</sup> for Cardiolipin association and 7 × 10<sup>4</sup> M<sup>−1</sup> for phosphatidylserine and phosphatidylinositol association (Petit et al., 1992). Additionally, there is not yet another stain available for detecting Cardiolipin. Our protein-lipid binding assays suggest that Wag31 preferentially binds to Cardiolipin over other anionic phospholipids (Fig. 4b). Hence, it is likely that most of the redistribution of NAO fluorescence that we observe is contributed by Cardiolipin mislocalization due to altered Wag31 levels, with a smaller degree of NAO redistribution coming indirectly from other anionic phospholipids displaced from the membrane by the loss of membrane integrity and the cell shape changes caused by altered Wag31 levels.
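
      As a rough worked example of what the quoted constants imply, the sketch below computes the fold preference of NAO for Cardiolipin. The helper `relative_bound_share` is a hypothetical illustration that assumes a dilute regime in which dye binding scales with Ka × [lipid]; it ignores binding stoichiometry and the actual lipid composition of the mycobacterial membrane, so the equal-abundance case is only a reference point.

```python
# Association constants quoted in the response above (Petit et al., 1992), in M^-1.
KA_CARDIOLIPIN = 2e6
KA_PS_PI = 7e4

# Fold preference of NAO for cardiolipin over phosphatidylserine/phosphatidylinositol.
fold_preference = KA_CARDIOLIPIN / KA_PS_PI
print(f"NAO affinity for cardiolipin is ~{fold_preference:.0f}-fold higher")


def relative_bound_share(ka_a, conc_a, ka_b, conc_b):
    """Share of bound dye on lipid A, assuming binding is proportional to Ka * [lipid]."""
    weight_a = ka_a * conc_a
    weight_b = ka_b * conc_b
    return weight_a / (weight_a + weight_b)


# Hypothetical scenario: both lipid classes equally abundant (arbitrary units).
share_equal = relative_bound_share(KA_CARDIOLIPIN, 1.0, KA_PS_PI, 1.0)
# Hypothetical scenario: the competing lipid is ten times more abundant.
share_outnumbered = relative_bound_share(KA_CARDIOLIPIN, 1.0, KA_PS_PI, 10.0)
print(f"share on cardiolipin: {share_equal:.2f} (equal), {share_outnumbered:.2f} (1:10)")
```

      Under these simplifying assumptions the preference is large but not absolute, which is consistent with treating NAO as a Cardiolipin-biased rather than Cardiolipin-exclusive probe.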

      • Authors' data show that the N-terminal region of Wag31 is important for membrane tethering. The authors' data also show that the N-terminal region is important for sustaining mycobacterial morphology. However, the authors' statement in Line 256 "These results highlight the importance of tethering for sustaining mycobacterial morphology and survival" requires additional proof. It remains possible that the N-terminal region has another unknown activity, and this yet-unknown activity rather than the membrane tethering activity drives the morphological maintenance. Similarly, the N-terminal region is important for lipid homeostasis, but the statement in Line 270, "the maintenance of lipid homeostasis by Wag31 is a consequence of its tethering activity" requires additional proof. The authors should tone down these overstatements or provide additional data to support their claims.

      We agree with the reviewer that there exists a possibility for another function of the N-terminal that may contribute to sustaining mycobacterial physiology and survival. We would revise our statements in the paper to reflect the data. Results shown suggest that the tethering activity of the N-terminal region may contribute to mycobacterial morphology and survival. However, additional functions of this region can’t be ruled out. Similarly, the maintenance of lipid homeostasis by Wag31 may be associated with its tethering activity, although other mechanisms could also contribute to this process.

      • Authors suggest that Wag31 acts as a scaffold for the IMD (Fig. 8). However, Meniche et al. have shown that MurG as well as GlfT2, two well-characterized IMD proteins, do not colocalize with Wag31 (DivIVA) (https://doi.org/10.1073/pnas.1402158111). IMD proteins are always slightly subpolar while Wag31 is located at the tip of the cell. Therefore, the authors' biochemical data cannot be easily reconciled with microscopic observations in the literature. This raises a question regarding the validity of the protein-protein interactions shown in Figure 7. Since this pull-down assay was conducted by mixing E. coli lysate expressing Wag31 and Msm lysate expressing Wag31 interactors like MurG, it is possible that the interactions are not direct. Authors should interpret their data more cautiously. If authors cannot provide additional data and sufficient justifications, they should avoid proposing a confusing model like Figure 8 that contradicts published observations.

      In the literature, MurG and GlfT2 have been shown to have polar localisation (Freeman et al., 2023; Hayashi et al., 2016; Kado et al., 2023) and two groups have shown slightly sub-polar localisation of MurG (García-Heredia et al., 2021; Meniche et al., 2014). Additionally, Freeman et al. (2023) showed SepIVA to be a spatio-temporal regulator of MurG. MS/MS analysis of Wag31 immunoprecipitation data yielded both MurG and SepIVA as interactors of Wag31 (Fig. 3). Given that Wag31 also displays polar localisation, it is likely that it associates with the polar MurG. However, since a sub-polar localisation of MurG has also been reported, it is possible that they do not interact directly and another protein mediates their interaction. Based on the above, we will modify the model proposed in Fig. 8.

      For validation of the interactions, we performed pulldown experiments by mixing E. coli lysates expressing His-Wag31 full-length or truncated protein with M. smegmatis lysates expressing FLAG-tagged interacting proteins. The wash conditions used for these pull-down assays were quite stringent: the wash buffer contained 1% Triton X-100, which eliminates all non-specific and indirect interactions. However, we agree that we cannot conclusively state that the interactions are direct without purifying the proteins and performing the experiment. We will describe this caveat in the revised manuscript and propose a model that reflects the results we obtained.

      References:

      Freeman, A. H., Tembiwa, K., Brenner, J. R., Chase, M. R., Fortune, S. M., Morita, Y. S., & Boutte, C. C. (2023). Arginine methylation sites on SepIVA help balance elongation and septation in Mycobacterium smegmatis. Mol Microbiol, 119(2), 208-223. https://doi.org/10.1111/mmi.15006

      Garcia Fernandez, M. I., Ceccarelli, D., & Muscatello, U. (2004). Use of the fluorescent dye 10-N-nonyl acridine orange in quantitative and location assays of cardiolipin: a study on different experimental models. Anal Biochem, 328(2), 174-180. https://doi.org/10.1016/j.ab.2004.01.020

      García-Heredia, A., Kado, T., Sein, C. E., Puffal, J., Osman, S. H., Judd, J., Gray, T. A., Morita, Y. S., & Siegrist, M. S. (2021). Membrane-partitioned cell wall synthesis in mycobacteria. eLife, 10. https://doi.org/10.7554/eLife.60263

      Habibi Arejan, N., Ensinck, D., Diacovich, L., Patel, P. B., Quintanilla, S. Y., Emami Saleh, A., Gramajo, H., & Boutte, C. C. (2022). Polar protein Wag31 both activates and inhibits cell wall metabolism at the poles and septum. Front Microbiol, 13, 1085918. https://doi.org/10.3389/fmicb.2022.1085918

      Hayashi, J. M., Luo, C. Y., Mayfield, J. A., Hsu, T., Fukuda, T., Walfield, A. L., Giffen, S. R., Leszyk, J. D., Baer, C. E., Bennion, O. T., Madduri, A., Shaffer, S. A., Aldridge, B. B., Sassetti, C. M., Sandler, S. J., Kinoshita, T., Moody, D. B., & Morita, Y. S. (2016). Spatially distinct and metabolically active membrane domain in mycobacteria. Proc Natl Acad Sci U S A, 113(19), 5400-5405. https://doi.org/10.1073/pnas.1525165113

      Kado, T., Akbary, Z., Motooka, D., Sparks, I. L., Melzer, E. S., Nakamura, S., Rojas, E. R., Morita, Y. S., & Siegrist, M. S. (2023). A cell wall synthase accelerates plasma membrane partitioning in mycobacteria. eLife, 12, e81924. https://doi.org/10.7554/eLife.81924

      Meniche, X., Otten, R., Siegrist, M. S., Baer, C. E., Murphy, K. C., Bertozzi, C. R., & Sassetti, C. M. (2014). Subpolar addition of new cell wall is directed by DivIVA in mycobacteria. Proc Natl Acad Sci U S A, 111(31), E3243-3251. https://doi.org/10.1073/pnas.1402158111

      Mileykovskaya, E., & Dowhan, W. (2000). Visualization of phospholipid domains in Escherichia coli by using the cardiolipin-specific fluorescent dye 10-N-nonyl acridine orange. J Bacteriol, 182(4), 1172-1175. https://doi.org/10.1128/JB.182.4.1172-1175.2000

      Petit, J. M., Maftah, A., Ratinaud, M. H., & Julien, R. (1992). 10N-nonyl acridine orange interacts with cardiolipin and allows the quantification of this phospholipid in isolated mitochondria. Eur J Biochem, 209(1), 267-273. https://doi.org/10.1111/j.1432-1033.1992.tb17285.x

      Renner, L. D., & Weibel, D. B. (2011). Cardiolipin microdomains localize to negatively curved regions of Escherichia coli membranes. Proc Natl Acad Sci U S A, 108(15), 6264-6269. https://doi.org/10.1073/pnas.1015757108

      Schägger, H. (2006). Tricine-SDS-PAGE. Nat Protoc, 1(1), 16-22. https://doi.org/10.1038/nprot.2006.4

      Xu, W. X., Zhang, L., Mai, J. T., Peng, R. C., Yang, E. Z., Peng, C., & Wang, H. H. (2014). The Wag31 protein interacts with AccA3 and coordinates cell wall lipid permeability and lipophilic drug resistance in Mycobacterium smegmatis. Biochem Biophys Res Commun, 448(3), 255-260. https://doi.org/10.1016/j.bbrc.2014.04.116

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Ln 130. A better clarification/discussion is required here. It is clear that both depletion and overexpression have an effect on levels of various lipids, but subsequent descriptions show that they affect different classes of lipids.

      We thank the reviewer for the comment. We have included a clarification for this in the discussion section.

      (2) The pulldown assays results are interesting, but the links are tentative.

      We thank the reviewer for the comment. The interactome of Wag31 was identified through the immunoprecipitation of FLAG-tagged Wag31 complemented at an integrative locus in Wag31 mutant background to avoid overexpression artifacts. We used Msm::gfp expressing an integrative copy (at L5 locus) of FLAG-GFP as a control to subtract non-specific interactions. The experiment was performed in biological triplicates, and interactors that appeared in all replicates were selected for further analysis. Although we identified more than 100 interactors of Wag31, we analyzed only the top 25 hits, with a PSM cut-off ≥18 and unique peptides ≥5. Additionally, two of Wag31's established interactors, AccD5 and Rne, were among the top five hits, thus validating our data.

      Though we agree that the interactions can either be direct or through a third partner, the fact that we obtained known interactors of Wag31 makes us believe these interactions are genuine. Moreover, for validation, we performed pulldown experiments by mixing E. coli lysates expressing His-Wag31 full-length or truncated protein with M. smegmatis lysates expressing FLAG-tagged interacting proteins. The wash conditions used for these pull-down assays were quite stringent: the wash buffer contained 1% Triton X-100, which eliminates all non-specific and indirect interactions. However, we agree that we cannot conclusively state that the interactions are direct without purifying the proteins and performing the experiment. We will describe this caveat in the revised manuscript.

      (3) The authors may perhaps like to rephrase claims of effects on lipid homeostasis, as my understanding is that lipid localisation rather than catabolism/breakdown is affected.

      We thank the reviewer for the comment. In this manuscript, we are trying to convey that Wag31 is a spatiotemporal regulator of lipid metabolism. It is a peripheral protein that is hooked to the membrane via Cardiolipin and forms a scaffold at the poles, which helps localize several enzymes involved in lipid metabolism.

      Homeostasis is the process by which an organism maintains a steady state of balance and stability in response to changes. Depletion of Wag31 not only results in delocalisation of lipids in intracellular lipid inclusions but also leads to changes in the levels of various lipid classes. Advancement in the field of spatial biology underscores the importance of native localization of various biological molecules crucial for maintaining a steady state of the cell. Hence, we have used the word “homeostasis” to describe both of the changes observed in lipid metabolism.

      Reviewer #2 (Recommendations for the authors):

      I recommend the following experiments to strengthen the data presented:

      (1) Include a non-interacting FLAG-tagged protein as a negative control in the pull-down experiment to strengthen this data.

      We thank the reviewer for the comment. As suggested, we have included non-interacting FLAG-tagged proteins as negative controls in the pulldown experiment. We chose MmpL4 and MmpS5, which were not found in the Wag31 interactome data. We performed pull-down experiments with both of them and included an interactor of Wag31, i.e. Msm2092, as a positive control. Fig. S3b revised shows E. coli lysate expressing His-Wag31, which was incubated with Msm lysates expressing either FLAG-tagged MmpL4, MmpS5 or Msm2092 (Fig. S3c revised). The mixed lysates were pulled down with Cobalt beads that bind to the His-tagged protein and analysed by Western blot, probing with anti-FLAG antibody. The pull down experiments were performed independently twice, every time with Msm2092 as the positive control (Fig. S3d revised).

      (2) Perform the pull-down experiments using only the Wag31 N-terminus to rule out any role that it may have in the protein-protein interactions.

      We thank the reviewer for the comment. To rule out the possibility of the N-terminal of Wag31 mediating protein-protein interactions, we cloned the N-terminal of Wag31 that comprises the DivIVA-domain in pET28b vector (Fig. 7a revised). Subsequently, the truncated protein, hereafter called Wag31<sub>∆C</sub>, flanked by 6X His tags at both the termini, was expressed in E. coli and mixed with Msm lysates expressing interactors of Wag31 (Fig. 7b-c revised). Earlier experiments with Wag31<sub>∆1-60</sub> or Wag31<sub>∆N</sub> were performed with MurG, SepIVA, Msm2092 and AccA3 (Fig. 7 previous), so we used the same set of interactors to test our hypothesis. Briefly, His-Wag31<sub>∆C</sub> was mixed with Msm lysates expressing either FLAG-MurG, -SepIVA, -Msm2092 or -AccA3 and pull down experiments were performed as described previously. FLAG-MmpS5, a non-interactor of Wag31, was used as a negative control. As shown in Fig. 7d revised, His-Wag31 could bind to all the four interactors whereas His-Wag31<sub>∆C</sub> couldn’t, strengthening the conclusion that interactions of Wag31 with other proteins are mediated by its C-terminal. However, we can’t ignore the possibility of other proteins binding to the N-terminal of Wag31. Unfortunately, due to poor expression/instability of Wag31<sub>∆C</sub> in mycobacterial shuttle vectors, we couldn’t perform a global interactome analysis of Wag31<sub>∆C</sub>.

      Minor comments:

      - Please check the legend of Fig. 1g, it appears to be labelled incorrectly.

      We have checked it. It is correct. Fig. 1g shows the percentages of cells of the three strains, i.e. Msm+ATc, Δwag31-ATc, and Δwag31+ATc, displaying rod, round or bulged morphology.

      - For MS/MS analysis, a GFP control is mentioned but it is not indicated how this was incorporated in the data analysis. This information should be added.

      We have incorporated that in the revised methodology.

      - The information presented in Fig. 3a, e and f could be combined in one table.

      We appreciate the idea of the reviewer but we prefer a pictorial representation of the data. It allows readers to consume the information in parts, make quicker comparisons and understand trends easily.

      - Fig. 4c Wag31K20A appears smaller in size than the wild-type protein - why is this the case? Is this not a single amino acid substitution?

      Though K20A is a single amino acid substitution, it alters the mobility of Wag31 on SDS-PAGE gel. The sequence analysis of the plasmid expressing Wag31<sub>K20A</sub> doesn’t show additional mutations other than the desired K20A. The change in mobility could be due to a change in the conformation of Wag31<sub>K20A</sub>, in its ability to bind SDS, or both, any of which would modify its mobility under the influence of an electric field.

      - Please clarify what is contained in the first panel of fig 4e. compared to what is in the second panel.

      The first panel represents CL-Dil-Liposomes before incubation with Wag31-GFP and the second panel shows CL-Dil-Liposomes after incubation with Wag31-GFP. The third panel shows the mixture as observed in the green channel to investigate the localisation of Wag31-GFP in the liposome-protein mix. The fourth panel shows the merge of the second and third panels.

      - The data in Fig 6d suggests higher levels of CL in the ∆wag31 compared to wild-type - how do the authors reconcile this with the MS data in Fig. 2g showing lower CL levels?

      Fig. 6d represents the distribution of CL localisation in the tested strains of mycobacteria whereas Fig. 2g shows the absolute levels of CL in various strains. We place greater confidence in the lipidomics data, which suggest downregulation of CL species. The NAO staining and microscopy serve only to study the localization of CL along the cell, and cannot be used to reliably quantify CL levels. Staining with a probe such as NAO depends on factors such as the hydrophobicity and permeability of the cell wall, which we expect to be severely altered in a Wag31 mutant. Therefore, the increased NAO staining seen in the Wag31 mutant could simply reflect increased uptake of the dye rather than absolute levels of CL. The specificity of staining and localization, however, can be expected to be unaltered.

      Reviewer #3 (Recommendations for the authors):

      Following are suggestions for improving the writing and presentation.

      • Figure 1, the meaning of the yellow arrows present in f and h should be mentioned in the figure legend.

      We have incorporated that in the revised legend. In Fig. 1f, the yellow arrowhead represents the bulged pole morphology whereas in Fig. 1h, it indicates intracellular lipid inclusions.

      • Figure 7 legend refers to panels g, h, and i. However, Figure 7 only has panels a-c. The legend lacks a description of panel c.

      We have corrected the typos and the legend.

      • Figure S1, F2-R2 and F3-R3 expected sizes should be stated in the legend of the figure.

      We have updated the legends.

      • Figure S5, is this the same figure as 5e? If so, there is no need for this figure.

      We have removed Fig. S5.

      • Methods need to be written more carefully with enough details. I listed some of the concerns below.

      Detailed methodology was previously provided in the supplementary material and now we have moved it to the materials and methods in the revised manuscript.

      • Line 392, provide more details on western blotting. What is the secondary antibody? What image documentation system was used?

      We have updated the methodology.

      • Line 400, while the methods may be the same as the reference 64, authors should still provide key details such as the way samples were fixed and processed for SEM and TEM.

      We have provided a detailed description of the same in methodology in the revised version.

      • Line 437, how do authors calculate the concentration of liposome to be 10 µM? Do they possibly mean the concentration of phospholipids used to make the liposomes?

      Yes, this is the concentration of total lipids used to make liposomes. 1 μM of Wag31 or its mutants was mixed with 100 nm extruded liposomes containing 10 μM total lipid in separate Eppendorf tubes.

      • Supplemental Line 9, "turns of" should read "turns off".

      We have edited this.

      • Supplemental Line 13, define LHS and RHS.

      LHS or left hand sequence and RHS or right hand sequence refers to the upstream and downstream flanking regions of the gene of interest.

      • Supplemental Line 20, indicate the manufacturer of the microscope and type of the objective lens.

      We have added these details now.

      • Supplemental Line 31, define MeOH, or use a chemical formula like chloroform.

      MeOH is methanol. We have provided a chemical formula in the revised version.

      • Supplemental Line 53, indicate the concentration of trypsin.

      We have included that in the revised version.

      • Supplemental Line 72, g is not a unit. "30,000 g" should be "30,000x g".

      We have revised this in the manuscript.

      • Supplemental Line 114, provide more details on western blotting. What is the manufacturer of anti-FLAG antibody? What is the secondary antibody? How was the antibody binding visualized? What image documentation system was used?

      We have provided these details in the revised version.

    1. eLife Assessment

      This important study reports a reanalysis of one experiment of a previously-published report to characterize the dynamics of neural population codes during visual working memory in the presence of distracting information. This paper presents solid evidence that working memory representations are dynamic and distinct from sensory representations of intervening distractions. This research will be of interest to cognitive neuroscientists working on the neural bases of visual perception and memory.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, the authors re-analyzed a public dataset (Rademaker et al, 2019, Nature Neuroscience) which includes fMRI and behavioral data recorded while participants held an oriented grating in visual working memory (WM) and performed a delayed recall task at the end of an extended delay period. In that experiment, participants were pre-cued on each trial as to whether there would be a distracting visual stimulus presented during the delay period (filtered noise or randomly-oriented grating). In this manuscript, the authors focused on identifying whether the neural code in retinotopic cortex for remembered orientation was 'stable' over the delay period, such that the format of the code remained the same, or whether the code was dynamic, such that information was present, but encoded in an alternative format. They identify some timepoints - especially towards the beginning/end of the delay - where the multivariate activation pattern fails to generalize to other timepoints, and interpret this as evidence for a dynamic code. Additionally, the authors compare the representational format of remembered orientation in the presence vs absence of a distracting stimulus, averaged over the delay period. This analysis suggested a 'rotation' of the representational subspace between distracting orientations and remembered orientations, which may help preserve simultaneous representations of both remembered and viewed stimuli. Intriguingly, this rotation was a bit smaller for Expt 2, in which the orientation distractor had a greater impact on the participants' working memory recall performance, suggesting that more separation between subspaces is critical for preserving intact working memory representations.

      Strengths:

      (1) Direct comparisons of coding subspaces/manifolds between timepoints, task conditions, and experiments are an innovative and useful approach for understanding how neural representations are transformed to support cognition

      (2) Re-use of existing dataset substantially goes beyond the authors' previous findings by comparing geometry of representational spaces between conditions and timepoints, and by looking explicitly for dynamic neural representations

      (3) Simulations testing whether dynamic codes can be explained purely by changes in data SNR are an important contribution, as this rules out a category of explanations for the dynamic coding results observed

      Weaknesses:

      (1) Primary evidence for 'dynamic coding', especially in early visual cortex, appears to be related to the transition between encoding/maintenance and maintenance/recall, but the delay period representations seem overall stable, consistent with some previous findings. However, given the simulation results, the general result that representations may change in their format appears solid, though the contribution of different trial phases remains important for considering the overall result.

      (2) Converting a continuous decoding metric (angular error) to "% decoding accuracy" serves to obfuscate the units of the actual results. Decoding precision (e.g., sd of decoding error histogram) would be more interpretable and better related to both the previous study and behavioral measures of WM performance.
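
      To illustrate the precision metric suggested in point (2), a minimal sketch follows. It assumes decoded and true orientations are given in degrees on a 180° cycle; the function name and the choice of the circular standard deviation are assumptions made for illustration and are not taken from the study under review.

```python
import numpy as np

def orientation_error_sd(decoded_deg, true_deg):
    """Circular SD (in degrees) of decoding errors for orientations in [0, 180)."""
    decoded = np.asarray(decoded_deg, dtype=float)
    true = np.asarray(true_deg, dtype=float)
    # Signed decoding error wrapped into [-90, 90)
    err = (decoded - true + 90.0) % 180.0 - 90.0
    # Orientation is 180-degree periodic, so double the angles to use the full circle
    doubled = np.deg2rad(2.0 * err)
    # Mean resultant length, clipped to 1 to guard against rounding
    r = min(np.hypot(np.cos(doubled).mean(), np.sin(doubled).mean()), 1.0)
    # Circular SD in doubled-angle space, halved to return to orientation space
    return float(np.rad2deg(np.sqrt(-2.0 * np.log(r))) / 2.0)
```

      Unlike a thresholded "% accuracy", this quantity stays in degrees, so it could be compared directly with the spread of behavioral recall errors.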

    3. Reviewer #2 (Public review):

      Summary:

      In this work, Degutis and colleagues addressed an interesting issue related to the concurrent coding of sensory percepts and visual working memory contents in visual cortices. They used generalization analyses to test whether working memory representations change over time, diverge from sensory percepts, and vary across distraction conditions. Temporal generalization analysis demonstrated that off-diagonal decoding accuracies were lower than on-diagonal decoding accuracies, regardless of the presence of intervening distractions, implying that working memory representations can change over time. They further showed that the coding space for working memory contents showed subtle but statistically significant changes over time, potentially explaining the impaired off-diagonal decoding performance. The neural coding of sensory distractions instead remained largely stable. Generalization analyses between target and distractor codes showed overlaps but were not identical. Cross-condition decodings had lower accuracies compared to within-condition decodings. Finally, within-condition decoding revealed more reliable working memory representations in the condition with intervening random noises compared to cross-condition decoding using a trained classifier on data from the no-distraction condition, indicating a change in the VWM format between the noise distractor and no-distractor trials.

      Strengths:

      This paper demonstrates a clever use of generalization analysis to show changes in the neural codes of working memory contents across time and distraction conditions. It provides some insights into the differences between representations of working memory and sensory percepts, and how they can potentially coexist in overlapping brain regions.

      Comments on revisions:

      I appreciate the authors' efforts in addressing my previous concerns. The inclusion of additional analyses and data has strengthened the paper. I have no further concerns.

    1. eLife Assessment

      This study describes an improved adaptive sampling approach, multiple-walker Supervised Molecular Dynamics (mwSuMD), and its application to G protein-coupled receptors (GPCRs), which are the most abundant membrane proteins and key targets for drug discovery. The manuscript provides solid evidence that the mwSuMD approach can assist in the sampling of complex binding processes, leading to useful findings for GPCR activity, including resolution of interactions not seen experimentally. The method has the potential to have broad applicability in structural biology and pharmacology.

    2. Reviewer #1 (Public review):

      Summary:

      The authors investigate ligand and protein-binding processes in GPCRs (including dimerization) by the multiple walker supervised molecular dynamics method. The paper is interesting and it is very well written.

      Strengths:

      The authors' method is a powerful tool to gain insight on the structural basis for the pharmacology of G protein-coupled receptors.

    3. Reviewer #2 (Public review):

      The study by Deganutti and co-workers is a methodological report on an adaptive sampling approach, multiple walker supervised molecular dynamics (mwSuMD), which represents an improved version of the previous SuMD.

      Case studies concern complex conformational transitions in a number of G protein-coupled receptors (GPCRs) involving long time-scale motions such as binding-unbinding and collective motions of domains or portions. GPCRs are specialized GEFs (guanine nucleotide exchange factors) of heterotrimeric Gα proteins of the Ras GTPase superfamily. They constitute the largest superfamily of membrane proteins and are of central biomedical relevance as privileged targets of currently marketed drugs.

      MwSuMD was exploited to address:

      a) binding and unbinding of the arginine-vasopressin (AVP) cyclic peptide agonist to the V2 vasopressin receptor (V2R);

      b) molecular recognition of the β2-adrenergic receptor (β2-AR) and heterotrimeric GDP-bound Gs protein;

      c) molecular recognition of the A1-adenosine receptor (A1R) and palmitoylated and geranylgeranylated membrane-anchored heterotrimeric GDP-bound Gi protein;

      d) the whole process of GDP release from membrane-anchored heterotrimeric Gs following interaction with the glucagon-like peptide 1 receptor (GLP1R), converted to the active state following interaction with the orthosteric non-peptide agonist danuglipron.

      The revised version has improved clarity and rigor compared to the original, also thanks to the reduction in the number of complex case studies treated superficially.

      The mwSuMD method is solid and valuable, has wide applicability and is compatible with the most widely used MD engines. It may be of interest to the computational structural biology community.

      The huge amount of high-resolution data on GPCRs makes those systems suitable, although challenging, for method validation and development.

      While the approach is less energy-biased than other enhanced sampling methods, knowledge, at the atomic detail, of binding sites/interfaces and conformational states is needed to define the supervised metrics; the higher the resolution of such metrics, the more accurate the outcome is expected to be. Definition of the metrics is a user- and system-dependent process.

    4. Reviewer #3 (Public review):

      Summary:

      In the present work Deganutti et al. report a structural study on GPCR functional dynamics using a computational approach called supervised molecular dynamics.

      Strengths:

      The study has the potential to provide novel insight into GPCR functionality. An example is the interaction between D344 and R385 identified during the Gs coupling by GLP-1R. However, validation of the findings, even computationally through for instance in silico mutagenesis study, is advisable.

      Weaknesses:

      No significant advance of the existing structural data on GPCR and GPCR/G protein coupling is provided. Most of the results are reproductions of the previously reported structures.

    5. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors investigate ligand and protein-binding processes in GPCRs (including dimerization) by the multiple walker supervised molecular dynamics method. The paper is interesting and it is very well written.

      Strengths:

      The authors' method is a powerful tool to gain insight on the structural basis for the pharmacology of G protein-coupled receptors.

      We thank the Reviewer for the positive comment on the manuscript and the proposed methods.

      Reviewer #2 (Public review):

      The study by Deganutti and co-workers is a methodological report on an adaptive sampling approach, multiple walker supervised molecular dynamics (mwSuMD), which represents an improved version of the previous SuMD.

      Case studies concern complex conformational transitions in a number of G protein-coupled receptors (GPCRs) involving long time-scale motions such as binding-unbinding and collective motions of domains or portions. GPCRs are specialized GEFs (guanine nucleotide exchange factors) of heterotrimeric Gα proteins of the Ras GTPase superfamily. They constitute the largest superfamily of membrane proteins and are of central biomedical relevance as privileged targets of currently marketed drugs.

      MwSuMD was exploited to address:

      a) binding and unbinding of the arginine-vasopressin (AVP) cyclic peptide agonist to the V2 vasopressin receptor (V2R);

      b) molecular recognition of the β2-adrenergic receptor (β2-AR) and heterotrimeric GDP-bound Gs protein;

      c) molecular recognition of the A1-adenosine receptor (A1R) and palmitoylated and geranylgeranylated membrane-anchored heterotrimeric GDP-bound Gi protein;

      d) the whole process of GDP release from membrane-anchored heterotrimeric Gs following interaction with the glucagon-like peptide 1 receptor (GLP1R), converted to the active state following interaction with the orthosteric non-peptide agonist danuglipron.

      The revised version has improved clarity and rigor compared to the original also thanks to the reduction in the number of complex case studies treated superficially.

      The mwSuMD method is solid and valuable, has wide applicability and is compatible with the most widely used MD engines. It may be of interest to the computational structural biology community.

      The huge amount of high-resolution data on GPCRs makes those systems suitable, although challenging, for method validation and development.

      While the approach is less energy-biased than other enhanced sampling methods, knowledge, at the atomic detail, of binding sites/interfaces and conformational states is needed to define the supervised metrics; the higher the resolution of such metrics, the more accurate the outcome is expected to be. Definition of the metrics is a user- and system-dependent process.

      We thank the Reviewer for the positive comment on the revised manuscript and mwSuMD. We agree that the choice of supervised metrics is user- and system-dependent. We aim to improve this aspect in the future with the aid of interpretable machine learning.

      Reviewer #3 (Public review):

      Summary:

      In the present work Deganutti et al. report a structural study on GPCR functional dynamics using a computational approach called supervised molecular dynamics.

      Strengths:

      The study has the potential to provide novel insight into GPCR functionality. An example is the interaction between D344 and R385 identified during the Gs coupling by GLP-1R. However, validation of the findings, even computationally through for instance in silico mutagenesis study, is advisable.

      Weaknesses:

      No significant advance of the existing structural data on GPCR and GPCR/G protein coupling is provided. Most of the results are reproductions of the previously reported structures.

      The method at the focus of our study (mwSuMD) is an enhancement of supervised molecular dynamics that allows supervising two metrics at the same time and uses a score, rather than a tabù-like algorithm, for handling the simulation. Further changes are the seeding of parallel short replicas (walkers) rather than a series of short simulations, and the software implementation on different MD engines (e.g. Acemd, OpenMM, NAMD, Gromacs).

      We agree with the Reviewer that experimental validation of the findings would be advisable, in line with any computational prediction. We are positive that future studies from our group employing mwSuMD will inform mutagenesis and BRET-based experiments.

      Reviewer #2 (Recommendations for the authors):

      As for GLP1R, I remain convinced that the 7LCI would have been better as a reference for all simulations than 7LCJ, also because 7LCI holds a slightly more complete ECD.

      We agree that 7LCI would have been a better starting point than 7LCJ for simulations because it presents the stalk region, contrary to 7LCJ. However, we do not think this would have influenced the output because the stalk is the most flexible segment of GLP1R, and any initial conformation is usually not retained during MD simulations.

      Please correct everywhere the definition of the 6LN2 structure of GLP1R as ligand-free or apo, because that structure is indeed bound to a negative allosteric modulator docked on the cytosolic end of helix-6.

      We thank the reviewer for this clarification. The text has been modified accordingly.

      As for the beta2-AR, the "full-length" AlphaFold model downloaded from the GPCRdb is not an intermediate active state because it is very similar to the receptor in the 3SN6 complex with Gs. Please, eliminate the inappropriate and speculative adjective "intermediate".

      We have changed “intermediate” to “not fully active”, which is less speculative since full activation can be achieved only in the presence of the G protein.

      Incidentally, in that model, the C-tail, eliminated by the authors, is completely wrong and occupies the G protein binding site. It is not clear to me why the authors preferred to use an AlphaFold model as an input of simulations rather than a high-resolution structural model, e.g. 4LDO. Perhaps the reason is that all ICL regions, including ICL3, were modeled by AlphaFold, even if with low confidence. I disagree with that choice.

      We understand the reviewer’s point of view. Should we have simulated an “equilibrium” receptor-ligand complex, we would have made the same choice. However, the conformational changes occurring during G protein binding are so substantial that the starting conformation of the receptor becomes almost irrelevant as long as a sensible structure is used.

      Reviewer #3 (Recommendations for the authors):

      The revised version of the manuscript is more concise, focusing only on two systems. However, the authors have responded superficially to the reviewers' comments, merely deleting sections of text, making minor corrections, or adding small additions to the text. In particular, the authors have not addressed the main critical points raised by both Reviewer 2 and Reviewer 3. 

      For example, the RMSD values for the binding of PF06882961 to GLP-1R remain high, raising doubts about the predictive capabilities of the method, at least for this type of system.

      What is the RMSD of the ligand relative to the experimental pose obtained in the simulations? This value must be included in the text.

      We have added this piece of information about PF06882961 RMSD in the text, which on page 6 now reads “We simulated the binding of PF06882961, reaching an RMSD to its bound conformation in 7LCJ of 3.79 ± 0.83 Å (computed on the second half of the merged trajectory, superimposing on GLP-1R Cα atoms of TMD residues 150 to 390), using multistep supervision on different system metrics (Figure 2) to model the structural hallmark of GLP-1R activation (Video S5, Video S6).”
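
      The quantity quoted in this reply (ligand RMSD after superposition on a set of receptor atoms) can be sketched as follows. This is a minimal illustration, assuming coordinates are already available as NumPy arrays with matching atom order; the function names are hypothetical and the code is not the authors' analysis pipeline.

```python
import numpy as np

def kabsch(mobile, target):
    """Rotation R and translation t that best superimpose `mobile` onto `target` (N x 3 arrays)."""
    mobile_center = mobile.mean(axis=0)
    target_center = target.mean(axis=0)
    # Covariance of the centered coordinate sets
    h = (mobile - mobile_center).T @ (target - target_center)
    u, _, vt = np.linalg.svd(h)
    # Correct for a possible reflection so that R is a proper rotation
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    trans = target_center - rot @ mobile_center
    return rot, trans

def ligand_rmsd(frame_fit, frame_lig, ref_fit, ref_lig):
    """RMSD of ligand atoms after fitting the frame onto the reference using the fit atoms only."""
    rot, trans = kabsch(frame_fit, ref_fit)
    moved = frame_lig @ rot.T + trans
    return float(np.sqrt(((moved - ref_lig) ** 2).sum(axis=1).mean()))
```

      Applying `ligand_rmsd` to each frame of the second half of a trajectory and then taking the mean and standard deviation would give a value of the form reported above.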

      Similarly, the activation mechanism of GLP-1R is only partially simulated.

      Furthermore, it is not particularly meaningful to justify the high RMSD values of the SuMD simulations for the binding of Gs to GLP-1R by comparing them with those reported under unbiased MD conditions. "Replica 2, in particular, well reproduced the cryo-EM GLP-1R complex as suggested by RMSDs to 7LCI of 7.59 ± 1.58 Å, 12.15 ± 2.13 Å, and 13.73 ± 2.24 Å for Gα, Gβ, and Gγ respectively. Such values are not far from the RMSDs measured in our previous simulations of GLP-1R in complex with Gs and GLP-1<sup>49</sup> (Gα = 6.18 ± 2.40 Å; Gβ = 7.22 ± 3.12 Å; Gγ = 9.30 ± 3.65 Å), which indicates overall higher flexibility of Gβ and Gγ compared to Gα, which acts as a sort of fulcrum bound to GLP-1R."

      Without delving into the accuracy of the various calculations, the authors should acknowledge that comparing protein structures with such high RMSD values has no meaningful significance in terms of convergence toward the same three-dimensional structure.

      The text has been edited to accommodate the reviewer’s suggestion and still give the readers the measure of the high flexibility of Gs bound to GLP-1R. It now reads “Such values do not support convergence with the static experimental structure but are not far from the RMSDs measured in our previous simulations of GLP-1R in complex with G<sub>s</sub> and GLP-1 (G<sub>α</sub> = 6.18 ± 2.40 Å; G<sub>β</sub> = 7.22 ± 3.12 Å; G<sub>γ</sub> = 9.30 ± 3.65 Å), which indicates overall higher flexibility of G<sub>β</sub> and G<sub>γ</sub> compared to G<sub>α</sub>, which acts as a sort of fulcrum bound to GLP-1R.”

      Have the authors simulated the binding of the Gs protein using the experimentally active structure of GLP-1R in complex with the ligand PF06882961 (PDB ID 7LCJ)? Such a simulation would be useful to assess the quality of the binding simulation of Gs to the GLP1R/PF06882961 complex obtained from the previous SuMD.

      We considered performing the Gs binding simulation to the active structure of GLP-1R.

      However, the GLP-1R (and other class B receptors) fully active state, as reported in 7LCJ, depends on the presence of the Gs and can be reached only upon effector coupling. Since it is unlikely that the unbound receptor is already in the fully active state, we reasoned that considering it as a starting point for Gs binding simulations would have been an artifact.

      An example of the insufficient depth of the authors' replies can be seen in their response: "We note that among the suggested references, only Mafi et al report about a simulated G protein (in a pre-formed complex) and none of the work sampled TM6 rotation without input of energy."

      This statement is inaccurate. For instance, D'Amore et al. (Chem 2024, doi: 10.1016/j.chempr.2024.08.004) simulated Gs coupling to A2A as well as TM6 rotation, as did Maria-Solano and Choi (eLife 2023, doi: 10.7554/eLife.90773.1). The former employed path collective variables metadynamics, which is not cited in the introduction or the discussion, despite its relevance to the methodologies mentioned.

      Respectfully, our previous reply is correct, as all of the mentioned articles used enhanced (energy-biased) approaches, so the claim “none of the work sampled TM6 rotation without input of energy” stands. The reference to D’Amore et al. (published after the previous round of reviews of this manuscript) has been added to the introduction; we thank the reviewer for pointing it out. 

      Additionally, SuMD employs a tabu algorithm that applies geometric supervision to the simulation, serving as an alternative approach to enhancing sampling compared to the "input of energy" techniques as called by the authors. A fair discussion should clearly acknowledge this aspect of the SuMD methodology.

      We have now specified in the Methods that a tabù-like algorithm is part of SuMD, which, despite being the parent technique of mwSuMD, is not the focus of the present work. We provide extended references for readers interested in SuMD. mwSuMD, on the other hand, does not use a tabù-like algorithm but rather a continuative approach based on a score to select the best walker for each batch, as described in the Methods.
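
      The idea of selecting the best walker from a batch by a score on two supervised metrics can be sketched as below. This is only an illustration: the scoring formula (sum of signed relative changes), the `Walker` class, and the function names are assumptions introduced here, and the actual mwSuMD score is the one defined in the manuscript's Methods, which is not reproduced in this response.

```python
from dataclasses import dataclass

@dataclass
class Walker:
    """One short replica of a batch, with two supervised metrics at its start and end."""
    walker_id: int
    metric1_start: float
    metric1_end: float
    metric2_start: float
    metric2_end: float

def walker_score(w, want1, want2):
    """Hypothetical score: signed relative change of each metric, summed.

    want1 / want2 are +1 if the metric should increase, -1 if it should decrease.
    """
    change1 = (w.metric1_end - w.metric1_start) / abs(w.metric1_start)
    change2 = (w.metric2_end - w.metric2_start) / abs(w.metric2_start)
    return want1 * change1 + want2 * change2

def select_best_walker(batch, want1, want2):
    """Return the walker with the highest score; its final frame would seed the next batch."""
    return max(batch, key=lambda w: walker_score(w, want1, want2))
```

      For example, with a distance that should decrease (`want1=-1`) and a contact count that should increase (`want2=+1`), the walker that improves both metrics the most is carried forward, without rejecting or restarting the batch.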

    1. eLife Assessment

      In Plasmodium male gametocytes, rapid nuclear division occurs with an intact nuclear envelope, requiring precise coordination between nuclear and cytoplasmic events to ensure proper packaging of each nucleus into a developing gamete. This valuable study characterizes two proteins involved in the formation of Plasmodium berghei male gametes. By integrating live-cell imaging, ultrastructural expansion microscopy, and proteomics, this study convincingly identifies SUN1 and its interaction partner ALLAN as crucial nuclear envelope components in male gametogenesis. A role for SUN1 in membrane dynamics and lipid metabolism is less well supported. The results are of interest for general cell biologists working on unusual mitosis pathways.

      [Editors' note: this paper was reviewed by Review Commons.]

    2. Reviewer #1 (Public review):

      Summary:

      Activated male Plasmodium gametocytes undergo very rapid nuclear division, while keeping the nuclear envelope intact. There is interest in how events inside the nucleus are co-ordinated with events in the parasite cytoplasm, to ensure that each nucleus is packaged into a nascent male gamete.

      This manuscript by Zeeshan et al describes the organisation of a nuclear membrane bridging protein, SUN1, during nuclear division. SUN1 is expected from studies in other organisms to be a component of a bridging complex (LINC) that connects the inner nuclear membrane to the outer nuclear membrane, and from there to the cytoplasmic microtubule-organising centres, the centrosome and the basal body.

      The authors show that knockout of the SUN1 in gametocytes leads to severe disruption of the mitotic spindle and failure of the basal bodies to segregate. The authors show convincingly that functional SUN1 is required for male gamete formation and subsequent oocyst development.

      The authors identified several SUN1-interacting proteins, thus providing information about the nuclear membrane bridging machinery.

      Strengths:

      The authors have used state of the art imaging, genetic manipulation and immunoprecipitation approaches.

      Weaknesses:

      Technical limitations of some of the methods used make it difficult to interpret some of the micrographs.

      From studies in other organisms, a protein called KASH is a critical component of the bridging complex (LINC). That is, KASH links SUN1 to the outer nuclear membrane. The authors undertook a gene sequence analysis that reveals that Plasmodium lacks a KASH homologue. Thus, further work is needed to identify the functional equivalent of KASH, to understand the bridging machinery in Plasmodium.

      Comments on revised version:

      The authors have addressed the comments and suggestions that I provided as part of a Review Commons assessment.

    3. Reviewer #2 (Public review):

      Zeeshan et al. investigate the function of the protein SUN1, a proposed nuclear envelope protein linking nuclear and cytoplasmic cytoskeleton, during the rapid male gametogenesis of the rodent malaria parasite Plasmodium berghei. They reveal that SUN1 localises to the nuclear envelope (NE) in male and female gametes and show that the male NE has unexpectedly high dynamics during the rapid process of gametogenesis. Using expansion microscopy, the authors find that SUN1 is enriched at the neck of the bipartite MTOC that links the intranuclear spindle to the basal bodies of the cytoplasmic axonemes. Upon deletion of SUN1, the basal bodies of the eight axonemes fail to segregate, no spindle is formed, and emerging gametes are anucleated, leading to a complete block in transmission. By interactomics the authors identify a divergent allantoicase-like protein, ALLAN, as a main interaction partner of SUN1 and further show that ALLAN deletion largely phenocopies the effect of SUN1.

      Overall, the authors use an extensive array of fluorescence and electron microscopy techniques as well as interactomics to convincingly demonstrate that SUN1 and ALLAN play a role in maintaining the structural integrity of the bipartite MTOC during the rapid rounds of endomitosis in male gametogenesis.

      Two suggestions for improvement of the work remain:

      (1) Lipidomic analysis of WT and SUN1-knockout gametocytes before and after activation resulted in only minor changes in some lipid species. Without statistical analysis, it remains unclear if these changes are statistically significant and not rather due to expected biological variability. While the authors clearly toned down their conclusions in the revised manuscript, some phrasings in the results and the discussion still suggest that gametocyte activation and/or SUN1-knockout affects lipid composition. Similarly, some phrases suggest that SUN1 is responsible for the observed loops and folds in the NE and that SUN1 KO affects the NE dynamics. Currently, I do not think that the data supports these statements.

      (2) It is interesting to note that ALLAN has a much more specific localisation to basal bodies than SUN1, which localises to the entire nuclear envelope. Knockout of ALLAN also exhibits a milder (but still striking) phenotype than knockout of SUN1. These observations suggest that SUN1 has additional roles in male gametogenesis besides its interaction with ALLAN, which could be discussed a bit more.

      This study uses extensive microscopy and genetics to characterise an unusual SUN1-ALLAN complex, thus providing new insights into the molecular events during Plasmodium male gametogenesis, especially how the intranuclear events (spindle formation and mitosis) are linked to the cytoplasmic separation of the axonemes. The characterisation of the mutants reveals an interesting phenotype, showing that SUN1 and ALLAN are localised to and maintain the neck region of the bipartite MTOC. The authors here confirm and expand the previous knowledge about SUN1 in P. berghei, adding more detail to its localisation and dynamics, and further characterise the interaction partner ALLAN. Given the evolutionary divergence of Plasmodium, these results are interesting not only for parasitologists, but also for more general cell biologists.

    1. eLife Assessment

      In this valuable study, the authors used rats to determine the receptor for a food-related perception that has been characterized in humans. The data are solid in terms of methods and analysis: the data show that this stimulus (ornithine) has some additive effects in terms of increasing preference and taste response in rats when it is mixed with other more common taste stimuli. Therefore, the combinations of experiments generally support (but do not conclusively prove) the hypothesis that the "kokumi" taste effect elicited by this stimulus in humans may be mediated by the specific receptor examined in the study.

    2. Reviewer #1 (Public review):

      Summary:

      This paper contains what could be described as a "classic" approach towards evaluating a novel taste stimulus in an animal model, including standard behavioral tests (some with nerve transections), taste nerve physiology, and immunocytochemistry of taste cells of the tongue. The stimulus being tested is ornithine, from a class of stimuli called "kokumi" (in terms of human taste); these kokumi stimuli appear to enhance other canonical tastes, increasing what are essentially hedonic attributes of other stimuli. The mechanism for ornithine detection is thought to be GPRC6A receptors expressed in taste cells. The authors showed evidence for this in an earlier paper with mice; this paper evaluates ornithine taste in a rat model, and comes to a similar conclusion, albeit with some small differences between the two rodent species.

      Strengths:

      The data show effects of ornithine on taste/intake in laboratory rats: In two-bottle and briefer intake tests, adding ornithine results in higher intake of most, but not all, stimuli tested. Bilateral chorda tympani (CT) nerve cuts or the addition of GPRC6A antagonists decreased or eliminated these effects. Ornithine also evoked responses by itself in the CT nerve, but mainly at higher concentrations; at lower concentrations it potentiated the response to monosodium glutamate. Finally, immunocytochemistry of taste cell expression indicated that GPRC6A was expressed predominantly in the anterior tongue, and co-localized (to a small extent) with only IP3R3, indicative of expression in a subset of type II taste receptor cells.

      Weaknesses:

      As the authors are aware, it is difficult to assess a complex human taste with complex attributes, such as kokumi, in an animal model. In these experiments they attempt to uncover mechanistic insights about how ornithine potentiates other stimuli by using a variety of established experimental approaches in rats. They partially succeed by finding evidence that GPRC6A may mediate effects of ornithine when it is used at lower concentrations. In the revisions they have scaled back their interpretations accordingly. A supplementary experiment measuring certain aspects of the effects of ornithine added to Miso soup in human subjects is included for the express purpose of establishing that the kokumi sensation of a complex solution is enhanced by ornithine. This (supplementary) experiment was conducted with a small sample size, and though perhaps useful, these preliminary results do not align particularly well with the animal experiments. It would be helpful to further explore human taste of ornithine in a larger and better-controlled study.

    3. Reviewer #2 (Public review):

      Summary:

      The authors used rats to determine the receptor for a food-related perception (kokumi) that has been characterized in humans. They employ a combination of behavioral, electrophysiological, and immunohistochemical results to support their conclusion that ornithine-mediated kokumi effects are mediated by the GPRC6A receptor. They complemented the rat data with some human psychophysical data. I find the results intriguing, but believe that the authors overinterpret their data.

      Strengths:

      The authors provide compelling evidence that ornithine enhances the palatability of several chemical stimuli (i.e., IMP, MSG, MPG, Intralipos, sucrose, NaCl, quinine). Ornithine also increases CT nerve responses to MSG. Additionally, the authors provide evidence that the effects of ornithine are mediated by GPRC6A, a G-protein-coupled receptor family C group 6 subtype A, and that this receptor is expressed primarily in fungiform taste buds. Taken together, these results indicate that ornithine enhances the palatability of multiple taste stimuli in rats, and that the enhancement is mediated, at least in part, within fungiform taste buds. This finding could stand on its own. The question of whether ornithine produces these effects by eliciting kokumi-like perceptions (see below) should be presented as speculation in the Discussion section.

      Weaknesses:

      I am still unconvinced that the measurements in rats reflect the "kokumi" taste percept described in humans. The authors conducted long-term preference tests, 10-min avidity tests and whole chorda tympani (CT) nerve recordings. None of these procedures specifically model features of "kokumi" perception in humans, which (according to the authors) include increasing "intensity of whole complex tastes (rich flavor with complex tastes), mouthfulness (spread of taste and flavor throughout the oral cavity), and persistence of taste (lingering flavor)." While it may be possible to develop behavioral assays in rats (or mice) that effectively model kokumi taste perception in humans, the authors have not made any effort to do so. As a result, I do not think that the rat data provide support for the main conclusion of the study--that "ornithine is a kokumi substance and GPRC6A is a novel kokumi receptor."

      Why are the authors hypothesizing that the primary impacts of ornithine are on the peripheral taste system? While the CT recordings provide support for peripheral taste enhancement, they do not rule out the possibility of additional central enhancement. Indeed, based on the definition of human kokumi described above, it is likely that the effects of kokumi stimuli in humans are mediated at least in part by the central flavor system.

      The authors include (in the supplemental data section) a pilot study that examined the impact of ornithine on a variety of subjective measures of flavor perception in humans. The presence of this pilot study within the larger rat study does not really make sense. If the human studies are so important, as the authors state, then why did the authors relegate them to the supplemental data section? Usually one places background and negative findings in this section of a paper. Accordingly, I recommend that the human data be published in a separate article.

    4. Reviewer #3 (Public review):

      Summary:

      In this study the authors set out to investigate whether GPRC6A mediates kokumi taste initiated by the amino acid L-ornithine. They used Wistar rats, a standard laboratory strain, as the primary model and also performed an informative taste test in humans, in which miso soup was supplemented with various concentrations of L-ornithine. The findings are valuable and overall the evidence is solid. L-Ornithine should be considered to be a useful test substance in future studies of kokumi taste and the class C G protein coupled receptor known as GPRC6A (C6A) along with its homolog, the calcium-sensing receptor (CaSR) should be considered candidate mediators of kokumi taste. The researchers confirmed in rats their previous work on Ornithine and C6A in mice (Mizuta et al Nutrients 2021).

      Strengths:

      The overall experimental design is solid based on two-bottle preference tests in rats. After determining the optimal concentration for L-Ornithine (1 mM) in the presence of MSG, it was added to various tastants including: inosine 5'-monophosphate; monosodium glutamate (MSG); mono-potassium glutamate (MPG); intralipos (a soybean oil emulsion); sucrose; sodium chloride (NaCl; salt); citric acid (sour) and quinine hydrochloride (bitter). Robust effects of ornithine were observed in the cases of IMP, MSG, MPG and sucrose; and little or no effects were observed in the cases of sodium chloride, citric acid and quinine HCl. The researchers then focused on the preference for Ornithine-containing MSG solutions. Inclusion of the C6A inhibitors Calindol (0.3 mM but not 0.06 mM) or the gallate derivative EGCG (0.1 mM but not 0.03 mM) eliminated the preference for solutions that contained Ornithine in addition to MSG. The researchers next performed transections of the chorda tympani nerves (with sham operation controls) in anesthetized rats to identify a role of the chorda tympani branches of the facial nerves (cranial nerve VII) in the preference for Ornithine-containing MSG solutions. This finding implicates the anterior half to two-thirds of the tongue in ornithine-induced kokumi taste. They then used electrical recordings from intact chorda tympani nerves in anesthetized rats to demonstrate that ornithine enhanced MSG-induced responses following the application of tastants to the anterior surface of the tongue. They went on to show that this enhanced response was insensitive to amiloride, selected to inhibit 'salt tastant' responses mediated by the epithelial Na+ channel, but eliminated by Calindol. Finally, they performed immunohistochemistry on sections of rat tongue demonstrating C6A-positive spindle-shaped cells in fungiform papillae that partially overlapped in their distribution with the IP3 type-3 receptor, used as a marker of Type-II cells, but not with (i) gustducin, the G protein partner of Tas1 receptors (T1Rs), used as a marker of a subset of type-II cells; or (ii) 5-HT (serotonin) and Synaptosome-associated protein 25 kDa (SNAP-25) used as markers of Type-III cells.

      At least two other receptors in addition to C6A might mediate taste responses to ornithine: (i) the CaSR, which binds and responds to multiple L-amino acids (Conigrave et al, PNAS 2000), and which has been previously reported to mediate kokumi taste (Ohsu et al., JBC 2010) as well as responses to Ornithine (Shin et al., Cell Signaling 2020); and (ii) T1R1/T1R3 heterodimers which also respond to L-amino acids and exhibit enhanced responses to IMP (Nelson et al., Nature 2001). These alternatives are appropriately discussed and, taken together, the experimental results favor the authors' interpretation that C6A mediates the Ornithine responses. The authors provide preliminary data in Suppl. 3 for the possibility of co-expression of C6A with the CaSR.

      In the Discussion, the authors consider the potential effects of kokumi substances on the threshold concentrations of key tastants such as glutamate, arguing that extension of taste distribution to additional areas of the mouth (previously referred to as 'mouthfulness') and persistence of taste/flavor responses (previously referred to as 'continuity') could arise from a reduction in the threshold concentrations of umami and other substances that evoke taste responses. This concept may help to design future experiments.

      Weaknesses:

      The authors point out that animal models pose some difficulties of interpretation in studies of taste and raise the possibility in the Discussion that umami substances may enhance the taste response to ornithine (Line 271, Page 9).

      The status of one of the compounds used as an inhibitor of C6A, the gallate derivative EGCG, as a potential inhibitor of the CaSR or T1R1/T1R3 is unknown. It would have been helpful to show that a specific inhibitor of the CaSR failed to block the ornithine response.

      It would have been helpful to include a positive control kokumi substance in the two bottle preference experiment (e.g., one of the known gamma glutamyl peptides such as gamma-glu-Val-Gly or glutathione), to compare the relative potencies of the control kokumi compound and Ornithine, and to compare the sensitivities of the two responses to C6A and CaSR inhibitors.

    5. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This paper contains what could be described as a "classic" approach towards evaluating a novel taste stimulus in an animal model, including standard behavioral tests (some with nerve transections), taste nerve physiology, and immunocytochemistry of taste cells of the tongue. The stimulus being tested is ornithine, from a class of stimuli called "kokumi" (in terms of human taste); these kokumi stimuli appear to enhance other canonical tastes, increasing what are essentially hedonic attributes of other stimuli. The mechanism for ornithine detection is thought to be GPRC6A receptors expressed in taste cells. The authors showed evidence for this in an earlier paper with mice; this paper evaluates ornithine taste in a rat model, and comes to a similar conclusion, albeit with some small differences between the two rodent species.

      Strengths:

      The data show effects of ornithine on taste/intake in laboratory rats: In two-bottle and briefer intake tests, adding ornithine results in higher intake of most, but not all, stimuli tested. Bilateral chorda tympani (CT) nerve cuts or the addition of GPRC6A antagonists decreased or eliminated these effects. Ornithine also evoked responses by itself in the CT nerve, but mainly at higher concentrations; at lower concentrations it potentiated the response to monosodium glutamate. Finally, immunocytochemistry of taste cell expression indicated that GPRC6A was expressed predominantly in the anterior tongue, and co-localized (to a small extent) with only IP3R3, indicative of expression in a subset of type II taste receptor cells.

      Weaknesses:

      As the authors are aware, it is difficult to assess a complex human taste with complex attributes, such as kokumi, in an animal model. In these experiments they attempt to uncover mechanistic insights about how ornithine potentiates other stimuli by using a variety of established experimental approaches in rats. They partially succeed by finding evidence that GPRC6A may mediate effects of ornithine when it is used at lower concentrations. In the revision they have scaled back their interpretations accordingly. A supplementary experiment measuring certain aspects of the effects of ornithine added to Miso soup in human subjects is included for the express purpose of establishing that the kokumi sensation of a complex solution is enhanced by ornithine; however, they do not use any such complex solutions in the rat studies. Moreover, the sample size of the human experiment is (still) small - it really doesn't belong in the same manuscript with the rat studies.

      Despite the reviewer’s suggestion, we would like to include the human sensory experiment. Our rationale is that we must first demonstrate that the kokumi of miso soup is enhanced by the addition of ornithine, which is then followed by basic animal experiments to investigate the underlying mechanisms of kokumi in humans.

      We did not present the additive effects of ornithine on miso soup in the present rat study because our previous companion paper (Fig. 1B in Mizuta et al., 2021, Ref. #26) already confirmed that miso soup supplemented with 3 mM L-ornithine (but not D-ornithine) was statistically significantly (P < 0.001) preferred to plain miso soup by mice.

      Furthermore, we believe that our sample size (n = 22) is comparable to those employed in other studies. For example, the representative kokumi studies by Ohsu et al. (Ref. #9), Ueda et al. (Ref. #10), Shibata et al. (Ref. #20), Dunkel et al. (Ref. #37), and Yang et al. (Ref. #44) used sample sizes of 20, 19, 17, 9, and 15, respectively.

      Reviewer #2 (Public review):

      Summary:

      The authors used rats to determine the receptor for a food-related perception (kokumi) that has been characterized in humans. They employ a combination of behavioral, electrophysiological, and immunohistochemical results to support their conclusion that ornithine-mediated kokumi effects are mediated by the GPRC6A receptor. They complemented the rat data with some human psychophysical data. I find the results intriguing, but believe that the authors overinterpret their data.

      Strengths:

      The authors provide compelling evidence that ornithine enhances the palatability of several chemical stimuli (i.e., IMP, MSG, MPG, Intralipos, sucrose, NaCl, quinine). Ornithine also increases CT nerve responses to MSG. Additionally, the authors provide evidence that the effects of ornithine are mediated by GPRC6A, a G-protein-coupled receptor family C group 6 subtype A, and that this receptor is expressed primarily in fungiform taste buds. Taken together, these results indicate that ornithine enhances the palatability of multiple taste stimuli in rats and that the enhancement is mediated, at least in part, within fungiform taste buds. This is an important finding that could stand on its own. The question of whether ornithine produces these effects by eliciting kokumi-like perceptions (see below) should be presented as speculation in the Discussion section.

      Weaknesses:

      I am still unconvinced that the measurements in rats reflect the "kokumi" taste percept described in humans. The authors conducted long-term preference tests, 10-min avidity tests and whole chorda tympani (CT) nerve recordings. None of these procedures specifically model features of "kokumi" perception in humans, which (according to the authors) include increasing "intensity of whole complex tastes (rich flavor with complex tastes), mouthfulness (spread of taste and flavor throughout the oral cavity), and persistence of taste (lingering flavor)." While it may be possible to develop behavioral assays in rats (or mice) that effectively model kokumi taste perception in humans, the authors have not made any effort to do so. As a result, I do not think that the rat data provide support for the main conclusion of the study--that "ornithine is a kokumi substance and GPRC6A is a novel kokumi receptor."

      Kokumi can be assessed in humans, as demonstrated by the enhanced kokumi perception observed when miso soup is supplemented with ornithine (Fig. S1). Currently, we do not have a method to measure the same kokumi perception in animals. However, in the two-bottle preference test, our previous companion paper (Fig. 1B in Mizuta et al. 2021, Ref. #26) confirmed that miso soup supplemented with 3 mM L-ornithine (but not D-ornithine) was statistically significantly (P < 0.001) preferred over plain miso soup by mice.

      Of the three attributes of kokumi perception in humans, the “intensity of whole complex tastes (rich flavor with complex tastes)” was partly demonstrated in the present rat study. In contrast, “mouthfulness (the spread of taste and flavor throughout the oral cavity)” could not be directly detected in animals and had to be inferred in the Discussion. “Persistence of taste (lingering flavor)” was evident at least in the chorda tympani responses; however, because the tongue was rinsed 30 seconds after the onset of stimulation, the duration of the response was not fully recorded.

      It is well accepted in sensory physiology that the stronger the stimulus, the larger the tonic response—and consequently, the longer it takes for the response to return to baseline. For example, Kawasaki et al. (2016, Ref. #45) clearly showed that the duration of sensation increased proportionally with the concentration of MSG, lactic acid, and NaCl in human sensory tests. The essence of this explanation has been incorporated into the Discussion (p. 12).

      Why are the authors hypothesizing that the primary impacts of ornithine are on the peripheral taste system? While the CT recordings provide support for peripheral taste enhancement, they do not rule out the possibility of additional central enhancement. Indeed, based on the definition of human kokumi described above, it is likely that the effects of kokumi stimuli in humans are mediated at least in part by the central flavor system.

      We agree with the reviewer’s comment. Our CT recordings indicate that the effects of kokumi stimuli on taste enhancement occur primarily at the peripheral taste organs. The resulting sensory signals are then transmitted to the brain, where they are processed by the central gustatory and flavor systems, ultimately giving rise to kokumi attributes. This central involvement in kokumi perception is discussed on page 12. Although kokumi substances exert their effects at low concentrations—levels at which the substance itself (e.g., ornithine) does not become more favorable or (in the case of γ-Glu-Val-Gly) exhibits no distinct taste—we cannot rule out the possibility that even faint taste signals from these substances are transmitted to the brain and interact with other taste modalities.

      The authors include (in the supplemental data section) a pilot study that examined the impact of ornithine on a variety of subjective measures of flavor perception in humans. The presence of this pilot study within the larger rat study does not really make sense. While I agree with the authors that there is value in conducting parallel tests in both humans and rodents, I think that this can only be done effectively when the measurements in both species are the same. For this reason, I recommend that the human data be published in a separate article.

      Despite the reviewer’s suggestion, we intend to include the human sensory experiment. Our rationale is that we must first demonstrate that the kokumi of miso soup is enhanced by the addition of ornithine, and then follow up with basic animal experiments to investigate the potential underlying mechanisms of kokumi in humans.

      In our previous companion paper (Fig. 1B in Mizuta et al., 2021, Ref. #26), we confirmed with statistical significance (P < 0.001) that mice preferred miso soup supplemented with 3 mM L-ornithine (but not D-ornithine) over plain miso soup. However, as explained in our response to Reviewer #2’s first concern (in the Public review), it is difficult to measure two of the three kokumi attributes—aside from the “intensity of whole complex tastes (rich flavor with complex tastes)”—in animal models.

      The authors indicated on several occasions (e.g., see Abstract) that ornithine produced "synergistic" effects on the CT nerve response to chemical stimuli. "Synergy" is used to describe a situation where two stimuli produce an effect that is greater than the sum of the response to each stimulus alone (i.e., 2 + 2 = 5). As far as I can tell, the CT recordings in Fig. 3 do not reflect a synergism.

      We appreciate your comments regarding the definition of synergy. In Fig. 5 (not Fig. 3), please note the difference in the scaling of the ordinate between Fig. 5D (ornithine responses) and Fig. 5E (MSG responses). When both responses are presented on the same scale, it becomes evident that the response to 1 mM ornithine is negligibly small compared to the MSG response, which clearly indicates that the response to the mixture of MSG and 1 mM ornithine exceeds the sum of the individual responses to MSG and 1 mM ornithine. Therefore, we have described the effect as “synergistic” rather than “additive.” The same observation applies to the mouse experiments in our previous companion paper (Fig. 8 in Mizuta et al. 2021, Ref. #26), where synergistic effects are similarly demonstrated by graphical representation. We have also added the following sentence to the legend of Fig. 5:

      “Note the different scaling of the ordinate in (D) and (E).”
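
      The additive-versus-synergistic distinction discussed above can be sketched as a simple comparison of a mixture response with the sum of the single-stimulus responses. This is a minimal illustration only: the function name `classify_interaction`, the `tol` parameter, and the response values are hypothetical and are not taken from the study's recordings.

```python
def classify_interaction(resp_a, resp_b, resp_mix, tol=0.0):
    """Compare a mixture response with the sum of the two single-stimulus responses.

    All arguments are response magnitudes on the same scale (arbitrary units).
    `tol` is a hypothetical margin standing in for measurement variability.
    """
    expected_additive = resp_a + resp_b
    if resp_mix > expected_additive + tol:
        return "synergistic"   # mixture exceeds the sum of its parts
    if resp_mix < expected_additive - tol:
        return "sub-additive"  # mixture falls short of the sum
    return "additive"          # mixture matches the sum within tolerance


# Hypothetical relative magnitudes, for illustration only (not measured data).
msg_alone = 1.0
ornithine_alone = 0.05
mixture = 1.6
print(classify_interaction(msg_alone, ornithine_alone, mixture))
```

      The comparison is only meaningful when all three responses are expressed on the same scale, which is why differing ordinate scales between panels matter when reading such figures. A real analysis would replace the fixed tolerance with a statistical test across animals.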

      Reviewer #3 (Public review):

      Summary:

      In this study the authors set out to investigate whether GPRC6A mediates kokumi taste initiated by the amino acid L-ornithine. They used Wistar rats, a standard laboratory strain, as the primary model and also performed an informative taste test in humans, in which miso soup was supplemented with various concentrations of L-ornithine. The findings are valuable and overall the evidence is solid. L-Ornithine should be considered to be a useful test substance in future studies of kokumi taste and the class C G protein coupled receptor known as GPRC6A (C6A) along with its homolog, the calcium-sensing receptor (CaSR) should be considered candidate mediators of kokumi taste. The researchers confirmed in rats their previous work on Ornithine and C6A in mice (Mizuta et al Nutrients 2021).

      Strengths:

      The overall experimental design is solid based on two-bottle preference tests in rats. After determining the optimal concentration for L-Ornithine (1 mM) in the presence of MSG, it was added to various tastants including: inosine 5'-monophosphate; monosodium glutamate (MSG); mono-potassium glutamate (MPG); intralipos (a soybean oil emulsion); sucrose; sodium chloride (NaCl; salt); citric acid (sour) and quinine hydrochloride (bitter). Robust effects of ornithine were observed in the cases of IMP, MSG, MPG and sucrose; and little or no effects were observed in the cases of sodium chloride, citric acid and quinine HCl. The researchers then focused on the preference for Ornithine-containing MSG solutions. Inclusion of the C6A inhibitors Calindol (0.3 mM but not 0.06 mM) or the gallate derivative EGCG (0.1 mM but not 0.03 mM) eliminated the preference for solutions that contained Ornithine in addition to MSG. The researchers next performed transections of the chorda tympani nerves (with sham operation controls) in anesthetized rats to identify a role of the chorda tympani branches of the facial nerves (cranial nerve VII) in the preference for Ornithine-containing MSG solutions. This finding implicates the anterior half to two-thirds of the tongue in ornithine-induced kokumi taste. They then used electrical recordings from intact chorda tympani nerves in anesthetized rats to demonstrate that ornithine enhanced MSG-induced responses following the application of tastants to the anterior surface of the tongue. They went on to show that this enhanced response was insensitive to amiloride, selected to inhibit 'salt tastant' responses mediated by the epithelial Na+ channel, but eliminated by Calindol. Finally, they performed immunohistochemistry on sections of rat tongue demonstrating C6A-positive spindle-shaped cells in fungiform papillae that partially overlapped in their distribution with the IP3 type-3 receptor, used as a marker of Type-II cells, but not with (i) gustducin, the G protein partner of Tas1 receptors (T1Rs), used as a marker of a subset of type-II cells; or (ii) 5-HT (serotonin) and Synaptosome-associated protein 25 kDa (SNAP-25) used as markers of Type-III cells.

      At least two other receptors in addition to C6A might mediate taste responses to ornithine: (i) the CaSR, which binds and responds to multiple L-amino acids (Conigrave et al, PNAS 2000), and which has been previously reported to mediate kokumi taste (Ohsu et al., JBC 2010) as well as responses to Ornithine (Shin et al., Cell Signaling 2020); and (ii) T1R1/T1R3 heterodimers which also respond to L-amino acids and exhibit enhanced responses to IMP (Nelson et al., Nature 2001). These alternatives are appropriately discussed and, taken together, the experimental results favor the authors' interpretation that C6A mediates the Ornithine responses. The authors provide preliminary data in Suppl. 3 for the possibility of co-expression of C6A with the CaSR.

      Weaknesses:

      The authors point out that animal models pose some difficulties of interpretation in studies of taste and raise the possibility in the Discussion that umami substances may enhance the taste response to ornithine (Line 271, Page 9).

      Ornithine and umami substances interact to produce synergistic effects in both directions—ornithine enhances responses to umami substances, and vice versa. These effects may depend on the concentrations used, as described in the Discussion (pp. 9–10). Further studies are required to clarify the precise nature of this interaction.

      One issue that is not addressed, and could be usefully addressed in the Discussion, relates to the potential effects of kokumi substances on the threshold concentrations of key tastants such as glutamate. Thus, an extension of taste distribution to additional areas of the mouth (previously referred to as 'mouthfulness') and persistence of taste/flavor responses (previously referred to as 'continuity') could arise from a reduction in the threshold concentrations of umami and other substances that evoke taste responses.

      Thank you for this important suggestion. If ornithine reduces the threshold concentrations of tastants—including glutamate—and enhances their suprathreshold responses, then adding ornithine may activate additional taste cells. This effect could explain kokumi attributes such as an “extension of taste distribution” and possibly the “persistence of responses.” As shown in Fig. 2, the lowest concentrations used for each taste stimulus are near or below the thresholds, which indicates that threshold concentrations are reduced—especially for MSG and MPG. We have incorporated this possibility into the Discussion as follows (p.12):

      “Kokumi substances may reduce the threshold concentrations as well as they increase the suprathreshold responses of tastants. Once the threshold concentrations are lowered, additional taste cells in the oral cavity become activated, and this information is transmitted to the brain. As a result, the brain perceives this input as coming from a wider area of the mouth.”

      The status of one of the compounds used as an inhibitor of C6A, the gallate derivative EGCG, as a potential inhibitor of the CaSR or T1R1/T1R3 is unknown. It would have been helpful to show that a specific inhibitor of the CaSR failed to block the ornithine response.

      Thank you for this important comment. We attempted to identify a specific inhibitor of CaSR. Although we considered using NPS-2143—a commonly used CaSR inhibitor—it is known to also inhibit GPRC6A. We agree that using a specific CaSR inhibitor would be beneficial and plan to pursue this in future studies.

      It would have been helpful to include a positive control kokumi substance in the two bottle preference experiment (e.g., one of the known gamma glutamyl peptides such as gamma-glu-Val-Gly or glutathione), to compare the relative potencies of the control kokumi compound and Ornithine, and to compare the sensitivities of the two responses to C6A and CaSR inhibitors.

      We agree with this comment. In retrospect, it may have been advantageous to directly compare the potencies of CaSR and GPRC6A agonists in enhancing taste preferences—and to evaluate the sensitivity of these preferences to CaSR and GPRC6A antagonists. However, we did not include γ-Glu-Val-Gly in the present study because we have already reported its supplementation effects on the ingestion of basic taste solutions in rats using the same methodology in a separate paper (Yamamoto and Mizuta, 2022, Ref. #25). The results from both studies are compared in the Discussion (p. 11).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Major:

      I am not convinced by the Author's arguments for including the human data. I appreciate their efforts in adding a few (5) subjects and improving the description, but it still feels like it is shoehorned into this paper, and would be better published as a different manuscript.

      This human study is short, but it is complete rather than preliminary. The rationale for us to include the human data as supplementary information is shown in responses to the reviewer’s Public review.

      Minor concerns:

      Page 3 paragraph 1: Suggest "contributing to palatability".

      Thank you for this suggestion. We have rewritten the text as follows:

      “…, the brain further processes these sensations to evoke emotional responses, contributing to palatability or unpleasantness”.

      Page 4 paragraph 2: The text still assumes that "kokumi" is a meaningful descriptor for what rodents experience. Re-wording the following sentence like this could help:

      "Neuroscientific studies in mice and rats provide evidence that glutathione and γ-Glu-Val-Gly activate CaSRs, and modify behavioral responses to other tastants in a way that may correspond to kokumi taste as experienced by humans. However, to our..."

      Or something similar.

      Thank you for this suggestion. We have rewritten the sentence according to your suggestion as follows:

      “Neuroscientific studies (23,25,30) in mice and rats provide evidence that glutathione and γ-Glu-Val-Gly activate CaSRs, and modify behavioral responses to other tastants in a way that may correspond to kokumi as experienced by humans”.

      Page 7 paragraph 1 - put the concentrations of Calindol and EGCG used (in the physiology exps) in the text.

      We have added the concentrations: “300 µM calindol and 100 µM EGCG”.

      Reviewer #2 (Recommendations for the authors):

      I have included all of my recommendations in the public review section.

      Reviewer #3 (Recommendations for the authors):

      Although the definitions of 'thickness', 'mouthfulness' and 'continuity' have been revised very helpfully in the Introduction, 'mouthfulness' reappears at other points in the MS e.g., Page 4, Results, Line 3; Page 9, Line 3. It is best replaced by the new definition in these other locations too.

      We wish to clarify that our revised text stated, “…to clarify that kokumi attributes are inherently gustatory, in the present study we use the terms ‘intensity of whole complex tastes (rich flavor with complex tastes)’ instead of ‘thickness,’ ‘mouthfulness (spread of taste and flavor throughout the oral cavity)’ instead of ‘mouthfulness,’ and ‘persistence of taste (lingering flavor)’ instead of ‘continuity.’” The term “mouthfulness” was retained in our text, though we provided a more specific explanation. In the re-revised version, we have added “(spread of taste in the oral cavity)” immediately after “mouthfulness.”

      I doubt that many scientific readers will be familiar with the term 'intragemmal nerve fibres' (Page 8, Line 4). It is used appropriately but it would be helpful to briefly define/explain it.

      We have added an explanation as follows:

      “… intragemmal nerve fibers, which are nerve processes that extend directly into the structure of the taste bud to transmit taste signals from taste cells to the brain.”

      I previously pointed out the overlap between the CaSR's amino acid (AA) and gamma-glutamyl-peptide binding site. I was surprised by the authors' response which appeared to miss the point being made. It was based on the impacts of selected mutations in the receptor's Venus FlyTrap domain (Broadhead JBC 2011) on the responses to AAs and glutathione analogs. The significantly more active analog, S-methylglutathione is of additional interest because, like glutathione itself, it is present in mammalian body fluids. My apologies to the authors for not more carefully explaining this point.

      Thank you for this comment. Both CaSR and GPRC6A are recognized as broad-spectrum amino acid sensors; however, their agonist profiles differ. Aromatic amino acids preferentially activate CaSR, whereas basic amino acids tend to activate GPRC6A. For instance, among basic amino acids, ornithine is a potent and specific activator of GPRC6A, while γ-Glu-Val-Gly in addition to amino acids is a high-potency activator of CaSR. It remains unclear how effectively ornithine activates CaSR and whether γ-glutamyl peptides also activate GPRC6A. These questions should be addressed in future studies.

    1. eLife Assessment

      This valuable study uses consensus-independent component analysis to highlight transcriptional components (TC) in high-grade serous ovarian cancers (HGSOC). The study presents a convincing preliminary finding by identifying a TC linked to synaptic signaling that is associated with shorter overall survival in HGSOC patients, highlighting the potential role of neuronal interactions in the tumour microenvironment. This finding is corroborated by comparing spatially resolved transcriptomics in a small-scale study; a weakness is that the work is descriptive and non-mechanistic, and requires experimental validation.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript explores the transcriptional landscape of high-grade serous ovarian cancer (HGSOC) using consensus-independent component analysis (c-ICA) to identify transcriptional components (TCs) associated with patient outcomes. The study analyzes 678 HGSOC transcriptomes, supplemented with 447 transcriptomes from other ovarian cancer types and noncancerous tissues. By identifying 374 TCs, the authors aim to uncover subtle transcriptional patterns that could serve as novel drug targets. Notably, a transcriptional component linked to synaptic signaling was associated with shorter overall survival (OS) in patients, suggesting a potential role for neuronal interactions in the tumor microenvironment. Given notable weaknesses like lack of validation cohort or validation using other platforms (other than the 11 samples with ST), the data is considered highly descriptive and preliminary.

      The study reveals significant findings by identifying a transcriptional component (TC121) associated with synaptic signaling, which is linked to shorter survival in patients with high-grade serous ovarian cancer, highlighting the potential role of neurons in the tumor microenvironment. However, the evidence could be strengthened by experimental validation to confirm the functional roles of key genes within TC121 and further exploration of its spatial aspects, including deeper analysis of synaptic and other neuronal gene expression.

      Strengths:

      Innovative Methodology:
      The use of c-ICA to dissect bulk transcriptomes into independent components is a novel approach that allows for the identification of subtle transcriptional patterns that may be overshadowed in traditional analyses.

      Comprehensive Data Integration:
      The study integrates a large dataset from multiple public repositories, enhancing the robustness of the findings. The inclusion of spatially resolved transcriptomes adds a valuable dimension to the analysis.

      Clinical Relevance:
      The identification of a synaptic signaling-related TC associated with poor prognosis highlights a potential new avenue for therapeutic intervention, emphasizing the role of the tumor microenvironment in cancer progression.

      Weaknesses:

      Mechanistic Insights:<br /> While the study identifies TCs associated with survival, it provides limited mechanistic insights into how these components influence cancer progression. Further experimental validation is necessary to elucidate the underlying biological processes.

      Generalizability:<br /> The findings are primarily based on transcriptomic data from HGSOC. It remains unclear how these results apply to other subtypes of ovarian cancer or different cancer types.

      Innovative Methodology:<br /> Requires more validation using different platforms (IHC) to validate the performance of this bulk derived data. Also, the lack of control on data quality is a concern.

      Clinical Application:<br /> Although the study suggests potential drug targets, the translation of these findings into clinical practice is not addressed. Probably given lack of some QA/QC procedures it'll be hard to translate these results. Future studies should focus on validating these targets in clinical settings.

    3. Reviewer #2 (Public review):

      Summary:

      Consensus-independent component analysis and closely related methods have previously been used to reveal components of transcriptomic data which are not captured by principal component or gene-gene coexpression analyses.

      Here, the authors asked whether applying consensus-independent component analysis (c-ICA) to published high-grade serous ovarian cancer (HGSOC) microarray-based transcriptomes would reveal subtle transcriptional patterns which are not captured by existing molecular omics classifications of HGSOC.

      Statistical associations of these (hitherto masked) transcriptional components with prognostic outcomes in HGSOC would lead to additional insights into underlying mechanisms and, coupled with corroborating evidence from spatial transcriptomics, are proposed for further investigation.

      This approach is complementary to existing transcriptomics classifications of HGSOC.

      The authors have previously applied the same approach in colorectal carcinoma (for example, Knapen et al. (2024) Commun. Med).

      Strengths:

      Overall, this study describes a solid data-driven description of c-ICA-derived transcriptional components that the authors identified in HGSOC microarray transcriptomics data, supported by detailed methods and supplementary documentation.

      The biological interpretation of transcriptional components is convincing based on (data-driven) permutation analysis and a suite of analyses of association with copy-number, gene sets, and prognostic outcomes.<br /> The resulting annotated transcriptional components have been made available in a searchable online format.

      For the highlighted transcriptional component which has been annotated as related to synaptic signalling, the detection of the transcriptional component among 11 published spatial transcriptomics samples from ovarian cancers is compelling and supports the need for further mechanistic follow-up.

      Further comments:

      This revised version includes a suite of comparisons between the c-ICA-derived components and existing published transcriptomic/genomic-based classifications of ovarian cancers. Newly described components will require experimental validation, as acknowledged by the authors.

      Here, the authors primarily interpret the c-ICA transcriptional components as a deconvolution of bulk transcriptomics due to the presence of cells from tumour cells and the tumour microenvironment.<br /> In this revised version, the authors additionally investigate their TC scores in single cells from a published HGSOC single-cell RNAseq dataset, highlighting examples of TC scores within and between cell types.

      c-ICA is not explicitly a deconvolution method with respect to cell types: the transcriptional components do not necessarily correspond to distinct cell types, and may reflect differential dysregulation within a cell type. This application of c-ICA for the purpose of data-driven deconvolution of cell populations is distinct from other deconvolution methods which explicitly use a prior cell signature matrix.

    4. Author response:

      The following is the authors’ response to the original reviews

      eLife Assessment

      This valuable study uses consensus-independent component analysis to highlight transcriptional components (TC) in high-grade serous ovarian cancers (HGSOC). The study presents a convincing preliminary finding by identifying a TC linked to synaptic signaling that is associated with shorter overall survival in HGSOC patients, highlighting the potential role of neuronal interactions in the tumour microenvironment. This finding is corroborated by comparing spatially resolved transcriptomics in a small-scale study; a weakness is in being descriptive, non-mechanistic, and requiring experimental validation.

      We sincerely thank the editors for their valuable and constructive feedback. We are grateful for the recognition of our findings and the importance of identifying transcriptional components in high-grade serous ovarian cancers.

      We acknowledge the editors’ observation regarding the descriptive nature of our study and its limited mechanistic depth. We agree that additional experimental validation would further strengthen our conclusions. We are planning and executing the experiments for a future study to provide mechanistic insights into the associations found in this study. In addition, recent reviews focused on the emerging field of cancer neuroscience emphasize that the field is still in its early stages, particularly with respect to a mechanistic understanding of the contributions of tumor-infiltrating nerves to tumor initiation and progression (Amit et al., 2024; Hwang et al., 2024). Nonetheless, we wish to emphasize that emerging mechanistic preclinical studies have demonstrated the influence of tumour-infiltrating nerves on disease progression (Allen et al., 2018; Balood et al., 2022; Darragh et al., 2024; Globig et al., 2023; Jin et al., 2022; Restaino et al., 2023; Zahalka et al., 2017). Several of these studies include contributions from our co-authors and feature in vitro and in vivo research on head and neck squamous cell carcinoma as well as high-grade serous ovarian carcinoma samples. This study further strengthens the preclinical work by showing, in patient data, the potential relevance of neuronal signaling to disease outcome.

      For instance, Restaino et al. (2023) demonstrated that substance P, released from tumour-infiltrating nociceptors, potentiates MAP kinase signaling in cancer cells, thereby driving disease progression. Crucially, this effect was shown to be reversible in vivo by blocking the substance P receptor (Restaino et al., 2023). These findings offer compelling evidence of the role of tumour innervation in cancer biology.

      Our current study in tumor samples of patients with high-grade serous ovarian cancer identifies a transcriptional component that is enriched for genes for which the protein is located in the synapse. We believe that the previously published mechanistic insights support our findings and suggest that this transcriptional component could serve as a valuable screening tool to identify innervated tumours based on bulk transcriptomes. Clinically, this information is highly relevant, as patients with innervated tumours may benefit from alternate therapeutic strategies targeting these innervations.

      Reviewer #1 (Public review)

      This manuscript explores the transcriptional landscape of high-grade serous ovarian cancer (HGSOC) using consensus-independent component analysis (c-ICA) to identify transcriptional components (TCs) associated with patient outcomes. The study analyzes 678 HGSOC transcriptomes, supplemented with 447 transcriptomes from other ovarian cancer types and noncancerous tissues. By identifying 374 TCs, the authors aim to uncover subtle transcriptional patterns that could serve as novel drug targets. Notably, a transcriptional component linked to synaptic signaling was associated with shorter overall survival (OS) in patients, suggesting a potential role for neuronal interactions in the tumour microenvironment. Given notable weaknesses like lack of validation cohort or validation using another platform (other than the 11 samples with ST), the data is considered highly descriptive and preliminary.

      Strengths:

      (1) Innovative Methodology:

      The use of c-ICA to dissect bulk transcriptomes into independent components is a novel approach that allows for the identification of subtle transcriptional patterns that may be overshadowed in traditional analyses.

      We thank the reviewer for recognizing the strengths and novelty of our study. We appreciate the positive feedback on using consensus-independent component analysis (c-ICA) to decompose bulk transcriptomes, which allowed us to detect subtle transcriptional signals often overlooked in traditional analyses.

      (2) Comprehensive Data Integration:

      The study integrates a large dataset from multiple public repositories, enhancing the robustness of the findings. The inclusion of spatially resolved transcriptomes adds a valuable dimension to the analysis.

      We thank the reviewer for recognizing the robustness of our study through comprehensive data integration. We appreciate the acknowledgment of our efforts to leverage a large, multi-source dataset, as well as the additional insights gained from spatially resolved transcriptomes. We consider that this integrative approach enhances the depth of our analysis and contributes to a more nuanced understanding of the tumour microenvironment.

      (3) Clinical Relevance:

      The identification of a synaptic signaling-related TC associated with poor prognosis highlights a potential new avenue for therapeutic intervention, emphasizing the role of the tumour microenvironment in cancer progression.

      We appreciate the recognition of the clinical implications of our findings. The identification of a synaptic signaling-related transcriptional component associated with poor prognosis underscores the potential for novel therapeutic targets within the tumour microenvironment. We agree that this insight could open new avenues for intervention and further highlights the role of neuronal interactions in cancer progression.

      Weaknesses:

      (1) Mechanistic Insights:

      While the study identifies TCs associated with survival, it provides limited mechanistic insights into how these components influence cancer progression. Further experimental validation is necessary to elucidate the underlying biological processes.

      We acknowledge the point regarding the limited mechanistic insights provided in our study. We agree that further experimental validation would significantly enhance our understanding of how the biological processes captured by these transcriptional components influence cancer progression. We are planning and executing the experiments for a future study to provide mechanistic insights into the associations found in this study.

      Our analyses were performed on publicly available bulk and spatially resolved expression profiles. To gain mechanistic insights in future studies, we plan to integrate spatial transcriptomic data with immunohistochemical analysis of the same tumour samples to validate our findings. Additionally, we have initiated efforts to set up in vitro co-cultures of neurons and ovarian cancer cells. These co-cultures will enable us to investigate how synaptic signaling impacts ovarian cancer cell behavior.

      (2) Generalizability:

      The findings are primarily based on transcriptomic data from HGSOC. It remains unclear how these results apply to other subtypes of ovarian cancer or different cancer types.

      To respond to this remark, we utilized survival data from Bolton et al. (2022) and TCGA to investigate associations between TC activity scores and overall survival of patients with ovarian clear cell carcinoma, the second most common subtype of epithelial ovarian cancer, and of patients with other cancer types, respectively. However, we acknowledge the limitations of TCGA survival data, as highlighted in the referenced article (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8726696/). Additionally, as shown in Figure 5, we provided evidence of TC121 activity across various cancer types, suggesting broader relevance. For the results of the analyses mentioned above, please refer to our response to remark 1.3 of the recommendation section (page 4).

      (3) Innovative Methodology:

      Requires more validation using different platforms (IHC) to validate the performance of this bulk-derived data. Also, the lack of control over data quality is a concern.

      We acknowledge the value of validating our results with alternative platforms such as IHC. We are planning and executing the experiments for a future study to provide mechanistic insights into the associations found in this study.

      Regarding data quality control, we implemented the following measures to ensure the reliability of our analysis:

      Bulk Transcriptional Profiles: To assess data quality, we conducted principal component analysis (PCA) on the sample Pearson product-moment correlation matrix. The first principal component (PCqc), which explains approximately 80-90% of the variance, was used to distinguish technical variability from biological signals (Bhattacharya et al., 2020). Samples with a correlation coefficient below 0.8 relative to PCqc were identified as outliers and excluded. Additionally, MD5 hash values were generated for each CEL file to identify and remove duplicate samples. Expression values were standardized to a mean of zero and a variance of one for each gene to minimize probeset- or gene-specific variability across datasets (GEO, CCLE, GDSC, and TCGA).

      Spatial Transcriptional Profiles: PCA was also applied to spatial transcriptomic data for quality control. Only samples with consistent loading factor signs for the first principal component across all individual spot profiles were retained. Samples failing this criterion were excluded from further analyses.
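
      The quality-control steps described above can be sketched roughly as follows. This is an illustrative reading only, not the authors' pipeline: the function names are hypothetical, the input is assumed to be a genes x samples (or genes x spots) NumPy array, and "correlation relative to PCqc" is interpreted here as the correlation between each sample and the first principal component of the sample correlation matrix (the eigenvector entry scaled by the square root of its eigenvalue). Duplicate detection is shown on raw byte strings standing in for CEL file contents.

```python
import hashlib

import numpy as np


def sample_pcqc_correlation(expr):
    """Correlation of each sample with PC1 of the sample correlation matrix.

    expr: genes x samples array. Returns (correlations, explained_fraction).
    Assumption: this is one plausible reading of the PCqc criterion.
    """
    corr = np.corrcoef(expr, rowvar=False)      # samples x samples
    eigvals, eigvecs = np.linalg.eigh(corr)     # eigenvalues in ascending order
    lam, vec = eigvals[-1], eigvecs[:, -1]
    if vec.sum() < 0:                           # eigenvector sign is arbitrary
        vec = -vec
    # For PCA on a correlation matrix, loading * sqrt(eigenvalue) equals the
    # correlation between the variable (here: a sample) and the component.
    return vec * np.sqrt(lam), lam / eigvals.sum()


def pcqc_keep_mask(expr, threshold=0.8):
    """True for samples whose correlation with PCqc is at least `threshold`."""
    correlations, _ = sample_pcqc_correlation(expr)
    return correlations >= threshold


def duplicate_indices(blobs):
    """Indices of byte strings whose MD5 digest matches an earlier entry."""
    seen, duplicates = set(), []
    for index, blob in enumerate(blobs):
        digest = hashlib.md5(blob).hexdigest()
        if digest in seen:
            duplicates.append(index)
        else:
            seen.add(digest)
    return duplicates


def standardize_genes(expr):
    """Scale every gene (row) to mean zero and variance one."""
    mean = expr.mean(axis=1, keepdims=True)
    std = expr.std(axis=1, keepdims=True)
    std[std == 0] = 1.0                         # leave constant genes at zero
    return (expr - mean) / std


def consistent_pc1_sign(spot_expr):
    """Spatial QC: True if all spots load on PC1 with the same sign."""
    corr = np.corrcoef(spot_expr, rowvar=False)
    _, eigvecs = np.linalg.eigh(corr)
    first = eigvecs[:, -1]
    return bool(np.all(first > 0) or np.all(first < 0))
```

      In this sketch the 0.8 threshold is the only parameter taken from the text; everything else (array layout, tie-breaking of the eigenvector sign, handling of constant genes) is an assumption made to keep the example runnable.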

      (4) Clinical Application:

      Although the study suggests potential drug targets, the translation of these findings into clinical practice is not addressed. Probably given the lack of some QA/QC procedures it'll be hard to translate these results. Future studies should focus on validating these targets in clinical settings.

      Regarding clinical applications, we acknowledge the importance of further exploring strategies targeting synaptic signaling and neurotransmitter release in the tumour microenvironment (TME). As partially discussed in the first version of the manuscript, drugs such as ifenprodil and lamotrigine—commonly used to treat neuronal disorders—can block glutamate release, thereby inhibiting subsequent synaptic signaling. Additionally, the vesicular monoamine transporter (VMAT) inhibitor reserpine blocks the formation of synaptic vesicles (Reid et al., 2013; Williams et al., 2001). Previous in vitro studies with HGSOC cell lines demonstrated that ifenprodil significantly reduced cancer cell proliferation, while reserpine triggered apoptosis in cancer cells (North et al., 2015; Ramamoorthy et al., 2019). The findings highlight the potential of such approaches to disrupt synaptic neurotransmission in the TME.

      To address potential translation of our findings into clinical practice more comprehensively, we have included additional details in the manuscript:

      Section discussion, page 16, lines 338-341:

      “This interaction can be targeted with pan-TRK inhibitors such as entrectinib and larotrectinib. Both drugs are showing promising results in multiple phase II trials, including ovarian cancer and breast cancer patients. Furthermore, a TRKB-specific inhibitor was developed (ANA-12), but has not been subjected to any clinical trials in cancer so far (Ardini et al., 2016; Burris et al., 2015; Drilon et al., 2018, 2017).”

      On page 17, lines 361-374:

      “Strategies to disrupt neuronal signaling and neurotransmitter release in neurons target key elements of excitatory neurotransmission, such as calcium flux and vesicle formation. Drugs like ifenprodil and lamotrigine, commonly used to treat neuronal disorders, block glutamate release and subsequent neuronal signaling. Additionally, the vesicular monoamine transporter (VMAT) inhibitor reserpine prevents synaptic vesicle formation (Reid et al., 2013; Williams, 2001). In vitro studies with HGSOC cell lines have demonstrated that ifenprodil significantly inhibits tumour proliferation, while reserpine induces apoptosis in cancer cells (North et al., 2015; Ramamoorthy et al., 2019). These approaches hold promise for inhibiting neuronal signaling and interactions in the TME.”

      Reviewer #2 (Public review):

      Summary:

      Consensus-independent component analysis and closely related methods have previously been used to reveal components of transcriptomic data that are not captured by principal component or gene-gene coexpression analyses.

      Here, the authors asked whether applying consensus-independent component analysis (c-ICA) to published high-grade serous ovarian cancer (HGSOC) microarray-based transcriptomes would reveal subtle transcriptional patterns that are not captured by existing molecular omics classifications of HGSOC.

      Statistical associations of these (hitherto masked) transcriptional components with prognostic outcomes in HGSOC could lead to additional insights into underlying mechanisms and, coupled with corroborating evidence from spatial transcriptomics, are proposed for further investigation.

      This approach is complementary to existing transcriptomics classifications of HGSOC.

      The authors have previously applied the same approach in colorectal carcinoma (Knapen et al. (2024) Commun. Med).

      Strengths:

      (1) Overall, this study describes a solid data-driven description of c-ICA-derived transcriptional components that the authors identified in HGSOC microarray transcriptomics data, supported by detailed methods and supplementary documentation.

      We thank the reviewer for acknowledging the strength of our data-driven approach and the use of consensus-independent component analysis (c-ICA) to identify transcriptional components within HGSOC microarray data. We aimed to provide comprehensive methodological detail and supplementary documentation to support the reproducibility and robustness of our findings. We believe this approach allows for the identification of subtle transcriptional signals that might have been overlooked by traditional analysis methods.

      (2) The biological interpretation of transcriptional components is convincing based on (data-driven) permutation analysis and a suite of analyses of association with copy-number, gene sets, and prognostic outcomes.

      We appreciate the positive feedback on the biological interpretation of our transcriptional components. We are pleased that our approach, which includes data-driven permutation testing and analyses of associations with copy-number alterations, gene sets, and prognostic outcomes, was found to be convincing. These analyses were integral to enhancing our findings’ robustness and biological relevance.

      (3) The resulting annotated transcriptional components have been made available in a searchable online format.

      Thank you for this important positive remark.

      (4) For the highlighted transcriptional component which has been annotated as related to synaptic signalling, the detection of the transcriptional component among 11 published spatial transcriptomics samples from ovarian cancers appears to support this preliminary finding and requires further mechanistic follow-up.

      Thank you for acknowledging the accessibility of our annotated transcriptional components. We prioritized making these data available in a searchable online format to facilitate further research and enable the community to explore and validate our findings.

      Weaknesses:

      (1) This study has not explicitly compared the c-ICA transcriptional components to the existing reported transcriptional landscape and classifications for ovarian cancers (e.g. Smith et al Nat Comms 2023; TCGA Nature 2011; Engqvist et al Sci Rep 2020) which would enable a further assessment of the additional contribution of c-ICA - whether the cICA approach captured entirely complementary components, or whether some components are correlated with the existing reported ovarian transcriptomic classifications.

      We acknowledge the reviewer’s insightful suggestion to compare our c-ICA-derived transcriptional components with previously reported ovarian cancer classifications, such as those from Smith et al. (2023), TCGA (2011), and Engqvist et al. (2020). To address this, we incorporated analyses comparing the activity scores of our transcriptional components with these published landscapes and classifications, particularly focusing on any associations with overall survival. Additionally, we evaluated correlations between gene signatures from a subset of these studies and our identified TCs, enhancing our understanding of the unique contributions of the c-ICA approach. Please refer to our response to remark 10 for the results of these analyses.

      (2) Here, the authors primarily interpret the c-ICA transcriptional components as a deconvolution of bulk transcriptomics due to the presence of cells from tumour cells and the tumour microenvironment.

      However, c-ICA is not explicitly a deconvolution method with respect to cell types: the transcriptional components do not necessarily correspond to distinct cell types, and may reflect differential dysregulation within a cell type. This application of c-ICA for the purpose of data-driven deconvolution of cell populations is distinct from other deconvolution methods that explicitly use a prior cell signature matrix.

      We acknowledge that c-ICA, unlike traditional deconvolution methods, is not specifically designed for cell-type deconvolution and does not rely on a predefined cell signature matrix. While we explored the transcriptional components in the context of tumour and microenvironmental interactions, we agree that these components may not correspond directly to distinct cell types but rather reflect complex patterns of dysregulation, potentially within individual cell populations.

      Our goal with c-ICA was to uncover hidden transcriptional patterns possibly influenced by cellular heterogeneity. However, we recognize these patterns may also arise from regulatory processes within a single cell type. To investigate further, we used single-cell transcriptional data (~60,000 cell-type-annotated profiles from GSE158722). After removing the first principal component from these profiles to minimize background effects, we projected our transcriptional components onto them to obtain activity scores, allowing us to assess each TC’s behavior across diverse cellular contexts. Please refer to our response to remark 2.2 in the recommendations to the authors (page 14) for the results of this analysis.
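
      As a rough illustration of the projection step described above, the sketch below removes the first principal component from a genes x cells matrix and then estimates per-cell activity scores for a set of component gene weights. The response does not state how the projection was computed; ordinary least squares is used here purely as an assumption, and all function and variable names are hypothetical.

```python
import numpy as np


def remove_first_pc(expr):
    """Centre each gene and subtract the rank-1 (first PC) approximation.

    expr: genes x cells array.
    """
    centered = expr - expr.mean(axis=1, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    return centered - s[0] * np.outer(u[:, 0], vt[0])


def project_components(tc_weights, expr):
    """Least-squares activity scores of components in each profile.

    tc_weights: genes x components array of component gene weights.
    expr: genes x cells array. Returns a components x cells array.
    Assumption: least squares stands in for the unspecified projection.
    """
    scores, *_ = np.linalg.lstsq(tc_weights, expr, rcond=None)
    return scores
```

      A call such as `project_components(tc_weights, remove_first_pc(cells))` would then give one activity score per component and cell, which could be summarized within and between annotated cell types.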

      References

      Allen JK, Armaiz-Pena GN, Nagaraja AS, Sadaoui NC, Ortiz T, Dood R, Ozcan M, Herder DM, Haemerrle M, Gharpure KM, Rupaimoole R, Previs R, Wu SY, Pradeep S, Xu X, Han HD, Zand B, Dalton HJ, Taylor M, Hu W, Bottsford-Miller J, Moreno-Smith M, Kang Y, Mangala LS, Rodriguez-Aguayo C, Sehgal V, Spaeth EL, Ram PT, Wong ST, Marini FC, Lopez-Berestein G, Cole SW, Lutgendorf SK, diBiasi M, Sood AK. 2018. Sustained adrenergic signaling promotes intratumoral innervation through BDNF induction. Cancer Res 78 (12):3233-3242.

      Ardini E, Menichincheri M, Banfi P, Bosotti R, Ponti CD, Pulci R, Ballinari D, Ciomei M, Texido G, Degrassi A, Avanzi N, Amboldi N, Saccardo MB, Casero D, Orsini P, Bandiera T, Mologni L, Anderson D, Wei G, Harris J, Vernier J-M, Li G, Felder E, Donati D, Isacchi A, Pesenti E, Magnaghi P, Galvani A. 2016. Entrectinib, a Pan–TRK, ROS1, and ALK Inhibitor with activity in multiple molecularly defined cancer Indications. Mol Cancer Ther 15:628–639.

      Balood M, Ahmadi M, Eichwald T, Ahmadi A, Majdoubi A, Roversi Karine, Roversi Katiane, Lucido CT, Restaino AC, Huang S, Ji L, Huang K-C, Semerena E, Thomas SC, Trevino AE, Merrison H, Parrin A, Doyle B, Vermeer DW, Spanos WC, Williamson CS, Seehus CR, Foster SL, Dai H, Shu CJ, Rangachari M, Thibodeau J, Rincon SVD, Drapkin R, Rafei M, Ghasemlou N, Vermeer PD, Woolf CJ, Talbot S. 2022. Nociceptor neurons affect cancer immunosurveillance. Nature 611:405–412.

      Bhattacharya A, Bense RD, Urzúa-Traslaviña CG, Vries EGE de, Vugt MATM van, Fehrmann RSN. 2020. Transcriptional effects of copy number alterations in a large set of human cancers. Nat Commun 11:715.

      Burris HA, Shaw AT, Bauer TM, Farago AF, Doebele RC, Smith S, Nanda N, Cruickshank S, Low JA, Brose MS. 2015. Abstract 4529: Pharmacokinetics (PK) of LOXO-101 during the first-in-human Phase I study in patients with advanced solid tumors: Interim update. Cancer Res 75:4529–4529.

    1. eLife Assessment

      Xenacoelomorpha is an enigmatic phylum, displaying various presumably simple or ancestral bilaterian features. This valuable study characterises the reproductive life history of Hofstenia miamia, a member of class Acoela in this phylum. The authors describe the morphology and development of the reproductive system, its changes upon degrowth and regeneration, and the animals' egg-laying behaviour. The evidence is convincing, with fluorescent microscopy and quantitative measurements as a considerable improvement to historical reports based mostly on histology and qualitative observations.

    2. Reviewer #1 (Public review):

      The aim of this study was a better understanding of the reproductive life history of acoels. The acoel Hofstenia miamia, an emerging model organism, is investigated; the authors nevertheless acknowledge and address the high variability in reproductive morphology and strategies within Acoela.

      The morphology of male and female reproductive organs in these hermaphroditic worms is characterised through stereo microscopy, immunohistochemistry, histology, and fluorescent in situ hybridization. The findings confirm and better detail historical descriptions. A novelty in the field is the in situ hybridization experiments, which link already published single-cell sequencing data to the worms' morphology. An interesting finding, though not further discussed by the authors, is that the known germline markers cgnl1-2 and Piwi-1 are only localized in the ovaries and not in the testes.

      The work also clarifies the timing and order of appearance of reproductive organs during development and regeneration, as well as the changes upon de-growth. It shows an association of reproductive organ growth with whole-body size, which will surely be taken into account and further explored in future acoel studies. This is also the first instance of non-anecdotal degrowth upon starvation in H. miamia (and to my knowledge in acoels, except recorded weight upon starvation in Convolutriloba retrogemma [1]).

      Egg laying through the mouth is described in H. miamia for the first time as well as the worms' behavior in egg laying, i.e. choosing the tanks' walls rather than its floor, laying eggs in clutches, and delaying egg-laying during food deprivation. Self-fertilization is also reported for the first time.

      The main strength of this study is that it expands previous knowledge on the reproductive life history traits in H. miamia and it lays the foundation for future studies on how these traits are affected by various factors, as well as for comparative studies within acoels. As highlighted above, many phenomena are addressed in a rigorous and/or quantitative way for the first time. This can be considered the start of a novel approach to reproductive studies in acoels, as the authors suggest in the conclusion. It can also be interpreted as a testimony of how an established model system can benefit the study of an understudied animal group.

      The main weakness of the work is the lack of convincing explanations on the dynamics of self-fertilization, sperm storage, and movement of oocytes from the ovaries to the central cavity and subsequently to the pharynx. These questions are also raised by the authors themselves in the discussion. Another weakness (or rather missing potential strength) is the limited focus on genes. Given the presence of the single-cell sequencing atlas and established methods for in situ hybridization and even transgenesis in H. miamia, this model provides a unique opportunity to investigate germline genes in acoels and their role in development, regeneration, and degrowth. It should also be noted that employing Transmission Electron Microscopy would have enabled a more detailed comparison with other acoels, since ultrastructural studies of reproductive organs have been published for other species (cf., e.g., [2],[3],[4]). This is especially true for a better understanding of the relation between sperm axoneme and flagellum (mentioned in the Results section), as well as of sexual conflict (mentioned in the Discussion).

      (1) Shannon, Thomas. 2007. 'Photosmoregulation: Evidence of Host Behavioral Photoregulation of an Algal Endosymbiont by the Acoel Convolutriloba Retrogemma as a Means of Non-Metabolic Osmoregulation'. Athens, Georgia: University of Georgia [Dissertation].<br /> (2) Zabotin, Ya. I., and A. I. Golubev. 2014. 'Ultrastructure of Oocytes and Female Copulatory Organs of Acoela'. Biology Bulletin 41 (9): 722-35.<br /> (3) Achatz, Johannes Georg, Matthew Hooge, Andreas Wallberg, Ulf Jondelius, and Seth Tyler. 2010. 'Systematic Revision of Acoels with 9+0 Sperm Ultrastructure (Convolutida) and the Influence of Sexual Conflict on Morphology'.<br /> (4) Petrov, Anatoly, Matthew Hooge, and Seth Tyler. 2006. 'Comparative Morphology of the Bursal Nozzles in Acoels (Acoela, Acoelomorpha)'. Journal of Morphology 267 (5): 634-48.

    3. Reviewer #2 (Public review):

      Summary:

      While the phylogenetic position of acoels (and Xenacoelomorpha) remains debated, investigations of various representative species are critical to understanding their overall biology.

      Hofstenia is an acoel species that can be maintained in laboratory conditions and for which several critical techniques are available. The current manuscript provides a comprehensive and widely descriptive investigation of the reproductive system of Hofstenia miamia.

      Strengths:

      (1) Xenacoelomorpha is a wide group of animals comprising three major clades and several hundred species, yet they are widely understudied. A comprehensive state-of-the-art analysis of the reproductive system of Hofstenia as a representative is thus highly relevant.

      (2) The investigations are overall very thorough, well documented, and nicely visualised in an array of figures. In some way, I particularly enjoyed seeing data displayed in a visually appealing quantitative or semi-quantitative fashion.

      (3) The data provided is diverse and rich. For instance, the behavioral investigations open up new avenues for further in-depth projects.

      Weaknesses:

      While the analyses are extensive, they appear in some ways a little one-dimensional. For instance, the two markers used were characterized in a recent scRNAseq dataset from the Srivastava lab. One might have expected slightly deeper molecular analyses. Along the same lines, the modes of spermatogenesis and oogenesis have not been analysed further, nor has the proposed mode of sperm storage.

    4. Author response:

      We thank the reviewers for their evaluation, for helpful suggestions to improve clarity and accuracy, and for their positive reception of the manuscript. We will incorporate their suggestions in a revised manuscript. Here, we respond to their major comments. 

      The reviewers suggest that a molecular study of Hofstenia’s reproductive systems would be beneficial, as would mechanistic explanations for its unusual reproductive behavior. We agree with the reviewers that both of these would be interesting avenues, although we think this is outside the scope of this current manuscript. This manuscript studies growth and reproductive dynamics in acoels, and establishes a foundation to study its underlying molecular, developmental, and physiological machinery. 

      Our previous molecular work, using scRNAseq and FISH, identified several germline markers. Here, we show that two of them are specific markers of testes and ovaries, respectively. This, together with our new anatomical data, allows us to identify the expression domains of most of these other markers more clearly. Some markers may be expressed in a presumptive common germline that eventually splits into an anterior male germline and posterior female germline. We agree with the reviewers that understanding the dynamics of germline differentiation and its molecular genetic underpinnings would be very interesting, and we hope to address this in future work.

      As the reviewers note, we do not understand how sperm is stored, how the worm’s own sperm can travel to its ovaries to enable selfing, or how eggs in the ovaries travel within the body. We agree with the reviewers that understanding these processes would be very interesting. Our histological and molecular work so far has been unable to find tube-like structures or other cavities for storage and transport. Potentially, cells could move within the parenchyma. Explaining these events will require substantial effort (including mechanistic studies of cell behavior and ultrastructural studies that the reviewers suggest), and we hope to do this in future work. 

      We agree with Reviewer 1 that it is interesting that Piwi-1 expression is only observed in the ovaries and not in the testes, which is unusual given its broad germline expression in many taxa. Although there are several possible explanations for this finding (e.g., Piwi-1 could be expressed at low levels in the male germline, other Piwi proteins could be expressed in the male germline, or Piwi may play roles in male germline progenitors that are not co-located with maturing sperm), we do not currently know why this is so, and we will discuss these possibilities in our revised manuscript.

    1. eLife Assessment

      The study presents valuable findings on the role of Aff3ir, a gene implicated in flow-induced atherosclerosis and regulating the inflammation-associated transcription factor, IRF5. The in vivo data are solid in providing evidence on the role of Aff3ir in shear stress and formation of atheromatous plaques. The work will be of interest to clinical researchers and biologists focusing on inflammation and atherosclerosis in cardiovascular disease with a broad eLife readership.

    2. Reviewer #1 (Public review):

      Summary:

      The authors report the role of a novel gene, Aff3ir-ORF2, in flow-induced atherosclerosis. They show that the gene is anti-inflammatory in nature. It inhibits IRF5-mediated athero-progression by inhibiting the causal factor (IRF5). Furthermore, the authors show a significant connection between shear stress and Aff3ir-ORF2 and its connection to IRF5-mediated athero-progression in different established mouse models, which further validates the ex vivo findings.

      Strengths:

      (1) Adequate number of replicates were used for this study.

      (2) Both in vitro and in vivo validation was done.

      (3) Figures are well presented.

      (4) In vivo causality is checked with cleverly designed experiments.

      Weaknesses:

      (1) Inflammatory proteins must be measured with standard methods, e.g., ELISA, as mRNA and protein levels do not always correlate.

      (2) RNA-seq analysis has to be done very carefully. How does the Euclidean distance correlate with the differential expression of genes? Do they represent neighborhood? If they do, how does this correlation affect the conclusion of the paper?

      (3) Volcano plot does not indicate the q value of the shown genes. It is advisable to calculate the q value for each of the genes, which represents the FDR probability of the identified genes.

      (4) Was GO enrichment done against a global gene set or a local gene set? Authors should provide more detailed information about the analysis.

      (5) If the analysis was performed against a global gene set, how does that connect with this specific atherosclerotic microenvironment?

      (6) What was the basal expression of genes and how do the DGE (differential gene expression) values differ?

      (7) How was IRF5 picked from GO analysis? Was it within the 20 most significant genes?

      (8) Microscopic studies should be done more carefully. There seems to be a global expression present on the vascular wall for Aff3ir-ORF2 and the expression seems to be similar to AFF3 in Fig. 1.

      Comments on Revision:

      The authors have adequately addressed my concerns.

    3. Reviewer #2 (Public review):

      Summary:

      The authors recently uncovered a novel nested gene, Aff3ir, and this work sets out to study its function in endothelial cells further. Based on differences in expression correlating with areas of altered shear stress, they investigate a role for the isoform Aff3ir-ORF2 in endothelial activation and development of atherosclerosis downstream of disturbed shear stress. Using a knockout mouse model and in vivo overexpression experiments, they demonstrate a strong potential for Aff3ir-ORF2 to alleviate atherosclerosis. They find that Aff3ir-ORF2 interacts with the pro-inflammatory transcription factor IRF5 and retains it in the cytoplasm, hence preventing upregulation of inflammation-associated genes. The data expands our knowledge of IRF5 regulation which could be relevant to researchers studying various inflammatory diseases as well as adding to our understanding of atherosclerosis development.

      Strengths:

      The in vivo data is convincing using immunofluorescence staining to assess AFF3ir-ORF2 expression, a knockout mouse model, overexpression and knockdown studies and rescue experiments in combination with two atherosclerotic models to demonstrate that Aff3ir-ORF2 can lessen atherosclerotic plaque formation in ApoE-/- mice.

      Weaknesses:

      The effect on atherosclerosis is clear and there is sufficient evidence to conclude that this is the result of reduced endothelial cell activation. However, other cell types such as smooth muscle cells or macrophages could be contributing to the effects observed. The mouse model is a global knockout and the shRNA knockdowns (Fig. 5) and overexpression data in Figure 2 are not cell type-specific. Only the overexpression construct in Figure 6 uses an ICAM-2 promoter construct, which drives expression in endothelial cells, though leaky expression of this promoter has been reported in the literature.

      The in vitro experiments are solidly executed, but most experiments are performed in mouse embryonic fibroblasts (MEFs) and results extrapolated to endothelial cell responses. However, several key experiments are repeated in HUVEC, thereby making a solid case that Aff3ir-ORF2 can regulate IRF5 in both MEFs and HUVEC. It is important to note that the sequence of AFF3ir-ORF2 is not conserved in humans and lacks an initiation codon, hence the regulatory pathway is not conserved. However, the overexpression studies in HUVEC suggest that mouse AFF3ir-ORF2 can also regulate human IRF5 and hence the mechanism retains relevance for possible human health interventions.

      Overall, the paper succeeds in demonstrating a link between Aff3ir-ORF2 and atherosclerosis. The study shows a functional interaction between Aff3ir-ORF2 and IRF5 in embryonic fibroblasts, but makes a solid case that this mechanism is relevant for atherosclerosis development via endothelial cell activation.

    4. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors report the role of a novel gene, Aff3ir-ORF2, in flow-induced atherosclerosis. They show that the gene is anti-inflammatory in nature. It inhibits the IRF5-mediated athero-progression by inhibiting the causal factor (IRF5). Furthermore, the authors show a significant connection between shear stress and Aff3ir-ORF2 and its connection to IRF5-mediated athero-progression in different established mouse models, which further validates the ex vivo findings.

      Strengths:

      (1) An adequate number of replicates were used for this study.

      (2) Both in vitro and in vivo validation was done.

      (3) The figures are well presented.

      (4) In vivo causality is checked with cleverly designed experiments.

      We thank you for your positive remarks.

      Weaknesses:

      (1) Inflammatory proteins must be measured with standard methods e.g ELISA as mRNA level and protein level does not always correlate.

      Thanks. We have followed your advice and performed ELISA experiments to measure the concentrations of inflammatory cytokines, including IL-6 and IL-1β. The newly acquired results have been included in Figure 2E (Lines 160-163) in the revised manuscript.

      (2) RNA-seq analysis has to be done very carefully. How does the Euclidean distance correlate with the differential expression of genes? Do they represent the neighborhood? If they do, how does this correlation affect the conclusion of the paper?

      We thank the reviewer for this professional comment and apologize for the confusion. The heatmap using Euclidean distance was generated based on the expression levels of all differentially expressed genes (calculated with DESeq2). Since its interpretation overlaps with the volcano plot presented in Figure 4B, we have moved the heatmap to Figure S5A in the revised manuscript and provided a detailed description in the figure legend (Lines 106-108 in the supporting information). Additionally, to better illustrate the variation among all samples, we have performed PCA analysis and included the new results in Figure 4A of the revised manuscript.
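
      The quantity behind a distance-based heatmap can be illustrated with a short Python sketch. It computes pairwise Euclidean distances between sample expression profiles. The sample names and values are invented for illustration and are not the authors' data; the normalization and clustering settings actually used for Figure S5A are not stated in this response.

```python
import math
from itertools import combinations


def pairwise_euclidean(profiles):
    """Return {(sample_a, sample_b): distance} for every pair of samples.

    profiles: dict mapping a sample name to a list of expression values,
    one value per gene, with genes in the same order for every sample.
    """
    distances = {}
    for a, b in combinations(sorted(profiles), 2):
        distances[(a, b)] = math.dist(profiles[a], profiles[b])
    return distances


# Hypothetical, made-up expression values (for example log-transformed counts).
profiles = {
    "WT_1": [5.0, 2.0, 7.5],
    "WT_2": [5.2, 2.1, 7.4],
    "KO_1": [3.1, 4.0, 7.6],
}
for pair, distance in pairwise_euclidean(profiles).items():
    print(pair, round(distance, 3))
```

      A small distance means two samples have similar overall profiles. The distance summarizes all genes at once, so it describes how samples group together rather than which individual genes are differentially expressed.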

      (3) The volcano plot does not indicate the q value of the shown genes. It is advisable to calculate the q value for each of the genes which represents the FDR probability of the identified genes.

      Thank you for your careful review. We apologize for the incorrect labeling. The value shown was the adjusted P value (P.adj). The label for Figure 4B has been corrected in the revised manuscript.
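
      For readers unfamiliar with adjusted P values, the following Python sketch shows the Benjamini-Hochberg procedure, which is the default adjustment DESeq2 uses for its padj column. The function name and the input values are illustrative assumptions, not taken from the manuscript.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Made-up raw P values for four hypothetical genes.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

      Each adjusted value is at least as large as the raw value, so a gene list filtered on the adjusted value is never longer than one filtered on the raw P value at the same cutoff.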

      (4) GO enrichment was done against the Global gene set or a local geneset? The authors should provide more detailed information about the analysis.

      Thank you. We performed GO enrichment analysis against the global gene set. The description of the results has been updated in the revised manuscript (Lines 222–224).

      (5) If the analysis was performed against a global gene set. How does that connect with this specific atherosclerotic microenvironment?

      Thank you for your insightful comments. We have followed your advice and investigated the functional characteristics of these differentially expressed genes in the context of the atherosclerotic microenvironment. The RNA-seq differential gene list was further mapped onto the atherosclerosis-related gene dataset (PMID: 27374120), resulting in 363 overlapping genes. The 363 genes were subjected to bioinformatics enrichment analysis using Gene Ontology (GO) databases. GO analysis of these genes revealed enrichment in processes related to cell−cell adhesion and leukocyte activation involved in immune response (Figure S5B), which is highly consistent with the observed effects of AFF3ir-ORF2 on VCAM-1 expression. The newly acquired data are presented in Figure S5B and the description of the results is included in the revised manuscript (Lines 227-233).
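
      The reviewer's question about global versus local gene sets matters because the background size changes the enrichment statistic. The Python sketch below assumes a hypergeometric over-representation test, which many GO tools use; the response does not say which test the authors' software applied, and all numbers other than the 363-gene query size are invented.

```python
from math import comb


def enrichment_pvalue(background_size, term_size, query_size, overlap):
    """Upper-tail hypergeometric P value, P(X >= overlap).

    background_size: number of genes in the background (global or restricted).
    term_size: background genes annotated to the GO term.
    query_size: genes in the query list.
    overlap: query genes annotated to the GO term.
    """
    if not 0 <= term_size <= background_size:
        raise ValueError("term_size must lie within the background")
    if not 0 <= query_size <= background_size:
        raise ValueError("query_size must lie within the background")
    upper = min(term_size, query_size)
    if overlap > upper:
        return 0.0
    numerator = sum(
        comb(term_size, i) * comb(background_size - term_size, query_size - i)
        for i in range(max(overlap, 0), upper + 1)
    )
    return numerator / comb(background_size, query_size)


# Same hypothetical term and overlap, evaluated against two background sizes.
p_global = enrichment_pvalue(20000, 200, 363, 15)
p_restricted = enrichment_pvalue(5000, 200, 363, 15)
print(f"global background: {p_global:.3g}")
print(f"restricted background: {p_restricted:.3g}")
```

      With the larger background the same overlap looks far more surprising, which is why the choice of background should be reported alongside the enrichment results.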

      (6) What was the basal expression of genes and how did the DGE (differential gene expression) values differ?

      Thanks for the comments. The RNA-sequencing data has been submitted to GEO datasets (GSE286206), making the basal gene expression data available to readers.

      The differential expression analysis was performed using DESeq2 (v1.4.5) (PMID: 25516281) with a criterion of 1.5-fold change and P<0.05. We have included the description in the revised manuscript in Lines 220-222 and Lines 575-576.
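
      The stated selection criterion can be sketched in Python as a simple filter. This is an illustration only: it assumes the 1.5-fold threshold applies in both directions and that results are available as (gene, log2 fold change, P value) tuples, a hypothetical layout. The gene names and numbers are made up.

```python
import math

# Thresholds quoted in the response: 1.5-fold change and P < 0.05.
FOLD_CHANGE = 1.5
P_CUTOFF = 0.05


def select_deg(rows, fold_change=FOLD_CHANGE, p_cutoff=P_CUTOFF):
    """Return the names of genes passing both thresholds.

    rows: iterable of (gene, log2_fold_change, p_value) tuples.
    """
    min_abs_log2fc = math.log2(fold_change)
    selected = []
    for gene, log2fc, p_value in rows:
        # Skip genes without a usable P value.
        if p_value is None or math.isnan(p_value):
            continue
        if abs(log2fc) >= min_abs_log2fc and p_value < p_cutoff:
            selected.append(gene)
    return selected


# Made-up results for four hypothetical genes.
example = [
    ("GeneA", 1.20, 0.001),   # upregulated, passes both thresholds
    ("GeneB", -0.90, 0.020),  # downregulated, passes both thresholds
    ("GeneC", 0.30, 0.001),   # fold change too small
    ("GeneD", 2.00, 0.200),   # P value too large
]
print(select_deg(example))
```

      Passing adjusted P values instead of raw ones in the third tuple position gives the stricter, multiple-testing-corrected list.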

      (7) How was IRF5 picked from GO analysis? was it within the 20 most significant genes?

      Sorry for the confusion. IRF5 was not identified through GO analysis. To determine the upstream transcriptional regulators, we used the ChEA3 database to predict potential upstream transcription factors based on all differentially expressed genes. The top 20 transcription factors were selected based on their scores. To further explore their relationship with atherosclerosis, these top 20 transcription factors were mapped to the atherosclerosis-related gene list in the DisGeNET database. IRF5 and IRF8 were the only two overlapping genes. To clarify this process, we have included a more detailed description of the IRF prediction approach in the revised manuscript (Lines 234–239).
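
      The two-step prioritization described in this response (rank transcription factors, keep the top 20, intersect with a disease gene list) can be sketched in Python. The function, the placeholder names, and the scores are hypothetical; whether a lower or a higher score is better depends on the ranking method, so it is left as a parameter.

```python
def prioritize_tfs(tf_scores, disease_genes, top_n=20, lower_is_better=True):
    """Return top-ranked transcription factors that also appear in a disease list.

    tf_scores: dict mapping a transcription factor name to its score.
    disease_genes: iterable of gene names linked to the disease.
    """
    ranked = sorted(
        tf_scores.items(),
        key=lambda item: item[1],
        reverse=not lower_is_better,
    )
    top = [name for name, _ in ranked[:top_n]]
    disease = set(disease_genes)
    # Preserve the ranking order in the returned list.
    return [name for name in top if name in disease]


# Placeholder names and made-up scores, not the authors' results.
scores = {"TF1": 3.0, "TF2": 10.5, "TF3": 7.2, "TF4": 25.0}
disease_list = {"TF3", "TF4", "GeneX"}
print(prioritize_tfs(scores, disease_list, top_n=3))
```

      In this toy input TF4 is in the disease list but falls outside the top three, so only TF3 survives both filters.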

      (8) Microscopic studies should be done more carefully? There seems to be a global expression present on the vascular wall for Aff3ir-ORF2 and the expression seems to be similar to AFF3 in Figure 1.

      We thank the reviewer for the valuable suggestion. We have followed your advice and provided more representative images in Figure 1F.

      Reviewer #2 (Public review):

      Summary:

      The authors recently uncovered a novel nested gene, Aff3ir, and this work sets out to study its function in endothelial cells further. Based on differences in expression correlating with areas of altered shear stress, they investigate a role for the isoform Aff3ir-ORF2 in endothelial activation and development of atherosclerosis downstream of disturbed shear stress. Using a knockout mouse model and in vivo overexpression experiments, they demonstrate a strong potential for Aff3ir-ORF2 to alleviate atherosclerosis. They find that Aff3ir-ORF2 interacts with the pro-inflammatory transcription factor IRF5 and retains it in the cytoplasm, hence preventing upregulation of inflammation-associated genes. The data expands our knowledge of IRF5 regulation which could be relevant to researchers studying various inflammatory diseases as well as adding to our understanding of atherosclerosis development.

      Strengths:

      The in vivo data is solid using immunofluorescence staining to assess AFF3ir-ORF2 expression, a knockout mouse model, overexpression and knockdown studies, and rescue experiments in combination with two atherosclerotic models to demonstrate that Aff3ir-ORF2 can lessen atherosclerotic plaque formation in ApoE<sup>-/-</sup> mice.

      We thank you for your positive remarks.

      Weaknesses:

      While the in vivo data is generally convincing, a few data panels have issues and will need addressing. Also, the knockout mouse model will need to be described, since the paper referred to in the manuscript does not actually report any knockout mouse model. Hence it is unclear how Aff3ir-ORF2 is targeted, but Figure S2B shows that targeting is partial, since about 30% expression remains at the RNA level in MEFs isolated from the knockout mice.

      We thank you for the valuable comments. 

      First, we have followed your advice and included detailed information regarding the animal construction in the revised manuscript (Lines 405-415). Additionally, the genotyping results have been included in new Figure S3A.

      Second, we acknowledge your concern about the knockout efficiency of ORF2 in mice. While the PCR assay indicated approximately 30% residual expression, our Western blot analysis of aorta samples demonstrated that ORF2 protein was barely detectable in knockout mice, as shown in new Figure S3B-C. Besides, our in vivo experiments using MEF from WT and AFF3ir-ORF2<sup>-/-</sup> mice (Figure 4I) further confirmed successful knockout. 

      Third, we have included a discussion addressing the discrepancies between PCR and Western blot results. In addition to technical differences between the two methods, the nature of AFF3ir-ORF2 may also contribute to these inconsistencies. The parent gene AFF3 is located in a genetically variable region and can be excised via intron 5 to form a replicable transposon, which translocates to other chromosomes and has been linked to leukemia (PMID: 34995897, 12203795, 12743608, and 17968322). AFF3ir is located in intron 6 and therefore lies within the transposon, which may complicate the measurement of its expression. Replicable transposons can exist as extrachromosomal elements, allowing them to be inherited across generations. We have included this discussion in the revised manuscript in Lines 188-196.

      While the effect on atherosclerosis is clear, the conclusion that this is the result of reduced endothelial cell activation is not supported by the data. The mouse model is described as a global knockout and the shRNA knockdowns (Figure 5) and overexpression data in Figure 2 are not cell type-specific. Only the overexpression construct in Figure 6 uses an ICAM-2 promoter construct, which drives expression in endothelial cells, though leaky expression of this promoter has been reported in the literature. Therefore, other cell types such as smooth muscle cells or macrophages could be responsible for the effects observed.

      Thank you for your critical comment. To address your concern, we have made the following three revisions:

      First, we have analyzed the expression of AFF3ir-ORF2 in the vascular wall with or without intima in WT and AFF3ir-ORF2 knockout mice. As shown in Figure 1B and Figure S1A, while the expression of AFF3ir-ORF2 was notably downregulated in the aortic intima of athero-prone regions compared to the protective region, it remained largely unchanged in the aortic wall without intima across different regions of the aorta. This suggested that AFF3ir-ORF2 might play a predominant role in endothelial cells rather than other cell types in the context of shear stress.

      Second, we have used human endothelial cells (HUVECs) to further confirm our findings. As shown in Figure 2C and Figure S2B, we found that AFF3ir-ORF2 overexpression could attenuate disturbed shear stress-induced IRF5 nuclear translocation and the expression of inflammatory genes in HUVECs, suggesting the potential anti-inflammatory effects of AFF3ir-ORF2 in endothelial cells.

      Third, we agree with the reviewer’s comment that we cannot completely exclude the potential involvement of other cell types. Hence, we have included a limitation statement in the discussion part in Lines 341-344.

      The weakest part of the manuscript is the in vitro experiment using some nonidentifiable expression differences. The data is used to hypothesise on a role for IRF5 in the effects observed with Aff3ir-ORF2 knockout.

      Thank you for the comments. To address your concerns, we have made the following two changes:

      First, we have further investigated the functional features of the differential genes from the RNA-seq in the context of the atherosclerotic microenvironment. The differential gene list was mapped onto the atherosclerosis-related gene dataset (PMID: 27374120), and a total of 363 genes overlapped. These 363 genes were subjected to bioinformatics enrichment analysis using Gene Ontology (GO) databases. GO analysis showed that these genes were mainly enriched in cell−cell adhesion and leukocyte activation involved in immune response, which aligns with the expression of VCAM-1 affected by AFF3ir-ORF2. The newly acquired data are presented in Figure S5B and the description of the results has been updated in the revised manuscript (Lines 227-233).

      Second, we have further verified the RNA-seq results in vitro. Several classical inflammatory factors, including ICAM-1, CCL5, and CXCL10, whose mRNA levels were significantly downregulated in RNA-seq and which were also identified as target genes of IRF5, were analyzed. We found that AFF3ir-ORF2 deficiency aggravated, while AFF3ir-ORF2 overexpression attenuated, the expression of ICAM-1, CCL5, and CXCL10 induced by disturbed shear stress (New Figure S5D). In addition, the regulation of ICAM-1 by AFF3ir-ORF2 was confirmed at both protein and mRNA levels in HUVECs (Figure 2C-D and Figure S2B).

      Overall, the paper succeeds in demonstrating a link between Aff3ir-ORF2 and atherosclerosis, but the cell types involved and mechanisms remain unclear. The study also shows a functional interaction between Aff3ir-ORF2 and IRF5 in embryonic fibroblasts, but any relevance of this mechanism for atherosclerosis or any cell types involved in the development of this disease remains largely speculative.

      Thank you for all the valuable comments. The specific responses have been provided above. Briefly, we have followed your advice and further confirmed the regulation of AFF3ir-ORF2 on IRF5 in endothelial cells. Besides, the RNA-seq results have been further analyzed, and partial results have been verified in endothelial cells to support the anti-inflammatory role of AFF3ir-ORF2. We greatly appreciate the reviewer’s insightful comments, which guided our revisions and contributed to significantly improving the paper.

      Reviewer #3 (Public review):

      This study aims to demonstrate the role of Aff3ir-ORF2 in atheroprone flow-induced EC dysfunction and ensuing atherosclerosis in mouse models. Overall, the data quality and comprehensiveness are convincing. In silico, in vitro, and in vivo experiments and several atherosclerosis models were well executed. To strengthen the study further, the authors can address human EC relevance.

      We thank you for your positive remarks and insightful comments.

      Major comments:

      (1) The tissue source in Figures 1A and 1B should be clarified, the whole aortic segments or intima? If aortic segment was used, the authors should repeat the experiments using intima, due to the focus of the current study on the endothelium.

      We thank you for the suggestion. The tissue used in Figures 1A and 1B was from aortic intima. The description has been updated for clarity in the revised manuscript on Lines 114-125. 

      (2) Why were MEFs used exclusively in the in vitro experiments? Can the authors repeat some of the critical experiments in mouse or human ECs?

      Thank you for this insightful comment. Isolation and culture of mouse primary aortic ECs are notoriously difficult, and shear stress experiments require a large number of cells. Because MEFs exhibit responses consistent with those of ECs, as has been demonstrated previously (PMID: 23754392), we used MEFs in our in vitro experiments.

      However, following your valuable advice, we have now employed human ECs (HUVECs) to confirm our findings. Consistent with our results in MEFs, we found that AFF3ir-ORF2 overexpression reduced the expression of inflammatory genes induced by disturbed shear stress at both protein and mRNA levels in HUVECs (Figure 2C, Figure S2B). Notably, despite the significant anti-inflammatory effects of AFF3ir-ORF2, the sequence of this gene is not conserved in Homo sapiens and lacks an initiation codon, which is why we did not proceed with loss-of-function experiments.

      (3) The authors should explain why AFF3ir-ORF2 overexpression did not affect the basal level expression of ICAM-1, VCAM-1, IL-1b, and IL-6 under ST conditions (Figure 2A-C).

      We thank you for raising this critical question. Indeed, we found that AFF3ir-ORF2 overexpression did not affect the basal level of inflammatory genes under ST conditions, while it exerted anti-inflammatory effects under OSS conditions. One underlying reason might be the relatively low expression of inflammatory genes under ST compared to OSS conditions. Additionally, as our findings suggested, AFF3ir-ORF2 exerted its anti-inflammatory role by binding to IRF5 and inhibiting IRF5 nuclear translocation. However, as shown in Figure 4I, IRF5 might be predominantly localized in the cytoplasm rather than the nucleus under ST conditions.

      We have included the description in the revised manuscript on Lines 157-163.

      (4) Please include data from sham controls, i.e., right carotid artery in Figure 2E.

      Thank you for the suggestion. We have followed your advice and included sham controls (staining of the right carotid arteries) in Figure S2E.

      (5) Given that the merit of the study lies in the effect of different flow patterns, the lesion areas in AA and TA (Figure 3B, 3C) should be separately compared.

      We have followed your valuable suggestion and included the additional statistical results in Figure 3C in the revised manuscript.

      (6) For confirmatory purposes for the variations of IRF5 and IRF8, can the authors mine available RNA-seq or even scRNA-seq data on human or mouse atherosclerosis? This approach is important and could complement the current results that are lacking EC data.

      Thank you for your valuable suggestion. In the present study, we found that disturbed flow did not alter the protein level of IRF5 but promoted its nuclear translocation. Following your advice, we analyzed the expression of IRF5 in human ECs (GSE276195) and atherosclerotic mouse arteries (GSE222583) using public databases. Consistently, IRF5 did not show significant changes in mRNA levels under these conditions (Figure S5E-F), suggesting that the regulation of IRF5 in the context of disturbed flow or atherosclerosis is primarily post-translational.

      (7) With the efficacy of using AAV-ICAM2-AFF3ir-ORF2 in atherosclerosis reduction (Figure 6), the authors are encouraged to use lung ECs isolated from the AFF3ir-ORF2<sup>-/-</sup> mice to recapitulate its regulation of IRF5.

      We greatly appreciate your valuable suggestion to use lung ECs from mice. We have observed that AFF3ir-ORF2 deficiency enhanced the nuclear translocation of IRF5 induced by OSS. Notably, the transcriptional levels of IRF5 were minimally affected by AFF3ir-ORF2 deficiency. Hence, to recapitulate the regulation of IRF5 with lung ECs isolated from the AFF3ir-ORF2<sup>-/-</sup> mice, it would be necessary to treat lung ECs with OSS and then isolate subcellular fractions. However, both in vitro shear stress treatment and subcellular fractionation require a large number of cells, and mouse lung ECs are difficult to culture and to maintain over several passages. Therefore, we hope the reviewer understands that these experiments were not performed. As an alternative, we have confirmed the transcriptional activity changes of IRF5 due to AFF3ir-ORF2 manipulation by analyzing the expression of its target genes indicated from RNA-seq results in both the intima of mouse aorta (Figure S5C-D) and HUVECs (Figure 2C-D and Figure S2B). Our findings show that AFF3ir-ORF2 deficiency increases, while its overexpression decreases, the expression levels of IRF5-targeted genes in endothelial cells.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Figure 2H - As I understand it, this is MFI measurement of VCAM. Please change accordingly.

      Thanks. Corrected.

      Reviewer #2 (Recommendations for the authors):

      My major concern is the use of MEFs for all in vitro experiments. All experiments should be done in endothelial cells if the aim is to show a mechanism relevant to endothelial activation and atherosclerosis. Lines 314-316 of the conclusion are absolutely not supported by the data.

      Thank you for the insightful comment. Following your advice, we have employed human ECs (HUVECs) to confirm our findings. Consistent with the findings in MEFs, we found that AFF3ir-ORF2 decreased the expression of inflammatory genes induced by disturbed shear stress, both at protein and mRNA levels in HUVECs (Figure 2C, Figure S2B). 

      Since the in vivo experiments are not cell type-specific, it would be important to test and compare the expression of Aff3ir-ORF2 in endothelial cells as well as smooth muscle and macrophages to support any claim of cell type involvement in the effects observed.

      We thank you for the valuable suggestion. In the revised manuscript, we have followed your suggestion and analyzed the expression pattern of AFF3ir-ORF2 in different regions of the aorta with or without endothelium. We observed a marked reduction in AFF3ir-ORF2 expression in the intima of the aortic arch compared to that in the intima of the thoracic aorta (Figure 1B-C). In contrast, the expression of AFF3ir-ORF2 in the media and adventitia was comparable between the aortic arch and thoracic aorta (Figure S1A-B). These findings provide further evidence supporting the predominant role of endothelial cells. The description has been modified accordingly in the revised manuscript on Lines 121-134.

      The results of the RNA-seq experiment should be disclosed. The experiment should be deposited on GEO or similar and a table of differentially expressed genes added to the manuscript.

      Thank you for the suggestion. We have followed your advice and submitted the RNA-sequencing data to GEO datasets (GSE286206). Besides, a table of differentially expressed genes has been included in the revised manuscript as Table S3.

      Minor comments:

      (1) Figure 1A. Missing the labels of the target.

      Thanks. Corrected. 

      (2) Figure 1D. Cell alignment in AA compared to TA suggests that the image is of the outer curvature, but Figure 1F is showing that the outer curvature is expressing more ORF2 than the inner. Why was the outer curvature chosen for this panel and is it true to conclude on that assumption that expression of ORF2 compares as TA > Outer > Inner curvature?

      We thank you for the insightful suggestion. We have followed your advice and performed en-face immunofluorescence staining of AFF3ir-ORF2 and quantification of AFF3ir-ORF2 expression in AA inner, AA outer, and TA regions. As shown in new Figure 1D-E, the results indeed indicated that expression of AFF3ir-ORF2 compares as TA > AA outer > AA inner.

      (3) Figure 2H. Target mislabelled as ICAM-1 instead of VCAM-1.

      Thanks. Corrected. 

      (4) Figure S1A. VE-cad staining and cell shape differ between control and overexpression. Is this a phenotype or are different areas of the vasculature shown, which would make it hard to interpret since Aff3ir-ORF2 levels differ in different vessel areas?

      We thank the reviewer for raising this important question. For Figure S1A, only common carotid arteries were used for the staining. The potential differences in cell shape observed might be due to variations in the procedure during immunofluorescence staining. To avoid any misinterpretation, more representative images have been provided in the revised Figure S2C.

      (5) Figure 3D-G. Images are not representative of the quantification results.

      Thank you. The images have been replaced with more representative ones in the revised Figure 3D and Figure 3F.

      (6) Line 220. Data for IRF8 are not shown in the figure to support this claim.

      Thank you for pointing this out. The expression level of IRF8 has been included in Figure S5C.

      (7) Figure 6F. AAV-AFF3ir-ORF2 panel order inverted.

      Thanks. Corrected. 

      (8) Line 401. Typo: "hat" instead of "h at".

      Sorry for the typo. Corrected.

      Reviewer #3 (Recommendations for the authors):

      Minor comments:

      (1) The rationale for the following sentence (lines 126-128) is lacking: "Moreover, we observed the expression of AFF3ir-ORF2 in longitudinal sections of the mouse aorta (B. Li et al., 2019)".

      Thanks. The rationale for these experiments has been included in the revised manuscript on Lines 127-129.

      (2) The source of antibodies against AFF3ir-ORF1 and AFF3ir-ORF2 used in western blot and immunostaining experiments was not mentioned in the manuscript.

      Thanks. The antibody information has been included in the Methods section on Lines 456-457 and 510-511.

      (3) The rationale and data interpretation are not clear for the following sentence (lines 220-221): "In addition, neither IRF5 nor IRF8 expression was regulated by AFF3ir-ORF2 (Figure 4F)".

      Thank you for pointing this out. The expression level of IRF8 has been included in Figure S5C. The sentence has been modified accordingly on Lines 253-254.

      (4) The quality of AFF3ir-ORF2 blot in Figure 4I needs improvement.

      Thanks. More representative images have been included in Figure 4I.

      (5) It appears that AFF3ir-ORF2 was present in both cytoplasm and nucleus. Does AFF3ir-ORF2 have a nuclear entry peptide? Also, the nuclear entry of AFF3ir-ORF2 can be enhanced by an immunofluorescence staining experiment.

      Thank you for your insightful comments. Indeed, although we did not observe any significant subcellular changes in the localization of AFF3ir-ORF2 under shear stress conditions, our immunostaining results revealed that AFF3ir-ORF2 is localized in both the cytoplasm and nucleus. To explore whether AFF3ir-ORF2 contains nuclear localization signals, we utilized the NLStradamus tool (http://www.moseslab.csb.utoronto.ca/NLStradamus/) to analyze its sequence. The prediction indicated that AFF3ir-ORF2 lacks a nuclear localization signal.

    1. eLife Assessment

      Shihabeddin et al. utilized single-cell RNA-Seq analysis of adult P23H zebrafish animals to identify transcription factors (e2fs, Prdm1a, Sp1) expressed selectively in neural progenitors and immature rods, and validated their necessity for regeneration using morphant analysis. The finding is useful, and the evidence is convincing. Deeper mechanistic analysis could further strengthen the current work by (1) distinguishing developmental vs regenerative transcriptional factors, (2) the addition of matched scATAC-Seq data, and (3) integration with single-cell multiome data from developing retina.

    2. Reviewer #1 (Public review):

      Summary:

      Shihabeddin et al. used bioinformatic and molecular biology tools to study the unique regeneration of rod photoreceptors in a zebrafish model. The authors identified a few transcription factors that seem to play an important role in this process.

      Strengths:

      This manuscript is well prepared. The topic of this study is an interesting and important one. Bioinformatics clues are interesting.

      Weaknesses:

      Considering the importance of the mechanism, the knockdown experiments require further validation. The authors over-emphasized this study's relevance to RP disease (i.e. patients and mammals are not capable of regeneration like zebrafish). They under-explained this regeneration's relevance or difference to normal developmental process, which is pretty much conserved in evolution.

    3. Reviewer #2 (Public review):

      This is an interesting and important work from Shihabeddin et al. to identify master regulators for rod photoreceptor regeneration in a zebrafish model of Retinitis Pigmentosa. Building on their scRNA-seq data, Shihabeddin et al. dissected the progenitor cell types and performed trajectory analyses to predict transcription factors that apparently drive the progenitor proliferation and differentiation into rod photoreceptors. Their analyses predicted e2f1, e2f2, and e2f3 as critical drivers of progenitor proliferation, Prdm1a as a driver of rod photoreceptor differentiation, and SP1 as a driver of rod photoreceptor maturation. Genetic experiments provide clear support for the roles of e2fs in progenitor proliferation. It's also apparent from Figure 8 that prdm1 knockdown appears to cause a decrease in rhodopsin expression. By colocalizing BrdU and Retp1, the authors inferred that the apparent "new rods" (which exhibit mixed BrdU and Retp1 signal) are decreased with prdm1 knockdown, providing further support. Overall I found the work to be interesting, rigorous, and informative for the community.

      I have a few suggestions for the authors to consider:

      (1) Perhaps the authors can consider explaining why the Prdm1a knock-down cells would have a higher Retp1 signal per cell in Fig 9B. Is this a representative picture? This appears to contradict Figure 8's conclusion, although I could tell that the number of Retp1+ cells in the ONL appears to be lower.

      (2) The authors noted "Surprisingly, the knockdown of prdm1a resulted in a significantly higher number of rhodopsin-positive cells in the INL (p=0.0293)", while it appears in Figure 9B, 9C that the difference is 2 cells vs 0 in a rightly broader field. It seems to be too strong of a statement for this effect.

      (3) It appears to this reviewer that the proteomic data didn't reveal much in line with the overall hypothesis or the mechanism, and it's unclear why the authors went for proteomics rather than bulk RNA-seq or ChIP-seq for a transcription factor knock-down experiment. Overall this is a minor point.

    4. Reviewer #3 (Public review):

      Summary:

      This study uses single-cell RNA-Seq to globally profile changes in gene expression in adult P23H transgenic zebrafish, which show progressive rod photoreceptor degeneration, along with age-matched controls. As expected, mitotically active retinal progenitors are identified in both conditions, and increased numbers of both progenitors and immature rods are observed. DrivAER-mediated gene regulatory network analysis in retinal progenitors, photoreceptor precursors, and mature rod photoreceptors respectively identified e2f1-3, prdm1a, and sp1 as top predicted transcriptional regulators of gene expression specific to these cell types. Finally, morpholino-mediated knockdown of these transcription factors led to expected defects in proliferation and rod differentiation.

      Strengths:

      Overall, this is a rigorous study that is convincingly executed and well-written. The data presented here will be a useful addition to existing single-cell RNA-Seq datasets obtained from regenerating zebrafish retina.

      Weaknesses:

      Multiple similar studies have been published and it is something of a missed opportunity in terms of identifying novel mechanisms of rod photoreceptor regeneration. Several other recent studies have used both single-cell RNA and ATAC-Seq to analyze gene regulatory networks that regulate neurogenesis in zebrafish retina following acute photoreceptor damage (Hoang, et al. 2020; Celloto, et al. 2023; Lyu, et al. 2023; Veen, et al 2023) or in other genetic models of progressive photoreceptor dystrophy such as cep290 mutants (Fogerty, et al. 2022).

      The gene regulatory network analysis here would also benefit from the addition of matched scATAC-Seq data, which would allow the use of more powerful tools such as Scenic+ (Bravo and de Winter, et al. 2023). It would also benefit from integration with single-cell multiome data from developing retinas (Lyu, et al. 2023). The genes selected for functional analysis here are all either robustly expressed in retinal progenitor cells (e2f1-3 and aurka) or in developing rods (prdm1a), so it is not really surprising that defects are observed. Identification of factors that selectively regulate rod photoreceptor regeneration, rather than those that regulate both development and regeneration, would provide additional novelty. This would also potentially allow the use of animal mutants for candidate genes, rather than exclusively relying on morphant analysis, which may have off-target effects.

      The description of the time points analyzed is vague, stating only that "fish from 6 to 12 months of age were analyzed". Since photoreceptor degeneration is progressive, it is unclear how progenitor behavior changes over time, or how the gene expression profile of other cell types such as microglia, cones, or surviving rods is altered by disease progression. Most similar studies address this by analyzing multiple time points from specific ages or times post-injury.

    5. Author response:

      Reviewer 1: “The authors over-emphasized this study's relevance to RP disease (i.e. patients and mammals are not capable of regeneration like zebrafish).”

      It is true that humans and other mammals are not capable of regeneration. This is why we and many other groups study zebrafish to identify mechanisms of regeneration that successfully form new rods. That said, our previous paper on the molecular basis of retinal remodeling in this zebrafish model system (Santhanam et al., 2023; Cell Mol Life Sci. 2023;80(12):362) revealed remarkable similarities in the stress and physiological responses of rods, cones, RPE and inner retinal neurons to those in mammalian RP models. Thus, we believe this zebrafish is an adequate model of RP and an excellent model to study rod regeneration.

      Reviewer 1: “They under-explained this regeneration's relevance or difference to normal developmental process, which is pretty much conserved in evolution.”  and:

      Reviewer 3: “It would also benefit from integration with single-cell multiome data from developing retinas (Lyu, et al. 2023).”

      It is an excellent suggestion to compare the regenerative response we have studied in a chronic degeneration/regeneration model to the trajectory of developmental rod formation. In Lyu, et al. 2023, it was found that while retinal regeneration has similarities to retinal development, it does not precisely recapitulate the same transcription factors and processes. Any differences between this trajectory and that revealed in developmental studies would be enlightening. We intend to do such analyses to add to a revised manuscript in the future.

      Reviewer 2: “Perhaps the authors can consider explaining why the Prdm1a knock-down cells would have a higher Retp1 signal per cell in Fig 9B. Is this a representative picture? This appears to contradict Figure 8's conclusion, although I could tell that the number of Retp1+ cells in the ONL appears to be lower.”

      These are different experimental paradigms.  Figure 8 shows knockdown 48 hours after injection, at which time prdm1a knockdown is affecting rhodopsin expression directly.  That experiment investigated whether prdm1a knockdown affected progenitor proliferation.  Figure 9 shows a time point 6 days after injection, at which time we were asking if prdm1a knockdown affected differentiation of progenitors into rods. 

      Reviewer 2: “The authors noted "Surprisingly, the knockdown of prdm1a resulted in a significantly higher number of rhodopsin-positive cells in the INL (p=0.0293)", while it appears in Figure 9B, 9C that the difference is 2 cells vs 0 in a rightly broader field. It seems to be too strong of a statement for this effect.”

      This was a very unexpected finding.  We included statistics (Figure 9D) to support the finding, so we don’t think it is too strong a statement to make.  Speculation as to what might cause this is fascinating.  Are Muller cells producing progenitors that fail to migrate to the ONL before differentiating into rods?  The lack of BrdU labeling does not support this idea.  Do neurogenic progenitor cells in the INL differentiate towards rods via a pathway that does not require prdm1a?  Perhaps.  Perhaps there are other explanations.

      Reviewer 2: “It appears to this reviewer that the proteomic data didn't reveal much in line with the overall hypothesis or the mechanism, and it's unclear why the authors went for proteomics rather than bulk RNA-seq or ChIP-seq for a transcription factor knock-down experiment. Overall this is a minor point.”

      We agree that bulk RNA sequencing would provide a similar answer, possibly with greater sensitivity.  We chose proteomics for two reasons: 1) We wanted an independent assessment of the knockdown effects that could evaluate whether the knockdowns worked and what pathways were affected.  Since our pathway comparison is to single cell RNAseq data, bulk RNA seq did not seem to be fully independent. 2) Because we used translation-blocking antisense oligos for most knockdown experiments, we did not expect the transcript abundance of the targeted gene to be affected, although these oligos can lead to target transcript degradation.  Thus, we were not likely to be able to validate that our knockdown worked with this technique. 

      Reviewer 3: “The gene regulatory network analysis here would also benefit from the addition of matched scATAC-Seq data, …”

      This is certainly true, and the reviewer points to several studies that have made excellent use of this strategy.  Given the 1-2 year timeline to obtain and analyze such data, it is unlikely that we will be able to incorporate such data in our revised manuscript, but we hope to do so for follow-up studies.

      Reviewer 3: “The description of the time points analyzed is vague, stating only that "fish from 6 to 12 months of age were analyzed". Since photoreceptor degeneration is progressive, it is unclear how progenitor behavior changes over time, or how the gene expression profile of other cell types such as microglia, cones, or surviving rods is altered by disease progression.”

      We have shown in a previous study (Santhanam et al. Cells. 2020;9(10)) that rod degeneration and regeneration are in a steady state from at least 4 to 8 months of age, and in other experiments in the lab at least to 12 months of age.  In this age range, regeneration keeps up with the pace of degeneration, both of which are very fast.  This encompasses the cell types that we specifically study in this manuscript.  The reviewer is right that other cell types could undergo changes.  This is a separate topic of study in the lab.

    1. eLife Assessment

      The authors provide valuable insights into the candidate upstream transcriptional regulatory factors that control the spatiotemporal expression of selector genes and their targets for GABAergic vs glutamatergic neuron fate in the anterior brainstem. The computational analysis of single-cell RNA-seq and single-cell ATAC-seq datasets to predict TF binding combined with cut and tag-seq to find TF binding represents a solid approach to support the findings in the study, although the display and discussion of the datasets could be strengthened. This study will be of interest to neurobiologists who study transcriptional mechanisms of neuronal differentiation.

    2. Reviewer #1 (Public review):

      Summary:

      The objective of this research is to understand how the expression of key selector transcription factors, Tal1, Gata2, Gata3, involved in GABAergic vs glutamatergic neuron fate from a single anterior hindbrain progenitor domain is transcriptionally controlled. With suitable scRNAseq, scATAC-seq, CUT&TAG, and footprinting datasets, the authors use an extensive set of computational approaches to identify putative regulatory elements and upstream transcription factors that may control selector TF expression. This data-rich study will be a valuable resource for future hypothesis testing, through perturbation approaches, of the many putative regulators identified in the study. The data are displayed in some of the main and supplemental figures in a way that makes it difficult to appreciate and understand the authors' presentation and interpretation of the data in the Results narrative. Primary images used for studying the timing and coexpression of putative upstream regulators, Insm1, E2f1, Ebf1, and Tead2 with Tal1 are difficult to interpret and do not convincingly support the authors' conclusions. There appears to be little overlap in the fluorescent labeling, and it is not clear whether the signals are located in the cell soma nucleus.

      Strengths:

      The main strength is that it is a data-rich compilation of putative upstream regulators of selector TFs that control GABAergic vs glutamatergic neuron fates in the brainstem. This resource now enables future perturbation-based hypothesis testing of the gene regulatory networks that help to build brain circuitry.

      Weaknesses:

      Some of the findings could be better displayed and discussed.

    3. Reviewer #2 (Public review):

      Summary:

      In the manuscript, the authors seek to discover putative gene regulatory interactions underlying the lineage bifurcation process of neural progenitor cells in the embryonic mouse anterior brainstem into GABAergic and glutamatergic neuronal subtypes. The authors analyze single-cell RNA-seq and single-cell ATAC-seq datasets derived from the ventral rhombomere 1 of embryonic mouse brainstems to annotate cell types and make predictions of where TFs bind upstream and downstream of the effector TFs using computational methods. They add data on the genomic distributions of some of the key transcription factors and layer these onto the single-cell data to get a sense of the transcriptional dynamics.

      Strengths:

      The authors use a well-defined fate decision point from brainstem progenitors that can make two very different kinds of neurons. They already know the key TFs for selecting the neuronal type from genetic studies, so they focus their gene regulatory analysis squarely on the mechanisms that are immediately upstream and downstream of these key factors. The authors use a combination of single-cell and bulk sequencing data, prediction and validation, and computation.

      Weaknesses:

      The study generates a lot of data about transcription factor binding sites, both predicted and validated, but the data are substantially descriptive. It remains challenging to understand how the integration of all these different TFs works together to switch terminal programs on and off.

    4. Author response:

      Reviewer #1 (Public review):

      Summary:

      The objective of this research is to understand how the expression of key selector transcription factors, Tal1, Gata2, Gata3, involved in GABAergic vs glutamatergic neuron fate from a single anterior hindbrain progenitor domain is transcriptionally controlled. With suitable scRNAseq, scATAC-seq, CUT&TAG, and footprinting datasets, the authors use an extensive set of computational approaches to identify putative regulatory elements and upstream transcription factors that may control selector TF expression. This data-rich study will be a valuable resource for future hypothesis testing, through perturbation approaches, of the many putative regulators identified in the study. The data are displayed in some of the main and supplemental figures in a way that makes it difficult to appreciate and understand the authors' presentation and interpretation of the data in the Results narrative. Primary images used for studying the timing and coexpression of putative upstream regulators, Insm1, E2f1, Ebf1, and Tead2 with Tal1 are difficult to interpret and do not convincingly support the authors' conclusions. There appears to be little overlap in the fluorescent labeling, and it is not clear whether the signals are located in the cell soma nucleus.

      Strengths:

      The main strength is that it is a data-rich compilation of putative upstream regulators of selector TFs that control GABAergic vs glutamatergic neuron fates in the brainstem. This resource now enables future perturbation-based hypothesis testing of the gene regulatory networks that help to build brain circuitry.

      We thank Reviewer #1 for the thoughtful assessment and recognition of the extensive datasets and computational approaches employed in our study. We appreciate the acknowledgment that our efforts in compiling data-rich resources for identifying putative regulators of key selector transcription factors (TFs)—Tal1, Gata2, and Gata3—are valuable for future hypothesis-driven research.

      Weaknesses:

      Some of the findings could be better displayed and discussed.

      We acknowledge the concerns raised regarding the clarity and interpretability of certain figures, particularly those related to expression analyses of candidate upstream regulators such as Insm1, E2f1, Ebf1, and Tead2 in relation to Tal1. We agree that clearer visualization and improved annotation of fluorescence signals are crucial to accurately support our conclusions. In our revised manuscript, we will enhance image clarity and clearly indicate sites of co-expression for Tal1 and its putative regulators, ensuring the results are more readily interpretable. Additionally, we will expand explanatory narratives within the figure legends to better align the figures with the results section.

      Reviewer #2 (Public review):

      Summary:

      In the manuscript, the authors seek to discover putative gene regulatory interactions underlying the lineage bifurcation process of neural progenitor cells in the embryonic mouse anterior brainstem into GABAergic and glutamatergic neuronal subtypes. The authors analyze single-cell RNA-seq and single-cell ATAC-seq datasets derived from the ventral rhombomere 1 of embryonic mouse brainstems to annotate cell types and make predictions of where TFs bind upstream and downstream of the effector TFs using computational methods. They add data on the genomic distributions of some of the key transcription factors and layer these onto the single-cell data to get a sense of the transcriptional dynamics.

      Strengths:

      The authors use a well-defined fate decision point from brainstem progenitors that can make two very different kinds of neurons. They already know the key TFs for selecting the neuronal type from genetic studies, so they focus their gene regulatory analysis squarely on the mechanisms that are immediately upstream and downstream of these key factors. The authors use a combination of single-cell and bulk sequencing data, prediction and validation, and computation.

      We also appreciate the thoughtful comments from Reviewer #2, highlighting the strengths of our approach in elucidating gene regulatory interactions that govern neuronal fate decisions in the embryonic mouse brainstem. We are pleased that our focus on a critical cell-fate decision point and the integration of diverse data modalities, combined with computational analyses, has been recognized as a key strength.

      Weaknesses:

      The study generates a lot of data about transcription factor binding sites, both predicted and validated, but the data are substantially descriptive. It remains challenging to understand how the integration of all these different TFs works together to switch terminal programs on and off.

      Reviewer #2 correctly points out that while our study provides extensive data on predicted and validated transcription factor binding sites, clearly illustrating how these factors collectively interact to regulate terminal neuronal differentiation programs remains challenging. We acknowledge the inherently descriptive nature of the current interpretation of our combined datasets.

      In our revision, we will clarify how the different data types support and corroborate one another, highlighting what we consider the most reliable observations of TF activity. Additionally, we will revise the discussion to address the challenges associated with interpreting the highly complex networks of interactions within the gene regulatory landscape.

      We sincerely thank both reviewers for their constructive feedback, which we believe will significantly enhance the quality and accessibility of our manuscript.

    1. eLife Assessment

      The study presents a valuable finding on the role of cholesterol-binding sites on GLP-1 receptors, although the clinical ramifications are unclear and not eminent at this point. Based on the detailed and persuasive responses provided by the authors to the concerns raised by reviewers, the revised manuscript is substantially improved and is convincing enough in its scientific merit. The study is a good addition to the scientific community working on receptor biology and drug development for GLP-1R.

    2. Reviewer #1 (Public review):

      Summary:

      The authors demonstrate impairments induced by a high cholesterol diet on GLP-1R dependent glucoregulation in vivo as well as an improvement after reduction in cholesterol synthesis with simvastatin in pancreatic islets. They also map sites of cholesterol high occupancy and residence time on active versus inactive GLP-1Rs using coarse-grained molecular dynamics (cgMD) simulations, and screened for key residues selected from these sites and performed detailed analyses of the effects of mutating one of these residues, Val229, to alanine on GLP-1R interactions with cholesterol, plasma membrane behaviour, clustering, trafficking and signalling in pancreatic beta cells and primary islets, and describe an improved insulin secretion profile for the V229A mutant receptor.

      These are extensive and very impressive studies indeed. I am impressed with the tireless effort exerted to understand the details of molecular mechanisms involved in the effects of cholesterol for GLP-1 activation of its receptor. In general, the study is convincing, the manuscript well written and the data well presented. Some of the changes are small and insignificant, which makes one wonder how important the observations are. For instance, Figure 2E (which is difficult to interpret anyway because the data are presented in per cent, conveniently hiding the absolute results) does not show a significant result of the cyclodextrin except for insignificant increases in basal secretion. That is not identical to impairment of GLP-1 receptor signaling!

      To me the most important experiment of them all is the simvastatin experiment, but the results rest on very few numbers and there is a large variation. Apparently, in a previous study using more extensive reduction in cholesterol the opposite response was detected casting doubt on the significance of the current observation. I agree with the authors that the use of cyclodextrin may have been associated with other changes in plasma membrane structure than cholesterol depletion at the GLP-1 receptor. The entire discussion regarding the importance of cholesterol would benefit tremendously from studies of GLP-1 induced insulin secretion in people with different cholesterol levels before and after treatment with cholesterol-lowering agents. I suspect that such a study would not reveal major differences.

      Comments on revisions: The authors have responded well to my criticism.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors provide a proof of concept that they can identify and mutate a cholesterol-binding site of a high-interest class B receptor, the GLP-1R, and functionally characterize the impact of this mutation on receptor behavior in the membrane and downstream signaling, with the intent that similar methods can be useful to optimize small molecules that, as ligands or allosteric modulators of GLP-1R, can improve the therapeutic tools targeting this signaling system.

      Strengths:

      The majority of results on receptor behavior are elucidated in INS-1 cells expressing the wt or mutant GLP-1R, with one experiment translating the findings to primary mouse beta-cells. I think this paper lays a very strong foundation to characterize this mutation and does a good job discussing how complex cholesterol-receptor interactions can be (i.e., lower cholesterol binding to V229A GLP-1R, yet increased segregation to lipid rafts). Table 1 and Figure 9 are very beneficial to summarize the findings. The lower interaction with cholesterol and lower membrane diffusion in V229A GLP-1R resembles the reduced diffusion of wt GLP-1R with simv-induced cholesterol reductions, by presumably decreasing the cholesterol available to interact with wt GLP-1R. The effects of this mutation are not due to differences in Ex-4:receptor affinity. I think this paper will be of interest to many physiologists who may not be familiar with many of the techniques used in this paper and the authors largely do a good job explaining the goals of using each method in the results section. While not necessary for this paper, a comparison of islet cholesterol content after this cholesterol diet vs the more typical 60% HFD used in obesity research would be beneficial for GLP-1 physiology research broadly to take these findings into consideration with model choice.

      Weaknesses:

      There are no obvious weaknesses in this manuscript and overall, I believe the authors achieved their aims and have demonstrated the importance of cholesterol interactions on GLP-1R functioning in beta-cells.

      Certainly many follow-up experiments are possible from these initial findings and of primary interest is how this mutation affects insulin homeostasis in vivo under different physiological conditions. One of the biggest pathologies in insulin homeostasis in obesity/t2d is an elevation of baseline insulin release (as modeled in Fig 1E) that renders the fold-change in glucose stimulated insulin levels lower and physiologically less effective. Future work by the authors may determine the effects of the GLP-1R V229A mutation on insulin secretion responses under diet-induced metabolic stress conditions. Furthermore, the authors may additionally investigate if V229A would have the same impact in a different cell type, especially in neurons, with implications in the regulation of satiation, gut motility, and especially nausea, which are of high translational interest.

      The comparison is drawn in the discussion between this mutation and ex4-phe1 to have biased agonism towards Gs over beta-arrestin signaling. Ex4-phe1 lowered pica behavior (a proxy for nausea) in the authors' previously co-authored paper on ex4-phe1 (PMID 29686402), and drawing a parallel for this mutation or modification of cholesterol binding to potentially mitigate nausea is a novel direction.