  1. Jul 2024
    1. Reviewer #3 (Public Review):

      The authors used an open EEG dataset of observers viewing real-world objects. Each object had a real-world size value (from human rankings), a retinal size value (measured from each image), and a scene depth value (inferred from the above). The authors combined the EEG and object measurements with extant, pre-trained models (a deep convolutional neural network, a multimodal ANN, and Word2vec) to assess the time course of processing object size (retinal and real-world) and depth. They found that depth was processed first, followed by retinal size, and then real-world size. The depth time course roughly corresponded to the visual ANNs, while the real-world size time course roughly corresponded to the more semantic models.

      The time course result for the three object attributes is very clear and a novel contribution to the literature. However, the motivations for the ANNs could be better developed, the manuscript could better link to existing theories and literature, and the ANN analysis could be modernized. I have some suggestions for improving specific methods.

      (1) Manuscript motivations

      The authors motivate the paper in several places by asking "whether biological and artificial systems represent object real-world size". This seems odd for a couple of reasons. First, the brain must represent real-world size somehow, given that we can reason about this question. Second, given the large behavioral and fMRI literature on the topic, combined with the growing ANN literature, this seems like a foregone conclusion and undermines the novelty of this contribution.

      While the introduction further promises to "also investigate possible mechanisms of object real-world size representations", I was left wishing for more in this department. The authors report correlations between neural activity and object attributes, as well as between neural activity and ANNs. It would be nice to link the results to theories of object processing (e.g., a feedforward sweep, such as DiCarlo and colleagues have suggested, versus a reverse hierarchy, such as suggested by Hochstein, among others). What is semantic about real-world size, and where might this information come from? (You may have to expand beyond the posterior electrodes to do this analysis.)

      Finally, several places in the manuscript tout the "novel computational approach". This seems odd because the computational framework and pipeline have been the most common approach in cognitive computational neuroscience in the past 5-10 years.

      (2) Suggestion: modernize the approach

      I was surprised that the computational models used in this manuscript were all 8-10 years old. Specifically, because there are now deep nets that more explicitly model the human brain (e.g., CORnet) as well as more sophisticated models of semantics (e.g., LLMs), I was left hoping that the authors had used more state-of-the-art models in the work. Moreover, the use of a single dCNN, a single multimodal model, and a single word embedding model makes it difficult to generalize about visual, multimodal, and semantic features in general.

      (3) Methodological considerations

      a) Validity of the real-world size measurement

      I was concerned about a few aspects of the real-world size rankings. First, I am trying to understand why the scale goes from 100-519. This seems very arbitrary; please clarify. Second, are we to assume that this scale is linear? Is this appropriate when real-world object size is best expressed on a log scale? Third, the authors provide "sand" as an example of the smallest real-world object. This is tricky because sand is more "stuff" than "thing", so I imagine it leaves observers wondering whether the experimenter intends a grain of sand or a sandy scene region. What is the variability in real-world size ratings? Might the variability also provide additional insights in this experiment?

      b) This work has no noise ceiling to establish how strong the model fits are relative to the intrinsic noise of the data. I strongly suggest that one be included.
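One common way to obtain the noise ceiling requested in point (3b) is a leave-one-subject-out estimate over per-subject representational dissimilarity matrices (RDMs). The sketch below is only an illustration under stated assumptions: the function names `pearson` and `noise_ceiling`, the toy RDM values, and the choice of Pearson correlation are all hypothetical and are not taken from the manuscript under review.

```python
from statistics import fmean

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def noise_ceiling(subject_rdms):
    """Return (lower, upper) noise-ceiling bounds.

    subject_rdms: one vectorized RDM per subject (hypothetical layout).
    Upper bound: each subject vs. the mean of all subjects.
    Lower bound: each subject vs. the mean of the other subjects.
    """
    grand_mean = [fmean(col) for col in zip(*subject_rdms)]
    lower, upper = [], []
    for i, rdm in enumerate(subject_rdms):
        others = [r for j, r in enumerate(subject_rdms) if j != i]
        loo_mean = [fmean(col) for col in zip(*others)]
        upper.append(pearson(rdm, grand_mean))
        lower.append(pearson(rdm, loo_mean))
    return fmean(lower), fmean(upper)

# Toy data: three subjects, four RDM entries each (made-up values).
rdms = [
    [0.1, 0.5, 0.9, 0.4],
    [0.2, 0.4, 0.8, 0.5],
    [0.0, 0.6, 1.0, 0.3],
]
lower, upper = noise_ceiling(rdms)
print(f"noise ceiling: lower={lower:.3f}, upper={upper:.3f}")
```

A model-to-EEG correlation that falls inside this band would explain about as much variance as the data allow. In a time-resolved analysis the bounds would be recomputed at every time point.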

    1. eLife assessment

      Some delayed rectifier currents in neurons are formed by the combination of Kv2 and silent subunits, KvS. However, we lack the tools to identify these heteromeric channels in vivo. In this valuable study by the Sack group, the authors identify a pharmacological tool that can reveal the presence of KvS subunits as components of the delayed rectifier potassium currents in selected neurons. The experimental evidence presented in the manuscript is compelling and represents an advance that should be of interest to a wide community of neuroscientists and channel physiologists.

    2. Reviewer #1 (Public Review):

      Summary:

      Kv2 subfamily potassium channels contribute to delayed rectifier currents in virtually all mammalian neurons and are encoded by two distinct types of subunits: Kv2 alpha subunits that have the capacity to form homomeric channels (Kv2.1 and Kv2.2), and KvS or silent subunits (Kv5, 6, 8, 9) that can assemble with Kv2.1 or Kv2.2 to form heteromeric channels with novel biophysical properties. Many neurons express both types of subunits and therefore have the capacity to make both homomeric Kv2 channels and heteromeric Kv2/KvS channels. Determining the contributions of each of these channel types to native potassium currents has been very difficult because the differences in biophysical properties are modest and there are no Kv2/KvS-specific pharmacological tools. The authors set out to design a strategy to separate Kv2 and Kv2/KvS currents in native neurons based on their observation that Kv2/KvS channels have little sensitivity to the Kv2 pore blocker RY785 but are blocked by the Kv2 VSD blocker GxTx. They clearly demonstrate that Kv2/KvS currents can be differentiated from Kv2 currents in native neurons using a two-step strategy to first selectively block Kv2 with RY785, and then block both with GxTx. The manuscript is beautifully written; it takes a very complex problem and strategy and breaks it down so both channel experts and the broad neuroscience community can understand it.
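The arithmetic behind the two-step strategy can be sketched as two subtractions. This is a simplified illustration, not the authors' analysis: it assumes RY785 removes all homomeric Kv2 current, that GxTx then removes all remaining Kv2/KvS current, and that the components add linearly. The function name `dissect_currents` and the example amplitudes are hypothetical.

```python
def dissect_currents(i_control, i_after_ry785, i_after_gxtx):
    """Split a delayed-rectifier amplitude into pharmacological components.

    All three arguments are current amplitudes in the same units,
    measured at the same voltage in the same cell (assumed).
    """
    kv2_like = i_control - i_after_ry785          # RY785-sensitive component
    kv2_kvs_like = i_after_ry785 - i_after_gxtx   # RY785-resistant, GxTx-sensitive
    total = kv2_like + kv2_kvs_like
    if total <= 0:
        raise ValueError("no GxTx-sensitive current to partition")
    return {
        "kv2_like": kv2_like,
        "kv2_kvs_like": kv2_kvs_like,
        "kvs_fraction": kv2_kvs_like / total,
    }

# Made-up amplitudes (e.g. in nA): control, after RY785, after RY785 + GxTx.
parts = dissect_currents(10.0, 4.0, 1.0)
print(parts)
```

Under these assumptions, `kvs_fraction` is the share of the GxTx-sensitive current carried by Kv2/KvS-like channels. The caveats raised by Reviewer #2 (off-target block, holding potential, inactivation) would all bias this simple estimate.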

      Strengths:

      The compounds the authors use are highly selective and unlikely to have significant confounding cross-reactivity to other channel types. The authors provide strong evidence that all Kv2/KvS channels are resistant to RY785. This is a strength of the strategy - it can likely identify Kv2/KvS channels containing any of the 10 mammalian KvS subunits and thus be used as a general reagent on all types of neurons. The limitation then of course is that it can't differentiate the subtypes, but at this stage, the field really just needs to know how much Kv2/KvS channels contribute to native currents and this strategy provides a sound way to do so.

      Weaknesses:

      The authors are very clear about the limitations of their strategy, the most important of which is that they can't differentiate different subunit combinations of Kv2/KvS heteromers. This study is meant to be a start to understanding the roles of Kv2/KvS channels in vivo. As such, this is a minor weakness, far outweighed by the potential of the strategy to move the field through a roadblock that has existed since its inception.

      The study accomplishes exactly what it set out to do: provide a means to determine the relative contributions of homomeric Kv2 and heteromeric Kv2/KvS channels to native delayed rectifier K+ currents in neurons. It also does a fabulous job laying out the case for why this is important to do.

    3. Reviewer #2 (Public Review):

      Summary:

      Silent Kv subunits and the channels containing these Kv subunits (Kv2/KvS heteromers) are in the process of discovery. It is believed that these channels fine-tune the voltage-activated K+ currents that repolarize the membrane potential during action potentials, with a direct effect on cell excitability, mostly by determining action potential firing frequency.

      Strengths:

      What makes silent Kv subunits even more important is that, by being expressed in specific tissues and cell types, different silent Kv subunits may be able to fine-tune the delayed rectifier voltage-activated K+ currents, which are among the currents that crucially determine excitability in these cells. The present manuscript introduces a pharmacological method to dissect the voltage-activated K+ currents mediated by Kv2/KvS heteromers as a means of starting to unveil their importance, together with Kv2-only channels, to the cells where they are expressed.

      Weaknesses:

      While the method is effective in quantifying these currents in any isolated cell under voltage clamp, it is ineffective as a modulating maneuver to address these currents in an in vivo experimental setting. This is an important point but is not a claim made by the authors. There are other caveats with the methods and data:

      (i) The need for a 'cocktail' of blockers to isolate the currents of Kv2 homomers and Kv2/KvS heteromers from others may introduce errors in the quantification of Kv2/KvS heteromer-mediated K+ currents, owing to possible off-target effects of the blockers.

      (ii) During the electrophysiology experiments, the authors use a holding potential that is not as negative as needed to record the full population of Kv2/KvS channels. Depolarized holding potentials lead to a certain level of channel inactivation, which varies according to the KvS present in that specific population of channels. As a reminder, some KvS promote inactivation and others prevent it. Therefore, the data must be interpreted with this in mind.

      (iii) The analysis of conductance activation by using tail currents is only accurate when dealing with non-inactivating conductances. Also, with a heterogeneous population of Kv2/KvS heteromers, heterogeneous K+ conductance deactivation kinetics are to be expected. Indeed, different KvS may be associated with significantly different deactivation kinetics.

      (iv) Silent Kv subunits may be retained in the ER in heterologous systems like CHO cells. This may lead to underestimating their expression in these systems. Nevertheless, the authors show similar data in CHO cells and in primary neurons.

      (v) The hallmark of silent Kv subunits is their effect on the time course of inactivation of K+ currents. As such, data should preferably be shown from this perspective throughout, but this was only done in Figure 4G.

      (vi) Functional characterization of currents alone, which the authors suggest as a bona fide signature of Kv2 and Kv2/KvS currents, should not be the sole basis for classifying the currents and their channel mediators.

    1. eLife assessment

      This work represents a new toolkit for implementing virtual reality experiments in head-fixed animals. It is a valuable contribution to the field and the evidence for its utility and performance is solid. Some minor improvements in the material presented - including clarifying design decisions and providing more details about design features - would improve the readability and thereby potentially increase its impact.

    2. Reviewer #2 (Public Review):

      Summary:

      The authors present behaviorMate, an open-source behavior recording and control system including a central GUI and compatible treadmill and display components. Notably, the system utilizes the "Intranet of things" scheme and the components communicate through a local network, making the system modular, which in turn allows users to easily configure the setup to suit their experimental needs. Overall, behaviorMate is a valuable resource for researchers performing head-fixed imaging studies, as the commercial alternatives are often expensive and difficult to modify.

      Strengths and Weaknesses:

      The manuscript presents two major utilities of behaviorMate: (1) as an open-source alternative to commercial behavior apparatus for head-fixed imaging studies, and (2) as a set of generic schema and communication protocols that allows the users to incorporate arbitrary recording and stimulation devices during a head-fixed imaging experiment. I found the first point well-supported and demonstrated in the manuscript. Indeed, the documentation, BOM, CAD files, circuit design, source, and compiled software, along with the manuscript, create an invaluable resource for neuroscience researchers looking to set up a budget-friendly VR and head-fixed imaging rig. Some features of behaviorMate, including the computer vision-based calibration of the treadmill, and the decentralized, Android-based display devices, are very innovative approaches and can be quite useful in practical settings. However, regarding the second point, my concern is that there is not adequate documentation and design flexibility to allow the users to incorporate arbitrary hardware into the system. In particular:

      (1) The central controlling logic is coupled with GUI and an event loop, without a documented plugin system. It's not clear whether arbitrary code can be executed together with the GUI, hence it's not clear how much the functionality of the GUI can be easily extended without substantial change to the source code of the GUI. For example, if the user wants to perform custom real-time analysis on the behavior data (potentially for closed-loop stimulation), it's not clear how to easily incorporate the analysis into the main GUI/control program.

      (2) The JSON messaging protocol lacks API documentation. The exact syntax, the supported key/value pairs, and the expected response or behavior for each JSON message are not specified. Hence, it's not clear how to develop new hardware that can communicate with the behaviorMate system.

      (3) It seems the existing control hardware and the JSON messaging only support GPIO/TTL types of input/output, which limits the applicability of the system to more complicated sensor/controller hardware. The authors mentioned that hardware like Arduino natively supports serial protocols like I2C or SPI, but it's not clear how they are handled and translated to JSON messages.

      Additionally, because it's unclear how easy it is to incorporate arbitrary hardware with behaviorMate, the "Intranet of things" approach seems to lose its appeal. Since the manuscript currently focuses mainly on a specific set of hardware designed for a specific type of experiment, it's not clear what the advantages are of implementing communication over a local network as opposed to the typical connections using USB.

      In summary, the manuscript presents a well-developed open-source system for head-fixed imaging experiments with innovative features. The project is a very valuable resource to the neuroscience community. However, some claims in the manuscript regarding the extensibility of the system and protocol may require further development and demonstration.

    3. Reviewer #3 (Public Review):

      In this work, the authors present an open-source system called behaviorMate for acquiring data related to animal behavior. The temporal alignment of recorded parameters across various devices is highlighted as crucial to avoid delays caused by electronics dependencies. This system not only addresses this issue but also offers an adaptable solution for VR setups. Given the significance of well-designed open-source platforms, this paper holds importance.

      Advantages of behaviorMate:

      The cost-effectiveness of the system provided.

      The reliability of PCBs compared to custom-made systems.

      Open-source nature for easy setup.

      Plug & Play feature requiring no coding experience for optimizing experiment performance (only text-based JSON files, 'context List' required for editing).

      Points to clarify:

      While using UDP for data transmission can enhance speed, UDP itself does not guarantee delivery. Are there error-checking mechanisms in place to ensure reliable communication, given that reliability is as critical as speed here?
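To make the UDP question concrete, here is a minimal sketch of one generic error-checking scheme: each JSON datagram carries a sequence number, and the sender resends until it sees a matching acknowledgement. Everything in it is an assumption made for illustration (the `seq`, `payload`, and `ack` keys, the `valve` message, the retry count, and the timeout); it is not behaviorMate's actual protocol, which the reviews describe as undocumented.

```python
import json
import socket
import threading

def send_reliable(sock, addr, payload, seq, retries=3, timeout=0.2):
    """Send one JSON datagram; resend until an ack with the same seq arrives."""
    packet = json.dumps({"seq": seq, "payload": payload}).encode()
    sock.settimeout(timeout)
    for _attempt in range(retries):
        sock.sendto(packet, addr)
        try:
            data, _sender = sock.recvfrom(1024)
        except socket.timeout:
            continue  # datagram or ack was lost: try again
        if json.loads(data).get("ack") == seq:
            return True
    return False

def ack_once(sock, received):
    """Receive one datagram, record it, and reply with an ack echoing seq."""
    data, addr = sock.recvfrom(1024)
    message = json.loads(data)
    received.append(message)
    sock.sendto(json.dumps({"ack": message["seq"]}).encode(), addr)

# Loopback demonstration with a hypothetical "valve" command.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))   # let the OS pick a free port
receiver.settimeout(2.0)
received = []
worker = threading.Thread(target=ack_once, args=(receiver, received))
worker.start()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
ok = send_reliable(sender, receiver.getsockname(), {"valve": "open"}, seq=1)
worker.join()
sender.close()
receiver.close()
print("delivered:", ok)
```

A real receiver would also have to discard duplicates by sequence number, since a lost ack causes the same command to arrive twice. Each retry adds latency, which is the trade-off against speed that the question above points to.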

      Considering this year's price policy changes in Unity, could this impact the system's operations?

      Also, does the Arduino offer sufficient precision for ephys recording, particularly with a 10 ms check?

      Could you clarify the purpose of the Sync Pulse? In line 291, it suggests additional cues (potentially represented by the Sync Pulse) are needed to align the treadmill screens, which appear to be directed towards the Real-Time computer. Given that event alignment occurs in the GPIO, the connection of the Sync Pulse to the Real-Time Controller in Figure 1 seems confusing. Additionally, why is there a separate circuit for the treadmill that connects to the UI computer instead of the GPIO? It might be beneficial to elaborate on the rationale behind this decision in line 260. Moreover, should scenarios involving pupil and body camera recordings connect to the Analog input in the PCB or the real-time computer for optimal data handling and processing?

      Given that all references, as far as I can see, come from the same lab, are there other labs capable of implementing this system at a similar optimal level?

    1. eLife assessment

      In their manuscript, Cummings et al. use in vitro reconstitution to examine the differential activities of tubulin polyglycylases, providing valuable insights into the enzymatic regulation of microtubule glycylation and its mechanistic role in maintaining cilia function and microtubule dynamics. The convincing evidence, supported by well-designed experiments and appropriate controls, significantly advances our understanding of the tubulin code and its biochemical mechanisms.

    2. Reviewer #1 (Public Review):

      Summary:

      In their current study, Cummings et al. have approached this fundamental biochemical problem using a combination of purified enzyme-substrate reactions, MS/MS, and microscopy in vitro to provide key insights into the hierarchy of generating polyglycylation in cilia and flagella. They first establish that TTLL8 is a monoglycylase, with the potential to add multiple monoglycine residues on both α- and β-tubulin. They then go on to establish that monoglycylation is essential for TTLL10 binding and catalytic activity, which progressively reduces as the level of polyglycylation increases. This provides an interesting mechanism of how the level of polyglycylation is regulated in the absence of a deglycylase. Finally, the authors also establish that for efficient TTLL10 activity, it is not just monoglycylation, but also polyglutamylation that is necessary, giving a key insight into how both these modifications interact with each other to ensure there is a balanced level of PTMs on the axonemes for efficient cilia function.

      Strengths:

      The manuscript is well-written, and the experiments are succinctly planned and outlined. The experiments support the authors' hypotheses and provide possible new mechanistic insights into the regulation of tubulin glycylation in motile cilia.

      Weaknesses:

      The initial part of the manuscript, where the authors discuss the requirement of monoglycylation by TTLL8, is not new. This was established when Rogowski et al. (2009) showed that polyglycylation of tubulin by TTLL10 occurs only when it is co-expressed in cells with TTLL3 or TTLL8. So, this part of the study adds very little new information to what was known.

      The study also fails to discuss the involvement of the other monoglycylase, TTLL3. This is a weakness because in cells both monoglycylases act in concert and so may play a role in regulating the activity of TTLL10.

    3. Reviewer #2 (Public Review):

      In their manuscript, Cummings et al. focus on the enzymatic activities of TTLL3, TTLL8, and TTLL10, which catalyze the glycylation of tubulin, a crucial posttranslational modification for cilia maintenance and motility. The experiments are beautifully performed, with meticulous attention to detail and the inclusion of appropriate controls, ensuring the reliability of the findings. The authors utilized in vitro reconstitution to demonstrate that TTLL8 functions exclusively as a glycyl initiase, adding monoglycines at multiple positions on both α- and β-tubulin tails. In contrast, TTLL10 acts solely as a tubulin glycyl elongase, extending existing glycine chains. A notable finding is the differential substrate recognition between TTLL glycylases and TTLL glutamylases, highlighting a broader substrate promiscuity in glycylases compared to the more selective glutamylases. This observation aligns with the greater diversification observed among glutamylases. The study reveals a hierarchical mechanism of enzyme recruitment to microtubules, where TTLL10 binding necessitates prior monoglycylation by TTLL8. This binding is progressively inhibited by increasing polyglycine chain length, suggesting a self-regulatory mechanism for polyglycine chain length control. Furthermore, TTLL10 recruitment is enhanced by TTLL6-mediated polyglutamylation, illustrating a complex interplay between different tubulin modifications. In addition, they uncover that polyglutamylation stimulates TTLL10 recruitment without necessarily increasing glycylation on the same tubulin dimer, due to the potential for TTLLs to interact with neighboring tubulin dimers. This mechanism could lead to an enrichment of glycylation on the same microtubule, contributing to the complexity of the tubulin code. The article also addresses a significant challenge in the field: the difficulty of generating microtubules with controlled posttranslational modifications for in vitro studies. 
      By identifying the specific modification sites and the interplay between TTLL activities, the authors provide a valuable tool for creating differentially glycylated microtubules. This advancement will facilitate further studies on the effects of glycylation on microtubule-associated proteins and the broader implications of the tubulin code. In summary, this study substantially contributes to our knowledge of posttranslational enzymes and their regulation, offering new insights into the biochemical mechanisms underlying microtubule modifications. The rigorous experimental approach and the novel findings presented make this a pivotal addition to the field of cellular and molecular biology.

    1. Author Response:

      Thank you for the reviews and the eLife assessment. We want to take this opportunity to acknowledge the weaknesses pointed out by the reviewers and we will make small changes to the manuscript to account for these as part of the Version of Record.

      The tools are command-based and store outcomes locally

      We consider this to be an advantage of our ecosystem, which is intended for individuals or small groups of authors. These features facilitate easy installation and integration with other tools. Further, our tool labelbuddy is a graphical user interface. Our tools may also be integrated into web-based systems as backends. Pubget is already being used in this way in the NeuroSynth Compose platform for semi-automated neuroimaging meta-analyses.

      pubget only gathers open-access papers from PubMed Central

      We recognize this as a limitation, and we acknowledge it in the original manuscript (in the discussion section, starting with "A limitation of Pubget is that it is restricted to the Open-Access subset of PMC"). We chose to limit the scope of our tools in order to ensure maintainability. Further, we are currently expanding pubget so it will also be able to access the abstracts and meta-data from closed-access papers indexed on PubMed. Future research could build other tools to work alongside pubget, to access other databases.

      Logic flow is difficult to follow

      We thank the reviewer for this feedback. Our paper describes an ecosystem of literature mining tools, which does not lend itself to narrative flow, nor does it readily fit into the standard "Intro, Results, Discussion, Methods" structure that is typical in the scientific literature. We have done our best to conform to this expected format, but we have also provided detailed section and subsection headings to enable the reader to digest the paper nonlinearly. Each of the tools we describe also has detailed documentation on GitHub that we update continuously.

      Results were not validated

      For the example where we automatically extracted participant demographics from papers, we validated the results on a held-out dataset of 100 manually-annotated papers. For the example with automatic large-scale meta-analyses (neuroquery and neurosynth), these methods are described together with their validation in the original papers. If this ecosystem of tools is integrated into other workflows, it should be validated in those contexts. We recognize that validating meta-analyses is a difficult problem because we do not have ground truth maps of the brain.

      Efficiency was not quantified

      Creators of tools do not always do experiments to quantify their efficiency and other qualities. We have chosen not to do this here, first because it is outside the scope of this paper, as it would require specifying very precise tasks and how efficiency is measured, and second because, at least for the data collection part, the benefit of using an automated tool over manually downloading papers one by one is clear even without quantifying it. Compared to the approach of re-using existing datasets, our ecosystem is not necessarily more or less efficient. But it has other advantages, such as providing datasets that contain the latest literature, whereas the existing datasets are static and quickly become out of date.

      We do not highlight the strength of AI functions

      We provide an example of using our tools to gather data and manually annotate a validation set for use with large language models (in our case, GPT). We are further exploring this domain in other projects; for example, for performing semi-automated meta-analyses using the NeuroSynth Compose platform. However, we did not deem it necessary to include more AI examples in the current paper; we only wanted to provide enough examples to demonstrate the scope of possible use cases of our ecosystem.

      We thank the reviewers for their time and valuable feedback, which we will keep in mind in our future research.

    1. eLife assessment

      This important work substantially advances our understanding of RNA structure analysis by introducing an innovative method that extends DMS probing to include guanosine residues, thereby enhancing our ability to detect complex tertiary interactions. The evidence supporting the conclusions is compelling, with detailed analyses demonstrating the method's capacity to differentiate structural contexts and improve RNA structure predictions. This work will be of broad interest to RNA structural biology, biochemistry, and biophysics researchers.

    2. Reviewer #1 (Public Review):

      Summary:

      DMS-MaP is a sequencing-based method for assessing RNA folding by detecting methyl adducts on unpaired A and C residues created by treatment with dimethylsulfate (DMS). DMS also creates methyl adducts on the N7 position of G, which could be sensitive to tertiary interactions with that atom, but N7-methyl adducts cannot be detected directly by sequencing. In this work, the authors adopt a previously developed method for converting N7-methyl-G to an abasic site to make it detectable by sequencing and then show that the ability of DMS to form an N7-methyl-G adduct is sensitive to RNA structural context. In particular, they look at the G-quadruplex structure motif, which is dense with N7-G interactions, is biologically important, and lacks conclusive methods for in-cell structural analysis.

      Strengths:

      - The authors clearly show that established methods for detecting N7-methyl-G adducts can be used to detect those adducts from DMS and that the formation of those adducts is sensitive to structural context, particularly G-quadruplexes.

      - The authors assess the N7-methyl-G signal through a wide range of useful probing analyses, including standard folding, adduct correlations, mutate-and-map, and single-read clustering.

      - The authors show encouraging preliminary results toward the detection of G-quadruplexes in cells using their method. Reliable detection of RNA G-quadruplexes in cells is a major limitation for the field and this result could lead to a significant advance.

      - Overall, the work shows convincingly that N7-methyl-G adducts from DMS provide valuable structural information and that established data analyses can be adapted to incorporate the information.

      Weaknesses:

      - Most of the validation work is done on the Spinach aptamer, and it is the only RNA tested that has a known 3D structure. Although it is a useful model for validating this method, it does not provide a comprehensive view of what results to expect across varied RNA structures.

      - It's not clear from this work what the predictive power of BASH-MaP would be when trying to identify G-quadruplexes in RNA sequences of unknown structure. Although clusters of G's with low reactivity and correlated mutations seem to be a strong signal for G-quadruplexes, no effort was made to test a range of G-rich sequences that are known to form G-quadruplexes or not. Having this information would be critical for assessing the ability of BASH-MaP to identify G-quadruplexes in cells.

      - Although the authors present interesting results from various types of analysis, they do not appear to have developed a mature analysis pipeline for the community to use. I would be inclined to develop my own pipeline if I were to use this method.

      - There are various aspects of the DAGGER analysis that don't make sense to me:

      (1) Folding of the RNA based on individual reads does not represent single-molecule folding since each read contains only a small fraction of the possible adducts that could have formed on that molecule. As a result, each fold will largely be driven by the naive folding algorithm. I recommend a method like DREEM that clusters reads into profiles representing different conformations.

      (2) How reliable is it to force open clusters of low-reactivity G's across RNAs that don't already have known G-quadruplexes?

      (3) By forcing a G-quadruplex open it will be treated as a loop by the folding algorithm, so the energetics won't be accurate.

      (4) It's not clear how signals on "normal" G's are treated. In Figure 5C some are wiped to 0 but others are kept as 1.

    3. Reviewer #2 (Public Review):

      Summary:

      The manuscript introduces BASH MaP and DAGGER, innovative tools for analyzing RNA tertiary structures, specifically focusing on G-quadruplexes. Traditional methods have struggled to detect and analyze these structures due to their reliance on interactions on the Hoogsteen face of guanine, which are not readily observable through conventional probing that targets Watson-Crick interactions. BASH MaP employs dimethyl sulfate and potassium borohydride to enhance the detection of N7-methylguanosine by converting it into an abasic site, thereby enabling its identification through misincorporation during reverse transcription. This method provides higher precision in identifying G-quadruplexes and offers deeper insights into RNA's structural dynamics and alternative conformations in both in vitro and cellular contexts. Overall, the study is well-executed, demonstrating robust signal detection of N7-Gs with some compelling positive controls, thorough analysis, and beautifully presented figures.

      Strengths:

      The manuscript introduces a new method to detect G-quadruplexes (G-qs) that simplifies and potentially enhances the robustness and quantification compared to previous methods relying on reverse transcription truncations. The authors provide a strong positive control, demonstrating a 70% misincorporation at endogenous N7-G within the 18S rRNA, which illustrates BASH MaP's high signal-to-noise ratio. The data concerning the detection of positive control G-qs is particularly compelling.

      Weaknesses:

      Figure 3E shows considerable variability in the correlations among guanosines, suggesting that the method may struggle with specificity in determining guanosine participation within and between different quadruplexes. There is no estimate of the method's false-positive discovery rate.

    4. Reviewer #3 (Public Review):

      Summary:

      In this study, the authors aim to develop an experimental/computational pipeline to assess the modification status of an RNA following treatment with dimethylsulfate (DMS). Building upon the more common DMS MaP method, which predominantly assesses the modification status of the Watson-Crick-Franklin face of A's and C's, the authors insert a chemical processing step in the workflow prior to deep sequencing that enables detection of methylation at the N7 position of guanosine residues. This approach, termed BASH MaP, provides a more complete assessment of the true modification status of an RNA following DMS treatment and this new information provides a powerful set of constraints for assessing the secondary structure and conformational state of an RNA. In developing this work, the authors use Spinach as a model RNA. Spinach is a fluorogenic RNA that binds and activates the fluorescence of a small molecule ligand. Crystal structures of this RNA with ligand bound show that it contains a G-quadruplex motif. In applying BASH MaP to Spinach, the authors also perform the more standard DMS MaP for comparison. They show that the BASH MaP workflow appears to retain the information yielded by DMS MaP while providing new information about guanosine modifications. In Spinach, the G-quadruplex G's have the least reactive N7 positions, consistent with the engagement of N7 in hydrogen bonding interactions at G's involved in quadruplex formation. Moreover, because the inclusion of data corresponding to G increases the number of misincorporations per transcript, BASH MaP is more amenable to analysis of co-occurring misincorporations through statistical analysis, especially in combination with site-specific mutations. These co-occurring misincorporations provide information regarding what nucleotides are structurally coupled within an RNA conformation.

      By deploying a likelihood-ratio statistical test on BASH MaP data, the authors can identify Gs in G-quadruplexes, deconvolute G-G correlation networks, base-triple interactions and even stacking interactions. Further, the authors develop a pipeline to use the BASH MaP-derived G-modification data to assist in the prediction of RNA secondary structure and identify alternative conformations adopted by a particular RNA. This seems to help with the prediction of secondary structure for Spinach RNA.

      Strengths:

      The BASH MaP procedure and downstream data analysis pipeline more fully capture the complement of methylations produced by DMS treatment of RNA, thereby enriching the information content. This in turn allows for more robust computational/statistical analysis, which likely will lead to more accurate structure predictions. This seems to be the case for the Spinach RNA.

      Weaknesses:

      The authors demonstrate that their method can detect G-quadruplexes in Spinach and some other RNAs both in vitro and in cells. However, the performance of BASH MaP and associated computational analysis in the context of other RNAs remains to be determined.

    1. eLife assessment

      This valuable study combines evolution experiments with molecular and genetic techniques to study how a genetic lesion in MreB that causes rod-shaped cells to become spherical, with concomitant deleterious fitness effects, can be rescued by natural selection. The results are convincing, although the statistical analyses and figure presentation could be improved, and the concrete contribution of the paper and its relation to previous literature could be clarified.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors performed experimental evolution of MreB mutants that have a slow-growing round phenotype and studied the subsequent evolutionary trajectory using analysis tools from molecular biology. It was remarkable and interesting that they found that the original phenotype was not restored (most common in these studies) but that the round phenotype was maintained.

      Strengths:

      The finding that the round phenotype was maintained during evolution, rather than the original rod-shaped phenotype being recovered, is interesting. The paper extensively investigates what happens during adaptation using a variety of techniques. Also, the extensive discussion of the findings at the end of the paper is well thought through and insightful.

      Weaknesses:

      I find there are three general weaknesses:

      (1) Although the paper states in the abstract that it emphasizes "new knowledge to be gained", it remains unclear what this concretely is. On page 4 they state three research questions; these could be more extensively discussed in the abstract. Also, these questions read more like genetics questions, while the paper is largely about cell biological findings.

      (2) It is not clear to me from the text what we already know about the restoration of MreB loss from suppressor studies (in the literature). Are there suppressor screens in the literature, and which part of the findings is consistent with suppressor screens and which parts are new knowledge?

      (3) The clarity of the figures, captions, and data quantification needs to be improved.

    3. Reviewer #2 (Public Review):

      Yulo et al. show that deletion of MreB causes reduced fitness in P. fluorescens SBW25 and that this reduction in fitness may be primarily caused by alterations in cell volume. To understand the effect of cell volume on proliferation, they performed an evolution experiment through which they predominantly obtained mutations in pbp1A that decreased cell volume and increased viability. Furthermore, they provide evidence to propose that the pbp1A mutants may have decreased PG cross-linking which might have helped in restoring the fitness by rectifying the disorganised PG synthesis caused by the absence of MreB. Overall this is an interesting study.

      Queries:

      Do the small cells of the mreB null background indeed have no DNA? It is not apparent from the DAPI images presented in Supplementary Figure 17. A more detailed analysis will help to support this claim.

      What happens to viability and cell morphology when pbp1A is removed in the mreB null background? If it is actually a decrease in pbp1A activity that leads to the rescue, then pbp1A- mreB- cells should have better viability, reduced cell volume and organised PG synthesis. Especially as the PG cross-linking is almost at the same level as the T362 or D484 mutant.

      What is the status of PG cross-linking in ΔmreB Δpflu4921-4925 (Line 7)?

      What is the morphology of the cells in Line 2 and Line 5? It may be interesting to see if PG cross-linking and cell wall synthesis is also altered in the cells from these lines.

      The data presented in 4B should be quantified with appropriate input controls.

      What are the statistical analyses used in 4A and what is the significance value?

      A more rigorous statistical analysis indicating the number of replicates should be done throughout.

    4. Reviewer #3 (Public Review):

      This paper addresses an understudied problem in microbiology: the evolution of bacterial cell shape. Bacterial cells can take a range of forms, among the most common being rods and spheres. The consensus view is that rods are the ancestral form and spheres the derived form. The molecular machinery governing these different shapes is fairly well understood but the evolutionary drivers responsible for the transition between rods and spheres are not. Enter Yulo et al.'s work. The authors start by noting that deletion of a highly conserved gene called MreB in the Gram-negative bacterium Pseudomonas fluorescens reduces fitness but does not kill the cell (as happens in other species like E. coli and B. subtilis) and causes cells to become spherical rather than their normal rod shape. They then ask whether evolution for 1000 generations restores the rod shape of these cells when propagated in a rich, benign medium.

      The answer is no. The evolved lineages recovered fitness by the end of the experiment, growing just as well as the unevolved rod-shaped ancestor, but remained spherical. The authors provide an impressively detailed investigation of the genetic and molecular changes that evolved. Their leading results are:

      (1) The loss of fitness associated with MreB deletion causes high variation in cell volume among sibling cells after cell division.

      (2) Fitness recovery is largely driven by a single, loss-of-function point mutation that evolves within the first ~250 generations that reduces the variability in cell volume among siblings.

      (3) The main route to restoring fitness and reducing variability involves loss of function mutations causing a reduction of TPase and peptidoglycan cross-linking, leading to a disorganized cell wall architecture characteristic of spherical cells.

      The inferences made in this paper are on the whole well supported by the data. The authors provide a uniquely comprehensive account of how a key genetic change leads to gains in fitness and the spectrum of phenotypes that are impacted and provide insight into the molecular mechanisms underlying models of cell shape.

      Suggested improvements and clarifications include:

      (1) A schematic of the molecular interactions governing cell wall formation could be useful in the introduction to help orient readers less familiar with the current state of knowledge and key molecular players.

      (2) More detail on the bioinformatics approaches to assembling genomes and identifying the key compensatory mutations are needed, particularly in the methods section. This whole subject remains something of an art, with many different tools used. Specifying these tools, and the parameter settings used, will improve transparency and reproducibility, should it be needed.

      (3) Corrections for multiple comparisons should be used and reported whenever more than one construct or strain is compared to the common ancestor, as in Supplementary Figure 19A (relative PG density of different constructs versus the SBW25 ancestor).

      (4) The authors refrain from making strong claims about the nature of selection on cell shape, perhaps because their main interest is the molecular mechanisms responsible. However, I think more can be said on the evolutionary side, along two lines. First, they have good evidence that cell volume is a trait under strong stabilizing selection, with cells of intermediate volume having the highest fitness. This is notable because there are rather few examples of stabilizing selection where the underlying mechanisms responsible are so well characterized. Second, this paper succeeds in providing an explanation for how spherical cells can readily evolve from a rod-shaped ancestor but leaves open how rods evolved in the first place. Can the authors speculate as to how the complex, coordinated system leading to rods first evolved? Or why not all cells have lost rod shape and become spherical, if it is so easy to achieve? These are important evolutionary questions that remain unaddressed. The manuscript could be improved by at least flagging these as unanswered questions deserving of further attention.
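
      As an illustration of the correction for multiple comparisons suggested in point (3) above, here is a minimal sketch of the Holm-Bonferroni step-down adjustment. It is one of several standard procedures and is not taken from the manuscript; the function name `holm_adjust` and the example p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order.

    The smallest raw p-value is multiplied by m, the next by m - 1, and so
    on; adjusted values are then made monotone and capped at 1.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce that adjusted p-values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for three constructs, each compared to one ancestor.
raw_p = [0.010, 0.040, 0.030]
adjusted_p = holm_adjust(raw_p)
print(adjusted_p)
```

      Reporting both the raw and adjusted values, together with the number of comparisons, would let readers judge which differences remain significant after correction.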

      The value of this paper stems both from the insight it provides on the underlying molecular model for cell shape and from what it reveals about some key features of the evolutionary process. The paper, as it currently stands, provides more on which to chew for the molecular side than the evolutionary side. It provides valuable insights into the molecular architecture of how cells grow and what governs their shape. The evolutionary phenomena emphasized by the authors - the importance of loss-of-function mutations in driving rapid compensatory fitness gains and that multiple genetic and molecular routes to high fitness are often available, even in the relatively short time frame of a few hundred generations - are well-understood phenomena and so arguably of less broad interest. The more compelling evolutionary questions concern the nature and cause of stabilizing selection (in this case cell volume) and the evolution of complexity. The paper misses an opportunity to highlight the former and, while claiming to shed light on the latter, provides rather little useful insight.

    5. Author response:

      Thank you for handling our paper and our thanks to the reviewers for their engagement, comments and valuable suggestions. We will take the opportunity to provide a full response and submit a revised version in the coming weeks.

    1. Joint Public Review:

      An outside expert evaluated your responses to the original reviewers and offered the following comments:

      The main criticism was whether deleterious variants were appropriately classified in the work. The authors use two different methods to characterize the effect of alleles to satisfy these comments. The result is somewhat complex. The authors do replicate the effect of dominance on fixation and segregation of deleterious alleles by classifying polymorphisms as synonymous or nonsynonymous with SNPeff. This is not entirely surprising as it is approximately equivalent to classifying based on fold degeneracy (but it includes sites that have other than 0- or 4-fold degeneracy). However, the authors do not mention in the text that their observation of increased segregating deleterious mutations in recessive alleles was only statistically significant in A. halleri (for both analyses). Using SIFT, the authors only find an effect of dominance in A. lyrata. So in reality, while the trends are the same across the analyses, the statistical significance of the effects of dominance was not consistent.

      Reviewer 2 had several more detailed criticisms of the manuscript. The first was that the authors should explore the dominance of linked deleterious mutations themselves. I agree that this would be interesting, but it is very difficult to accomplish, and I agree with the authors' reluctance to do much more here. The reviewer also criticized the authors' simulation approach. The authors provided their simulation script as requested, but declined to do additional simulations under varied selection coefficients. I felt this was a minimally adequate response to the reviewer's concerns, but the authors could have reasonably conducted a few additional simulations under varied selection coefficients.

      I think that the scope of the findings described in the assessment was reasonable. This is interesting work, but despite the authors' arguments, the system is somewhat unique if for no other reason than that balancing selection at S-loci is uniquely strong.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The paper combines phenotypic and genomic analyses of the "sheltered load" (i.e. the accumulation of deleterious mutations linked to S-loci that are hidden from selection in the homozygous state) in Arabidopsis. The authors compare results to previous theoretical predictions concerning the extent of the load in dominant vs recessive S-alleles, and further develop exciting theory to reconcile differences between previous theory and observed results.

      Strengths:

      This is a very nice combination of theory and data to address a classical question in the field.

      We thank the reviewer for this positive feedback.

      Weaknesses:

      The "genetic load" is a poorly defined concept in general, and its quantification via the number of putatively deleterious mutations is quite difficult. Furthermore counting up the number of derived mutations at fully constrained nucleotides may not be a great estimate of the load, and certainly does not allow for evaluation of recessivity -- a concept critical to ideas concerning the sheltered load. Alternative approaches - including estimating the severity of mutations - could be helpful as well. This imperfection in available approaches to test theory must be acknowledged more strongly by the authors.

      As suggested by the reviewer, we implemented alternative approaches to estimate the severity of deleterious mutations and now report the results of SNPeff and SIFT4G analyses in Table S6. The results we obtained with these other metrics were overall very similar to those based on our previous counting of mutations at 0-fold and 4-fold degenerate sites. More generally, we tried to improve the presentation of our strategy to estimate the genetic load (clarified in lines 262-268, 271, 292-295, 297). In particular, we made it clear that our population genetic analysis cannot assess the recessivity of the observed mutations (lines 428-434).
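
      For readers unfamiliar with the 0-fold versus 4-fold distinction mentioned above, the following is a minimal sketch of how a codon position can be assigned a degeneracy class under the standard genetic code. It is an illustration only and not the authors' pipeline; the function name `degeneracy` and the 0-based `position` argument are hypothetical choices made for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def degeneracy(codon, position):
    """Return the fold degeneracy (0, 2, 3 or 4) of one codon position.

    `position` is 0-based (0, 1 or 2). A site is 0-fold degenerate when
    every substitution changes the amino acid, and 4-fold degenerate when
    no substitution does.
    """
    original = CODON_TABLE[codon]
    # Count the bases (including the current one) that encode the same amino acid.
    same = sum(
        CODON_TABLE[codon[:position] + base + codon[position + 1:]] == original
        for base in BASES
    )
    return 0 if same == 1 else same


# Hypothetical usage: classify every position of a few codons.
for example_codon in ("ATG", "GGT", "TTT"):
    classes = [degeneracy(example_codon, i) for i in range(3)]
    print(example_codon, classes)
```

      Under this scheme, variants at 0-fold sites are always nonsynonymous and variants at 4-fold sites are always synonymous, which is why counts at the two site classes serve as proxies for putatively deleterious and putatively neutral variation.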

      Reviewer #2 (Public Review):

      Summary:

      This study looks into the complex dominance patterns of S-allele incompatibilities in Brassicaceae, through which it attempts to learn more about the sheltering of deleterious load. I found several weak points in the analyses that diminished my excitement about the results. In particular, the way in which deleterious mutations were classified lacked the ability to distinguish the severity of the mutations and thus their expected associated dominance.

      First, we would like to clarify that our goal with this study is NOT to learn something about dominance of the linked deleterious mutations (we can not). Instead, we compare the accumulation of deleterious mutations linked to dominant vs recessive S-ALLELES, but are agnostic regarding the dominance level of the LINKED mutations themselves. The rationale is that the different intensities of natural selection between dominant vs recessive S-alleles provide a powerful way to examine the process by which deleterious mutations are sheltered in general. We further clarified this aspect on lines 70-73 and 399-401.

      Second, as mentioned above in response to Reviewer 1, we complemented the analysis by predicting the severity of the deleterious mutations by SIFT4G and SNPeff. The results were largely consistent, with the exception that the number of sites included in SIFT4G was low, such that the statistical power was reduced (lines 296-300).

      Furthermore, the simulation approach could have provided this exact sort of insight but was not designed to do so, making this comparison to the empirical data also less than exciting for me.

      As explained above, studying dominance of the linked mutations we observed is an interesting research question (albeit a difficult one), but it was not our goal here. Instead, our study was designed as an empirical test of the predictions presented in Llaurens et al (2009), and we re-analysed some aspects of the model outcome to illustrate our points.

      We now better explain that we based our choice of parameters on the fact that in the theoretical study by Llaurens et al (2009), recessive deleterious mutations are predicted to accumulate in a much more straightforward manner (line 316-318).

      We now dedicate a paragraph of the discussion to explain how our stochastic simulations could be improved, and acknowledge that a full exploration of the interaction between dominance of the S-alleles and dominance of the linked deleterious mutations would be an interesting follow-up - albeit beyond the scope of our study (line 437-441).

      Major and minor comments:

      I think the introduction (or somewhere before we dive into it in the results) of the dominance hierarchy for the S-alleles needs a more in-depth explanation. Not being familiar with this beforehand really made this paper inaccessible to me until I then went to find out more before continuing. I would expect this paper to be broad enough that self-contained information makes it accessible to all readers. For example, lines 110-115 could be in the Introduction.

      We thank the reviewer for this useful remark. We now give a more comprehensive description of the dominance hierarchy and introduce the classes of dominance in A. lyrata already in the introduction, on lines 64-70.

      Along with my above comment, perhaps it is not my place to comment, but I find the paper not of a broad enough scope to be of interest to a broad readership. This S-allele dominance system is more than simple balancing selection, it is a very complex and specific form of dominance between several haplotypes, and the mechanism of dominance does not seem to be genetic. I am not sure that it thus extrapolates to broad comments on general dominance and balancing selection, e.g. it would not be the same as considering inversions and this form of balancing selection where we also expect recessive deleterious mutations to accumulate.

      We disagree with these interpretations by the reviewer, for two reasons:

      First, the mechanism of dominance is actually entirely genetic. In fact, we uncovered some years ago that it is based on the molecular interaction between small non-coding RNAs from dominant alleles and their target sites on recessive alleles (Durand et al. Science 2014, see lines 68-70). If there is something specific with this system, it is that the dominance phenomenon is better understood at the mechanistic level than in most other cases, but the resulting phenomenon in itself (a dominance hierarchy) is rather common.

      Second, the kind of variation in the intensity of linked selection created by this mechanism is actually a general phenomenon, so our results have broad relevance beyond our particular study system. We modified the introduction to explain this point more clearly, highlighting in particular the fact that the situation we study closely resembles the case of sex chromosomes, where X (or Z) chromosomes are genetically recessive and Y (or W) chromosomes are genetically dominant. We cite this example in lines 83-87 of the introduction and also several well-studied other examples on lines 480-489 of the discussion.

      It would have been particularly interesting, or a nice addition, to see deleterious mutations classed by something like SNPeff or GERP where you can have different classes of moderate to severe deleterious variants, which we would expect also to be more recessive the more deleterious they are. In line with my next comment on the simulations, I think relative differences between mutations expected to be more or less dominant may be even more insightful into the process of sheltering which may or may not be going on here.

      We agree with the reviewer, and as detailed above we have now integrated such analyses with SNPeff and SIFT4G (Table S6). These new results reinforce our conclusion that while S-allele dominance influences the fixation of deleterious mutations, it has no effect on their total number. See lines 270-272 and 296-300.

      In the simulations, h=0 and s=0.01 (as in Figure 5) for all deleterious mutations seems overly simplistic, and at the convenient end for realistic dominance. I think besides recessive lethals which we expect to be close to h=0 would have a much larger selection coefficient, and other deleterious mutations would only be partially recessive at such an s value. I expect this would change some of the simulation results seen, though to what degree I am not certain. It would be nice to at least check the same exact results for h=0.3 or 0.2 (or additionally also for recessive lethals, e.g. h=0 and s=-0.9). I would also disagree with the statement in line 677, many studies have shown, particularly those on balancing selection, that partially recessive deleterious mutations are not eliminated by natural selection and do play a role in population genetic dynamics. I am also not surprised that extinction was found for higher s values when the mutation rate for such mutations was very high and the distribution of s values was constant. An influx of such highly deleterious mutations is unlikely to ever let a population survive, yet that does NOT mean that in nature, the rare influx of such mutations does lead to them being sheltered. I find overall that the simulation results contribute very little, to none, to this paper, as without something more realistic, like a simultaneous distribution of s and h values, you cannot say which, if any class of these mutations are the ones expected to accumulate because of S-allele dominance.

      We understand that the previous version of our manuscript was confusing between dominance of the S-alleles and dominance of the linked deleterious mutations. We clarified that our study focuses on the effect of the former only (lines 99, 263-264 and 581-583).

      We agree that a complete exploration of the interaction between dominance of the S-alleles and dominance of the linked mutations being sheltered would have been an asset, but as explained above this is not the focus of our study. The previous work by Llaurens et al (2009) has already established that deleterious mutations can fix within S-allele lineages, especially when linked to dominant S-alleles, and when the number of S-alleles is large. Under the conditions they examined, deleterious mutations were much more strongly eliminated if not fully recessive (h=0 vs h=0.2), so for the present study we decided to simulate fully recessive mutations only. We now formally acknowledge the possibility that some complex interaction may take place between dominance of the S-alleles and dominance of the linked deleterious mutations (lines 440-442). However, as explained above we feel that fully exploring this complex interaction would require a detailed investigation, which is clearly beyond the scope of the present study.
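
      The difference between fully recessive (h=0) and partially recessive (h=0.2) mutations discussed here can be illustrated with the textbook deterministic single-locus model, in which genotype fitnesses are 1, 1-hs and 1-s. The sketch below is an assumption-laden illustration only: it assumes random mating in an infinite population and ignores the S-locus, linkage and drift, so it is not the stochastic model of Llaurens et al. (2009). The function names and the starting frequency are hypothetical.

```python
def next_frequency(q, s, h):
    """One generation of selection on a deleterious allele at frequency q.

    Genotype fitnesses: 1 (wild-type homozygote), 1 - h*s (heterozygote),
    1 - s (mutant homozygote). Assumes random mating and no drift.
    """
    p = 1.0 - q
    w_het = 1.0 - h * s
    w_hom = 1.0 - s
    mean_fitness = p * p + 2.0 * p * q * w_het + q * q * w_hom
    return (p * q * w_het + q * q * w_hom) / mean_fitness


def frequency_after(q0, s, h, generations):
    """Iterate the recursion for a given number of generations."""
    q = q0
    for _ in range(generations):
        q = next_frequency(q, s, h)
    return q


# Hypothetical starting frequency; s and h follow the values discussed above.
q_recessive = frequency_after(q0=0.05, s=0.01, h=0.0, generations=500)
q_partial = frequency_after(q0=0.05, s=0.01, h=0.2, generations=500)
print(q_recessive, q_partial)
```

      In this simplified setting the partially recessive allele declines faster because selection acts on heterozygotes, which is the intuition behind the statement that mutations were more strongly eliminated when not fully recessive.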

      Rather they only show the disappointing or less exciting result that fully recessive, weakly deleterious mutations (which I again think do not even exist in nature as I said above) have minor, to no effect across the classes of S-allele dominance. They provide no insight into whether any type of recessive deleterious mutation can accumulate under the S-allele dominance hierarchy, and that is the interesting question at hand. I would either remove these simulations or redo them in another approach. The authors never mention what simulation approach was used, so I can only assume this is custom, in-house code. Yet I do not find that code provided on the github page. I do not know if the lack of a distribution for h and s values is then a choice or a programming limitation, but I see it as one that should be overcome if these simulations are meant to be meaningful to the results of the study.

      The code we used (in C) was adapted from the previous study by Llaurens et al. (2009), which at the time was unfortunately not deposited in a data repository. With the agreement of the authors of that study, this code is now available on GitHub (https://github.com/leveveaudrey/model_ssi_Llaurens; line 723).

      It is correct that our simulations were not aimed at determining whether “any type of recessive deleterious mutation can accumulate”, but we strongly believe that they help in interpreting the observations made in the genomic data.

      Recommendations for the authors:

      Notes from the editor:

      I found Table 1 confusing, with column headings of observed proportion but perhaps numbers reflecting counts.

      Thank you for pointing out this confusion. There was indeed an error in the last column, which we have now corrected.

      I found Figure 2 a bit hard to parse, with the vertical lines being unclear and the x-axis ticks of insufficient resolution to evaluate the physical extent of the signals.

      We increased the size of the x-axis labels and added detail to them in Figure 2, which is now hopefully clearer. We also increased the size of the vertical lines.

      Finally, I wonder, given the rapid decay of signal in lyrata, whether 25kb is the right choice for evaluating load and whether the pattern may look different on a smaller scale.

      It is true that the signal decays rapidly in A. lyrata, as can be seen in the haplotype structure analysis and in line with our previous analysis of the same populations (Le Veve et al., MBE 2023; in this study we explored the effect of the choice of the size of the chromosomal region analyzed; lines 266-269). However, for the sake of comparison, we prefer to stick to the same window size. The fact that we still see an effect of dominance in spite of the lower statistical power associated with the more rapid decay (because a smaller number of genes is expected to be impacted) actually reinforces our conclusions.

      Reviewer #1 (Recommendations For The Authors):

      I have a few additional suggestions to improve the manuscript.

      (1) How does the load linked to the S-locus compare to that observed in other genomic regions? It would be useful to provide a comparison of the results quantified in Figures three and four to comparable genomic regions unlinked to the S-locus. How severe is the linked load?

      This comparison to the genomic background was actually the core of our previous study (Le Veve et al MBE 2023), which was based on the same populations. This analysis revealed that polymorphism of the 0-fold degenerate sites was more than twice as high in the 25kb immediately flanking the S-locus as in a series of 100 unlinked control regions. The main focus of the present study is on the effect of linkage to particular S-alleles (which was not possible previously because haplotypes had to be phased).

      (2) Details of the GLM for data underlying Figures 3 and 4 are somewhat unclear. Is the key explanatory variable (Dominance) treated as continuous? Categorical? Ordinal etc…

      Dominance is treated as a continuous variable. We specify this in line 162 of the results, in the legends of Figures 3 and 4, in the Materials and Methods (lines 627 and 660) and in the legend of Table S4.

      (3) I had some trouble understanding the two different p-values in columns five and six of table one. Please provide more detail.

      We understand that the two p-values in Table 1 were confusing. The first was related to the binomial test and the second to the permutation test. To be consistent with the rest of the manuscript, we retained only the p-value of the permutation test.

      (4) As mentioned in the "weaknesses" above, the authors should be more clear about what they are quantifying. They are explicitly counting the number of variants at 0-fold degenerate sites as a proxy for the genetic load. How good this proxy is is unclear. The most egregious misstatement here was on line 314 in which they make reference to the "total load." However, this limitation should be acknowledged throughout the manuscript and deserves more attention in the methods and discussion.

      As mentioned above, we now integrate additional methods to define and quantify the load (SIFT4G and SNPeff), which reinforced our previous conclusions (lines 271-272, 297-302).

We clarified our wording and replaced the mention of “total load” with “mean number of linked deleterious mutations per copy of S-allele” (lines 324-325). In the discussion we tried to better explain the limitations of approaches to estimate the genetic load (lines 431-437).

      Reviewer #2 (Recommendations For The Authors):

      Line 60, it should be specified that this is only for recessive deleterious mutations.

      Non-recessive deleterious mutations would certainly not be expected to accumulate.

As explained in detail above, the question of whether and how non-recessive deleterious mutations can accumulate when linked to the S-locus is difficult and would in itself deserve a full treatment, which is clearly beyond the scope of the present study. We clarified this point on line 56.

    3. eLife assessment

      This study presents valuable empirical work and simulations that are relevant for the evolution of genetic load linked to self-incompatibility alleles in two Arabidopsis species. The evidence supporting the findings is solid, although it remains to be seen how generalizable the conclusions are beyond the specific system investigated here, not least because the statistical significance varied between the two species. The work will be of relevance to geneticists interested in the evolution of allelic diversity in similar systems.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment 

      The authors provide solid data on a functional investigation of potential nucleoid-associated proteins and the modulation of chromosomal conformation in a model cyanobacterium. While the experiments presented are convincing, the manuscript could benefit from restructuring towards the precise findings; alternatively, additional data buttressing the claims made would significantly enhance the study. These valuable findings will be of interest to the chromosome and microbiology fields.

We thank the editors for taking the time to assess our work and the reviewers for their critical suggestions. Both reviewers were concerned about our interpretation of the 3C data, and Reviewer #2 suggested biochemical analysis of cyAbrB2 to reinforce our claim. We agree with the concern and suggest that the editors add the sentence “How cyAbrB2 affects chromosome structure is still elusive from this study, and the biochemical assays are needed in the future experiment.” to the eLife assessment.

      The major revision points are the following:

      Reconstruction of Figures

      Previous Figure 5E has been omitted

      Additional 3C data on the nifJ region

      Rephrasing the conclusion of 3C data

      Additional discussion on cyAbrB2 and NAPs

      Reviewer #1 (Public Review): 

      Strength: 

      At first glance, I had a very positive impression of the overall manuscript. The experiments were well done, the data presentation looks very structured, and the text reads well in principle.

      Weakness: 

      Having a closer look, the red line of the manuscript is somewhat blurry. Reading the abstract, the introduction, and parts of the discussion, it is not really clear what the authors exactly aim to target. Is it the regulation of fermentation in cyanobacteria because it is under-investigated? Is it to bring light to the transcriptional regulation of hydrogenase genes? The regulation by SigE? Or is it to get insight into the real function of cyAbrB2 in cyanobacteria? All of this would be good of course. But it appears that the authors try to integrate all these aspects, which in the end is a little bit counterintuitive and in some places even confusing. From my point of view, the major story is a functional investigation of the presumable transcriptional regulator cyAbrB2, which turned out to be a potential NAP. To demonstrate/prove this, the hox genes have been chosen as an example due to the fact that a regulatory role of cyAbrB2 has already been described. In my eyes, it would be good to restructure or streamline the introduction according to this major outcome. 

As you pointed out, the major focus of this study is cyAbrB2 as a potential NAP. To focus on NAPs, we simplified the first paragraph of the discussion (ll.246-263) and added a section comparing cyAbrB2 with other known NAPs (ll.269-299). To emphasize the description of cyAbrB2, we also rearranged the figures and divided the analysis of cyAbrB2 ChIP into two figures. We shortened the first paragraph of the introduction but mostly preserved the composition of the introduction to keep the general-to-specific pattern, even though the manuscript is blurry.

      Points to consider: 

      The authors suggest that the microoxic condition is the reason for the downregulation of e.g. photosynthesis (l.112-114). But of course, they also switched off the light to achieve a microoxic environment, which presumably is the trigger signal for photosynthesis-related genes. I suggest avoiding making causal conclusions exclusively related to oxygen and recommend rephrasing (for example, "were downregulated under the conditions applied").

      We agree with this point. We rephrased l.114 to “by the transition to dark microoxic conditions from light aerobic conditions” (ll.108-109).

      The authors hypothesized that cyAbrB2 modulates chromosomal conformation and conducted a 3C analysis. But if I read the data in Figure 5B & C correctly, there is a lot of interaction in a range of 1650 and 1700 kb, not only at marked positions c and j. Positions c and j have been picked because it appears that cyAbrB2 deletion impacts this particular interaction. But is it really significant? In the case of position j the variation between the replicates seems quite high, in the case of position c the mean difference is not that high. Moreover, does all this correlate with cyAbrB2 binding, i.e. with positions of gray bars in panel A? If this was the case, the data obtained for the cyabrB2 mutant should look totally different but they are quite similar to WT. That's why the sentence "By contrast, the interaction frequency in Δcyabrb2 mutant were low and unchanged in the aerobic and microoxic conditions" does not fit to the data shown. But I have to mention that I am not an expert in these kinds of assays. Nevertheless, if there is a biological function that shall be revealed by an experiment, the data must be crystal clear on that. At least the descriptions of the 3C data and the corresponding conclusions need to be improved. For me, it is hard to follow the authors' thoughts in this context. 

Following your suggestion, we carefully re-examined the 3C data. Furthermore, we conducted an additional 3C experiment on the nifJ region (Figures 7F-J). We acknowledge that we had overinterpreted the 3C data. Therefore, we rewrote the results and discussion of the 3C assay in line with the data (ll.220-245) and removed the previous Figure 5E. Our individual responses follow.

      Positions c and j have been picked because it appears that cyAbrB2 deletion impacts this particular interaction. But is it really significant?

We could not find statistically significant differences at loci c and j. Therefore, we added the following to the Results section: “Note that the interaction scores exhibit considerable variability and we could not detect statistical significance at those loci.” (ll.231-232)

      does all this correlate with cyAbrB2 binding, i.e. with positions of gray bars in panel A?

As you suspected, interaction frequency and cyAbrB2 binding do not correlate. Therefore, we withdrew the previous claim and now state the following: “Moreover, our 3C data did not support bridging at least in hox region and nifJ region, as the high interaction locus and cyAbrB2 binding region did not seem to correlate (Figure 7).” (ll.280-282)

      If this was the case, the data obtained for the cyabrB2 mutant should look totally different but they are quite similar to WT.

      We rewrote it as follows; “Then we compared the chromatin conformation of wildtype and cyabrb2∆. Although overall shapes of graphs did not differ, some differences were observed in wildtype and cyabrb2∆ (Figures 7B and 7G); interaction of locus (c) with hox region were slightly lower in cyabrb2∆ and interaction of loci (f’) and (g’) with nifJ region were different in wildtype and cyabrb2∆. Note that the interaction scores exhibit considerable variability and we could not detect statistical significance at those loci.” (ll.228-232)

      That's why the sentence "By contrast, the interaction frequency in Δcyabrb2 mutant were low and unchanged in the aerobic and microoxic conditions" does not fit to the data shown.

We rewrote the sentence as follows: “While the interaction scores exhibit considerable variability, the individual data over time demonstrate declining trends of the wildtype at locus (c) and (j) (Figure S8). In ∆cyabrb2, by contrast, the interaction frequency of loci (c) and (j) was unchanged in the aerobic and microoxic conditions (Figure 7E). The interaction frequency of locus (c) in ∆cyabrb2 was as low as that in the microoxic condition of wildtype, while that of locus (j) in ∆cyabrb2 was as high as that in the aerobic condition of wildtype (Figures 7B and 7C).” (ll.238-243)

The figures are nicely prepared, albeit quite complex and in some cases not really supportive of the understanding of the results description. Moreover, they show a rather loose organization that sometimes does not fit the red line of the results section. For example, Figure 1D is not mentioned in the paragraph that refers to several other panels of the same figure (see lines 110-128). Panel 1D is mentioned later in the discussion. Does 1D really fit into Figure 1 then? Are all the panels indeed required to be shown in the main document? As some elements are only briefly mentioned, the authors might also consider moving some into the supplement (e.g. left part of Figure 1C, Figure 2A, Figure 3B ...) or at least try to distribute some panels into more figures. This would reduce complexity and increase comprehensibility for future readers. Also, Figure 3 is way too complex. Panel G could be a stand-alone figure. The latter would also allow for an increase in font sizes or to show ChIP data of both conditions (L+O2 and D-O2) separately. Moreover, a figure legend typically introduces the content as a whole by one phrase but here only the different panels are described, which fits the impression that all the different panels are not well connected. Of course, it is the decision of the authors what to present and how, but they may wish to consider restructuring and simplifying.

      According to the advice, we have rearranged the Figure composition.

The left side of Figure 1C has been moved to the supplement. Instead, representative expression fold changes of “Transient”, “Plateau”, “Continuous”, and “Late” genes are shown for comprehensibility. We left Figure 1D in Figure 1, as this diagram shows our motivation for focusing on hox and nifJ. We moved Figure 2A to the supplement. We did not move Figure 3B, as this figure shows the distribution of cyAbrB2 (“long tract of AT-rich DNA”) comprehensively and simply. We agree that Figure 3 was too complex. Therefore, we moved Figures 3F and 3G to a new independent figure (Figure 4). In Figure 4C (former 3G), we show the ChIP data of the L+O2 condition only, and the change of ChIP data under the D-O2 condition is shown in Figure 5. The schematic image showing the cyanobacterial chromosome and NAPs (previous Figure 5E) was omitted because it overinterpreted the data.

      The authors assume a physiological significance of transient upregulation of e.g. hox genes under microoxic conditions. But does the hydrogenase indeed produce hydrogen under the conditions investigated and is this even required? Moreover, the authors use the term "fermentative gene". But is hydrogen indeed a fermentation product, i.e. are protons the terminal electron acceptor to achieve catabolic electron balance? Then huge amounts of hydrogen should be released. Comment should be made on this.

This is a very important point. Yes, the hydrogenase does produce hydrogen under the conditions we investigated, and protons accept the majority of reducing power under the dark microoxic condition. We wrote in the introduction section as follows: “Hydrogen is generated in quantities comparable to lactate and dicarboxylic acids as the result of electron acceptance in the dark microoxic condition (Akiyama and Osanai 2023; Iijima et al. 2016)” (ll.54-55). The detailed explanation is below, although omitted from the manuscript.

      A recent study (Akiyama and Osanai 2023) quantified the consumed glycogen and secreted fermentative products (hydrogen, lactate, dicarboxylic acid, and acetate) in Synechocystis under the dark microoxic condition, the same condition as we investigated. The system of the study consists of a 10 mL liquid layer and a 10 mL gas layer, cultivated for 3 days under dark microoxic conditions. The amounts of lactic acid, dicarboxylic acid, and hydrogen were approximately 2 µmol, 3.5 µmol, and 11 µmol (assuming the gas layer was at 1 atm and ignoring aqueous population), respectively. On the other hand, glycogen equivalent to 15 µmol of glucose was consumed in the system. This estimate supports the view that hydrogen accounts for a substantial portion of fermentative products under dark microoxic conditions.
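
      The arithmetic behind this estimate can be sketched as follows. This is only an illustrative calculation using the approximate amounts quoted above; the variable and function names are our own, and acetate is left out because no amount is given for it in this response.

```python
# Approximate amounts quoted in the response above, in micromoles.
# Acetate is not included because no value is quoted for it.
products_umol = {"lactate": 2.0, "dicarboxylic acids": 3.5, "hydrogen": 11.0}
glucose_consumed_umol = 15.0  # glycogen consumed, as glucose equivalents


def molar_shares(amounts):
    """Return each product's fraction of the summed molar amounts."""
    total = sum(amounts.values())
    return {name: value / total for name, value in amounts.items()}


shares = molar_shares(products_umol)
h2_per_glucose = products_umol["hydrogen"] / glucose_consumed_umol

for name, share in shares.items():
    print(f"{name}: {share:.1%} of the quantified products")
print(f"hydrogen per glucose equivalent: {h2_per_glucose:.2f} mol/mol")
```

      Under these numbers, hydrogen makes up roughly two thirds of the three quantified products on a molar basis, which is the sense in which it is a substantial portion.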

      The necessity of hydrogen production under dark microoxic conditions was demonstrated in (Gutekunst et al. 2014). They show hydrogenase activity is required for the mixotrophic growth in the light-dark and microoxic cycle with arginine. The necessity remains unclear in our conditions because we only performed continuous dark microoxic conditions without glucose.

      The authors also mention a reverse TCA cycle. But is its existence an assumption or indeed active in cyanobacteria, i.e. is it experimentally proven? The authors are a little bit vague in this regard (see lines 241-246).

We misused the terminology. We meant to refer to the “reductive branch of TCA”. Cyanobacteria operate a branched TCA cycle under microoxic conditions. One of the branches is the reductive branch, which reduces oxaloacetate to produce malate. We corrected “reverse TCA cycle” to “reductive branch of TCA”. (Figure 1D and ll.260-262)

      Reviewer #2 (Public Review): 

      This work probes the control of the hox operon in the cyanobacterium Synechocystis, where this operon directs the synthesis of a bidirectional hydrogenase that functions to produce hydrogen. In assessing the control of the hox system, the authors focused on the relative contributions of cyAbrB2, alongside SigE (and to a lesser extent, SigA and cyAbrB1) under both aerobic and microoxic conditions. In mapping the binding sites of these different proteins, they discovered that cyAbrB2 bound many sites throughout the chromosome repressed many of its target genes, and preferentially bound regions that were (relatively) rich in AT-residues. These characteristics led the authors to consider that cyAbrB2 may function as a nucleoid-associated protein (NAP) in Synechocystis, given its functional similarities with other NAPs like H-NS. They assessed the local chromosome conformation in both wild-type and cyabrB2 mutant strains at multiple sites within a 40 kb window on either side of the hox locus, using a region within the hox operon as bait. They concluded that cyAbrB2 functions as a nucleoid-associated protein that influences the activity of SigE through its modulation of chromosome architecture.

      The authors approached their experiments carefully, and the data were generally very clearly presented and described.

      Based on the data presented, the authors make a strong case for cyAbrB2 as a nucleoid-associated protein, given the multiple ways in which it seems to function similarly to the well-studied Escherichia coli H-NS protein. It would be helpful to provide some additional commentary within the discussion around the similarities and differences of cyAbrB2 to other nucleoid-associated proteins, and possible mechanisms of cyAbrB2 control (post-translational modification; protein-protein interactions; etc.). The manuscript would also be strengthened with the inclusion of biochemical experiments probing the binding of cyAbrB2, particularly focusing on its oligomerization and DNA polymerization/bridging potential.

We agree with the comment that biochemical experiments will deepen our insights into cyAbrB2 and chromatin conformation. As the reviewer pointed out, biochemical assays will provide valuable information on mechanisms of cyAbrB2 control, such as post-translational modification, cooperation with cyAbrB1, oligomerization, and the structure of cyAbrB2-bound DNA. However, we think those potential findings are worthy of a new independent research paper, rather than a part of this paper. Therefore, we added a discussion mentioning biochemistry as future work (ll.275-290; the section of “The biochemistry of cyAbrB2 will shed light on the regulation of chromatin conformation in the future”).

      Previous work had revealed a role for SigE in the control of hox cluster expression, which nicely justified its inclusion (and focus) in this study. However, the results of the SigA studies here suggested that SigA both strongly associated with the hox promoter, and its binding sites were shared more frequently than SigE with cyAbrB2. The focus on cyAbrB2 is also well-justified, given previous reports of its control of hox expression; however, it shares binding sites with an essential homologue cyAbrB1. Interestingly, while the B1 protein appears to bind similar sites, instead of repressing hox expression, it is known as an activator of this operon. It seems important to consider how cyAbrB1 activity might influence the results described here.

We infer that the minor side of the bimodal SigE peak is the genuine population that contributes to hox transcription, as hox genes are expressed in a SigE-dependent manner (Figure S2). We consider that the strong SigA peak upstream of the hox operon corresponds to binding at the promoter of TU1715, which is oriented in the opposite direction to the hox operon. We added a description of the single SigA peak and bimodal SigE peak near the TSS of the hox operon as follows:

      “A bimodal peak of SigE was observed at the TSS of the hox operon in a microoxic-specific manner (Figure 6C bottom panel). The downstream side of the bimodal SigE peak coincides with SigA peak and the TSS of TU1715. Another side of the bimodal peak lacked SigA binding and was located at the TSS of the hox operon (marked with an arrow in Figure 6C), although the peak caller failed to recognize it as a peak.” (ll.206-209)

The point that cyAbrB1 binds similar sites to cyAbrB2, despite regulating hox expression in the opposite direction, is very interesting. Therefore, we referred to the transcriptome data of the cyAbrB1 knockdown strain and compared the impact of cyAbrB1 knockdown and cyAbrB2 deletion. We describe this in the Results and Discussion as follows:

      “we referred to the recent study performing transcriptome of cyAbrB1 knockdown strain, whose cyAbrB1 protein amount drops by half (Hishida et al. 2024). Among 24 genes induced by cyAbrB1 knockdown, 12 genes are differentially downregulated genes in cyabrb2∆ in our study (Figure S5D).” (ll.162-165)

      “CyAbrB1, the homolog of cyAbrB2, may cooperatively work, as cyAbrB1 directly interacts with cyAbrB2 (Yamauchi et al. 2011), their distribution is similar, and they partially share their target genes for suppression (Figures 3A S5C and S5D). The possibility of cooperation would be examined by the electrophoretic mobility shift assay of cyAbrB1 and cyAbrB2 as a complex. Despite their similar repressive function, cyAbrB1 and cyAbrB2 regulate hox expression in the opposite directions, and their mechanism remains elusive.” (ll.292-296)

The hox operon differs from this general tendency. To see whether cyAbrB1 behaves differently from cyAbrB2 at the hox operon, we performed an additional ChIP-qPCR experiment on cyAbrB1 in the aerobic condition and the dark microoxic condition (Figure 5C). However, we could not detect a difference.

      Reviewer #1 (Recommendations For The Authors): 

      Figure 1B: I recommend changing the header in the grey bar to terms like "upregulated" and "downregulated", which are also used in the legend description. Upregulation of genes can also be a result of de-repression, which is why the term "activated" is somewhat misleading.

      Corrected.

      Lines 114-116: It is unclear what the authors exactly mean here. Please clarify. 

We rephrased the sentence as follows: “The enrichment in the butanoate metabolism pathway indicates the upregulation of genes involved in carbohydrate metabolism. We further classified genes according to their expression dynamics.” (ll.110-111)

      Reviewer #3 (Recommendations For The Authors): 

      Major/experimental comments: 

      (1) For the chromosome conformation capture experiments, it is indicated that these were conducted at aerobic (1hr) and microoxic (4 hr) conditions. But the data presented in Figure 1 suggest that 1 hr corresponds to the beginning of microoxic growth, and that time 0 is aerobic. The composite 3C data in Figure 5 show some interesting but specific differences. It is appreciated that the authors presented the profiles for individual samples in Figure S7, and the differences here do not seem to be as compelling. Are the major differences being highlighted significantly (statistically) different (e.g. at the (c) and (j) loci)? Might the differences be starker if an earlier aerobic condition (e.g. time 0) had been used instead of the 1 hr - microoxic - timepoint?

The previous Figure 5 included three time points (solid line: aerobic condition, dashed line: 1 hr of microoxic condition, and dotted line: 4 hr of microoxic condition). We omitted the 4 hr data from the main figure (Figure 7), as including 4 hr of microoxic conditions complicates the data. All three time points are shown in the profiles of individual loci (Figure S8).

      No statistically significant difference was found at loci (c) and (j) by t-test. Therefore, we have toned down the interpretation of the 3C data as follows: “Our 3C result demonstrated that cyAbrB2 influences the chromosomal conformation of hox and nifJ region to some extent (Figure 7).” (ll.325-326)

      (2) This is a complicated system that involves multiple regulatory proteins, each of which is differentially affected by the growth conditions (aerobic/microoxic). It is obviously beyond the scope of this work to probe deeply into all of these proteins. The focus here was on cyAbrB2, and to a slightly lesser extent SigE; however, based on the data presented, it seems that SigA and cyAbrB1 may be equally important contributors to hox control/expression, and in the case of cyAbrB1, possibly also to chromosome conformation. cyAbrB1 appears to have the same binding sites as cyAbrB2, and has been reported to interact with cyAbrB2. Given this association, it is possible that the two proteins may affect the binding of each other, and that loss of one might lead to enhanced binding by the other (or binding may require heterooligomerization?). Probing the regulatory interplay between these two proteins (or at least discussing it) feels important. Conducting e.g. mobility shift assays with each protein, both individually and together, could possibly allow for some understanding of how they function together. 

We agree that the biochemistry of cyAbrB2 and cyAbrB1 may explain why cyAbrB1 and cyAbrB2 bind long tracts of AT-rich genome regions in vitro. We would like to leave the biochemistry as a future plan, as we think biochemical data are beyond the scope of the present study.

      The idea that cyAbrB1 and cyAbrB2 cooperate to form heterooligomers and bind broadly to the genome is a very rational and interesting prediction. We added this idea to the discussion: “Overall, the biochemistry integrating assay conditions (PTM, buffer condition, and cooperation with cyAbrB1) and output (DNA binding, oligomerization, and DNA structure) will deepen the understanding of cyAbrB2 as cyanobacterial NAPs.” (ll.287-290). We also compared our transcriptome of ∆cyabrb2 with the recent study of cyabrb1 knockdown (ll.162-165), and concluded “they partially share their target genes for suppression (Figures 3A S5C and S5D)” (l.293).

      (3) Throughout the manuscript, there is reference made to cyAbrB2 binding becoming 'blurry' or non-specific under microoxic conditions. It is not clear what this means. It appears that when cyAbrB2 binds, any given protected region can be quite extensive, which can be suggestive of polymerization along the chromosome. Are the boundaries for binding sites typically clearly delineated, and this changes when the cultures are growing under microoxic conditions? There is also no mention made anywhere about oligomerization potential for cyAbrB2, which would be important for the polymerization, and bridging suggested for cyAbrB2 in the model presented in Figure 5. Previous publications (Song et al., 2022; Ishi et al., 2008) have suggested that it can exist as a dimer in vivo, but that in vitro it is largely monomeric. The manuscript would benefit from some additional biochemical analyses of cyAbrB2 binding activity, with a particular focus on DNA binding and oligomerization/bridging potential, and some additional discussion about these characteristics as well. 

      Throughout the manuscript, there is reference made to cyAbrB2 binding becoming 'blurry' or non-specific under microoxic conditions. It is not clear what this means.

In order to clearly describe “cyAbrB2 binding becomes blurry”, we rearranged the figure composition and made a dedicated figure (Figure 5). We also rephrased the description by adopting the reviewer’s phrase “boundaries for binding sites”, as this phrase describes the change well: “When cells entered microoxic conditions, the boundaries of the cyAbrB2 binding region and cyAbrB2-free region became obscure (Figure 5),” (ll.319-320)

      There is also no mention made anywhere about oligomerization potential for cyAbrB2,

      We added the discussion about oligomerization “DNA-bound cyAbrB2 is expected to oligomerize, based on the long tract of cyAbrB2 binding region in our ChIP-seq data. However, no biochemical data mentioned the DNA deforming function or oligomerization of cyAbrB2 in the previous studies and preference for AT-rich DNA is not fully demonstrated in vitro (Dutheil et al. 2012; Ishii and Hihara 2008; Song et al. 2022)”(ll. 277-280) and “Overall, the biochemistry integrating assay conditions (PTM, buffer condition, and cooperation with cyAbrB1) and output (DNA binding, oligomerization, and DNA structure) will deepen the understanding of cyAbrB2 as cyanobacterial NAPs.” (ll.287-290)

      The manuscript would benefit from some additional biochemical analyses of cyAbrB2 binding activity, with a particular focus on DNA binding and oligomerization/bridging potential, and some additional discussion about these characteristics as well. 

      We added the discussion integrally considering known features of cyAbrB2, novel findings on cyAbrB2, and the comparison with known NAPs (ll.269-290).

(4) Given that the major take-away for the authors (based on the title) seems to be the nucleoid-associated protein potential for cyAbrB2, the Discussion would benefit from some additional focus in this area. How similar is cyAbrB2 to other nucleoid-associated proteins? (e.g. H-NS, Lsr2) How does counter-silencing work for other nucleoid-associated proteins? Can the authors definitively exclude the possibility of binding site competition/occlusion, given that cyAbrB2 covers the promoter region of hox? What other nucleoid-associated proteins have been characterized in the cyanobacteria? 

      We agree with this point, so we added a discussion comparing cyAbrB2 with H-NS and Lsr2, the canonical NAPs (ll.269-290).

We did not deny the possibility of the exclusion of RNAP by cyAbrB2, but the previous manuscript discussed it insufficiently. To emphasize that cyAbrB2 excludes RNA polymerase, we simplified Figure 6 and employed mosaic plots showing anti-co-occurrence of cyAbrB2 binding regions and SigE peaks. Furthermore, we added a discussion of SigE exclusion by cyAbrB2 (ll.355-359).

      We mention the possibility of other nucleoid-associated proteins in cyanobacteria in the discussion. “Furthermore, the conformational changes by deletion of cyAbrB2 were limited, suggesting there are potential NAPs in cyanobacteria yet to be characterized.” (ll.336-339)

(5) Previous work (Song et al., 2022) showed that changing the AT content of cyAbrB2 binding sites did not affect its ability to bind DNA. There are also previous papers suggesting that cyAbrB2 may be subject to diverse post-translational modifications (e.g. phosphorylation - Spat et al., 2023; glutathionylation - Sakr et al., 2013), as well as association with cyAbrB1. These collectively suggest there may be other factors that contribute to cyAbrB2 binding specificity/activity. These seem like relevant points to discuss, particularly given the transient nature of the cyAbrB2 effects on some genes.

      We have included a discussion of AT content, post-translational modifications and transient regulation, and association with cyAbrB1 (ll.284-295).

      (6) Given the major binding site for SigA upstream of the hox operon, it seems that it likely also contributes to hox cluster expression, together with SigE. Is there a sense for the relative contribution of each sigma factor to hox cluster expression? And whether both are subject to the same inhibitory effect of cyAbrB2? 

As described above in our response to the public review, the SigA binding site upstream of the hox operon should be assigned to the TSS of TU1715 (Figure 6C). Transcription of the hox operon is highly dependent on SigE as shown in Figure S2, and residual transcription in the sigE∆ strain is derived from other sigma factors (SigABCD). Estimating the relative contribution of sigma factors other than SigE is difficult at present because SigABCDE can partially compensate for each other.

      As the different impact of NAPs on the primary and alternative sigma factor is observed in H-NS (Shin et al. 2005), whether both the primary sigma factor (SigA) and the alternative sigma factor (SigE) are inhibited by cyAbrB2 to the same extent is a very interesting question.

We calculated the odds ratios of SigE and SigA being in the cyAbrB2-free region and report them in the Results: “SigE preferred the cyAbrB2-free region in the aerobic condition more than SigA did (Odds ratios of SigE and SigA being in the cyAbrB2-free region were 4.88 and 2.74, respectively).” (ll.193-195). We also discuss this: “The higher exclusion pressure of cyAbrB2 on SigE may contribute to sharpening the transcriptional response of hox and nifJ on entry to microoxic conditions.” (ll.357-359)
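
      For readers unfamiliar with the statistic, an odds ratio of this kind is typically computed from a 2x2 contingency table. The sketch below shows one common definition. The exact table the authors built is not described in this response, so the table layout, the function name, and the counts used here are hypothetical and are not the study's data.

```python
def odds_ratio(peak_free, peak_bound, nonpeak_free, nonpeak_bound):
    """Odds ratio for a 2x2 table (hypothetical layout, not the authors').

    peak_free / peak_bound: sigma-factor peaks falling in cyAbrB2-free
    or cyAbrB2-bound regions.
    nonpeak_free / nonpeak_bound: the comparison set (e.g. regions
    without a peak) split the same way.
    A value above 1 means peaks favour cyAbrB2-free regions.
    """
    if peak_bound == 0 or nonpeak_free == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (peak_free * nonpeak_bound) / (peak_bound * nonpeak_free)


# Made-up counts, chosen only to illustrate the calculation.
example = odds_ratio(peak_free=40, peak_bound=10, nonpeak_free=20, nonpeak_bound=20)
print(f"odds ratio = {example:.2f}")
```

      With this definition, the reported values of 4.88 for SigE and 2.74 for SigA would both indicate a preference for cyAbrB2-free regions, with the stronger preference for SigE.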

      (7) The 3C experiments suggest there are indeed changes in chromosome architecture in the hox region as growth conditions change and when different regulators are present. Across the chromosome, analogous changes are expected; however, it may be premature to draw this conclusion based on changes at one locus. Is there a reason that the authors did not take full advantage of their 3C samples and sequence them, to capture the full chromosome interactome at the two time-points? This would allow broader conclusions to be drawn regarding changes in chromosome structure and the impact of cyAbrB2.

In response to the suggestion, we performed an additional 3C assay on the nifJ region using the residual 3C samples. Expanding to genome-wide sequencing (Hi-C) requires enrichment of ligated fragments by biotinylation, a step that was omitted in the preparation of our 3C samples.

      We rewrote the Results to reflect the 3C data from both hox and nifJ (ll.220-245) and omitted the schematic image of an entire cyanobacterial chromosome (previous Figure 5E).

      Editorial comments: 

      (1) The data presentation in Figure 1 is very effective. 

      (2) Line 87: please rephrase - you can have 'high similarity' or 'high levels of identity', but not high levels of homology - genes/proteins are either homologous or not.

      (3) Line 118: classified into four 'groups'? 

      (4) Line 590: remove 'the'. 

      (5) Figure 2S, panel B: please define acronyms in the legend (GT, IP) and write out 'FLAG' in full for AbrB1.

      (2) to (5) have been corrected.

      (6) Please provide information on or a reference for the tagging of SigA for use in the ChIP-seq experiments within the Materials and Methods.

      Added (l.365)

      (7) Line 648: space between 'binding' and 'regions'. 

      corrected.

      (8) Fig 4E: please make the solid lines thicker - they are currently difficult to see.

      We have made Figure 6C (former 4E) larger and the lines thicker.

      (9) Line 666: location. 

      (10) Line 673: Individual. 

      (11) Figure S5, panel C graph title: should this be 'Relative'? 

      (12) Figure S7: What is 'GT'? Should this be 'WT'? 

      (9) to (12) have been corrected.

      (13) In addition to the data presented in Figure 3G, it would be nice to have a small table or Venn diagram summarizing the number of cyAbrB2 binding sites that fall into the different categories (full gene/operon; downstream of a gene; within a gene; promoter region). 

      In response to the comment, we noticed the categories we had applied (full gene/operon; downstream of a gene; within a gene; promoter region) were arbitrary. Therefore, we categorized transcriptional units (TUs) according to the extent of occupancy by cyAbrB2. (Figures 4B and 4C)

      (14) Line 280-281: suggest replacing 'mediates' with 'influences'. 'Mediates' sounds like a direct interaction (for which the evidence is not currently strong without some additional biochemical data), but 'influences' could better accommodate both direct and indirect possibilities. 

      (15) Line 410: it is not clear what this means. 

      We have omitted “As a result, DNA ~600-fold condensed DNA than 3C samples were ligated.”, as it does not give any information about the experimental procedure.

    2. eLife assessment

      The authors provide solid data on a functional investigation of potential nucleoid-associated proteins and the modulation of chromosomal conformation in a model cyanobacterium. These valuable findings will be of interest to the chromosome and microbiology fields. Additional analysis and the tempering of conclusions have helped to improve the work, although further refinement remains possible.

    3. Reviewer #3 (Public Review):

      This work probes the control of the hox operon in the cyanobacterium Synechocystis, where this operon directs the synthesis of a bidirectional hydrogenase that functions to produce hydrogen. In assessing the control of the hox system, the authors focused on the relative contributions of cyAbrB2, alongside SigE (and to a lesser extent, SigA and cyAbrB1) under both aerobic and microoxic conditions. In mapping the binding sites of these different proteins, they discovered that cyAbrB2 bound many sites throughout the chromosome, repressed many of its target genes, and preferentially bound regions that were (relatively) rich in AT-residues. These characteristics led the authors to consider that cyAbrB2 may function as a nucleoid-associated protein (NAP) in Synechocystis, given the functional similarities with other NAPs like H-NS. They assessed the local chromosome conformation in both wild type and cyabrB2 mutant strains at multiple sites within a 40 kb window on either side of the hox locus, using a region within the hox operon as bait. They concluded that cyAbrB2 functions as a nucleoid associated protein that influences the activity of SigE through its modulation of chromosome architecture.

      The authors approached their experiments carefully, and the data were generally very clearly presented. At the same time, the overall work contains many lines of inquiry and different protein investigations that in some ways made it more challenging to identify the overall take-away message(s).

      Based on the data presented, the authors make a strong case for cyAbrB2 as a nucleoid-associated protein, given the multiple ways in which it seems to function similarly to the well-studied Escherichia coli H-NS protein. They now provide additional commentary that relates cyAbrB2 to other nucleoid-associated proteins.

      Previous work had revealed a role for SigE in the control of hox cluster expression, which nicely justified its inclusion (and focus) in this study. The focus on cyAbrB2 is also well-justified, given previous reports of its control of hox expression; however, it shares binding sites with an essential homologue cyAbrB1. Interestingly, while the B1 protein appears to bind similar sites, instead of repressing hox expression, it is known as an activator of this operon. If the information on cyAbrB1 is retained in the manuscript, it would be important to consider how cyAbrB1 activity might influence the results described here (although the authors could also consider removing the cyAbrB1 information to help improve the focus of the manuscript).

    1. eLife assessment

      In this manuscript the authors present high-speed atomic force microscopy (HSAFM) to analyze real-time structural changes in actin filaments induced by cofilin binding. This important study enhances our understanding of actin dynamics which plays a crucial role in a broad spectrum of cellular activities based on solid experimental evidence. Some technical questions, however, remain, making the data interpretation incomplete.

    2. Reviewer #1 (Public Review):

      The authors provided a detailed analysis of the real-time structural changes in actin filaments resulting from cofilin binding, using High-Speed Atomic Force Microscopy (HSAFM). The cofilin family controls the lifespan of actin filaments in cells by severing the filament and promoting depolymerization. Understanding the effects of cofilin on actin filament structure is critical. It is widely acknowledged that cofilin binding significantly shortens the pitch of the actin helix. The authors previously reported (1) that this shortening extends to the unbound region of the actin filament on the pointed end side of the cofilin binding cluster. In this study, the authors presented substantially improved AFM images and provided detailed accounts of the dynamics observed. It was found that a minimal cofilin-binding cluster, consisting of 2-4 molecules, could induce changes in the helical parameters over one or more actin crossover repeats. Adjacent to the cofilin-binding clusters, the actin crossovers were observed to shorten within seconds, and this shortening was limited to one side of the cluster. Additionally, phosphate binding to the actin filament was observed to stabilize the helical twist, suggesting a mechanism in which cofilin preferentially binds to ADP-bound actin filaments. These findings significantly advance our understanding of actin filament dynamics, which is essential for a wide range of cellular processes.

      However, two aspects remain insufficient. Readers should be aware of possible errors in the Mean Axial Distance (MAD) analysis and of the limitations of the discussion about the actin subunit structure.

      The authors have presented findings that the MAD within actin filaments exhibits a significant dependency on the helical twist. However, difficulty in determining each subunit interval from the AFM image might affect the analysis. For example, the observation of three peaks in HHP6 of Figure Supplement 6C, corresponding to 4.5 pairs, showed peak intervals of 5, 11.8, 8.7, and 5.7 nm (measured from the figure). The second region (11.8 nm) appears excessively long. If one peak is hidden in the second region, the MAD becomes 5.5 nm.

      The authors also suggest a strong link between the C-form (cofilin binding form of actin found in cofilactin) and the formation of regions of the short pitch helix outside the cofilin binding cluster. However, the AFM observation did not provide any evidence about the actin form in these regions because of measurement limitations. Additionally, Oda et al. (2) have demonstrated that the C-form is highly unstable in the absence of cofilin binding, casting doubt on the possibility of the C-form propagating without cofilin binding. The "C-actin-like structure" in the paper is not necessarily related to the C-form actin. It might be one of the G-forms (monomeric actin forms) or another unknown form.

      (1) K. X. Ngo et al., Cofilin-induced unidirectional cooperative conformational changes in actin filaments revealed by high-speed atomic force microscopy. eLife 4, (2015).
      (2) T. Oda et al., Structural Polymorphism of Actin. Journal of molecular biology 431, 3217-3228 (2019).

    3. Reviewer #2 (Public Review):

      Summary:

      This study by Ngo et al. uses mostly high-speed AFM to estimate conformational changes within actin filaments, as they get decorated by cofilin. The authors build on their earlier study (Ngo et al. eLife 2015) where they used the same technique to monitor the expansion of cofilin clusters on actin filaments, and the propagation of the associated conformational changes in the filament (reduction of the helical pitch). Here, they propose a higher-resolution description of the binding of cofilin to actin filaments.

      Strengths:

      The high speed AFM technique used here is quite original to address this question, compared to more classical light and electron microscopy techniques. It can certainly bring valuable information as it provides a high spatial resolution while monitoring live events. Also, in this paper, a nice effort was made to make the 3D structures and conformational changes clear and understandable.

      Weaknesses:

      In spite of the authors' response to my earlier comments, I still have concerns regarding the AFM technique, in particular the interactions of the filaments with the surface, which I still find unclear and potentially problematic.

      The filaments appear densely packed on the surface, and even clearly in register in some images (if not most images, e.g., Figs 3AD, 4BC, 5A, 8AC). I understand that there are practical reasons for this, but isn't there a risk that this could affect the result? Maybe I did not understand the authors' response well enough, but I did not see a clear control that would alleviate my concern.

      The properties of the lipid layer and its interaction with the actin filaments are still unclear to me. A poor control of these interactions is a problem if one aims to measure conformational changes at high resolution. The strength of the interaction appears tuned by the ratio of lipids put on the surface to change its electrostatic charge. A strong attachment likely does more than suppress torsional motion (as claimed in Fig 8A). It may also hinder cofilin binding in several ways (lower availability of binding sites on the filament facing the surface, electrostatic interactions between cofilin and the surface, etc.). Here again, I was not fully reassured by the authors' response.

      The identification of cofilactin regions relies on the additional height of the "peaks", due to the presence of cofilin. It thus seems that cofilin is detected every half helical pitch (HHP), and I still don't understand how the authors can make reliable claims regarding the presence or absence of cofilin between these peaks.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      The authors provided a detailed analysis of the real-time structural changes in actin filaments resulting from cofilin binding, using High-Speed Atomic Force Microscopy (HSAFM). The cofilin family controls the lifespan of actin filaments in the cells by severing the filament and promoting depolymerization. Understanding the effects of cofilin on actin filament structure is critical. It is widely acknowledged that cofilin binding significantly shortens the pitch of the actin helix. The authors previously reported (1) that this shortening extends to the unbound region of the actin filament on the pointed end side of the cluster. In this study, the authors presented substantially improved AFM images and provided detailed accounts of the dynamics observed. It was found that a minimal cofilin-binding cluster, consisting of 2-4 molecules, could induce changes in the helical parameters over one or more actin crossover repeats. Adjacent to the cofilin-binding clusters, the actin crossovers were observed to shorten within seconds, and this shortening was limited to one side of the cluster. Additionally, phosphate binding to the actin filament was observed to stabilize the helical twist, suggesting a mechanism in which cofilin preferentially binds to ADP-bound actin filaments. These findings significantly advance our understanding of actin filament dynamics, which is essential for a wide range of cellular processes.

      However, I propose that the sections about MAD and certain parts of the discussions need substantial revisions.

      In this study, we leverage high spatiotemporal resolutions of high-speed atomic force microscopy (HS-AFM) to analyze real-time structural changes in actin filaments induced by cofilin binding. Furthermore, we experimentally demonstrate the inherent variability in twist conformations of bare actin filaments. Our study integrates HS-AFM with Principal Component Analysis (PCA) to elucidate the actin structure-dependent preferential cooperative binding of cofilin. We provide experimental evidence to substantiate a "proof of principle" regarding the flexible helical twists of actin filaments that regulate the functions of actin-binding proteins. This important study enhances our understanding of actin filaments’ dynamics and polymorphic structures which play crucial roles in a broad spectrum of cellular activities.

      We appreciate the comments from Reviewer 1. Below, we address their concerns point by point.

      MAD analysis

      The authors have presented findings that the mean axial distance (MAD) within actin filaments exhibits a significant dependency on the helical twist, a conclusion not previously derived despite extensive analyses through electron microscopy (EM) and molecular dynamics (MD) simulations. Notably, the MAD values span from 4.5 nm (8.5 pairs per half helical pitch, HHP) to 6.5 nm (4.5 pairs/HHP) as depicted in Figure 3C. The inner domain (ID) of actin remains very similar across C, G, and F forms (2, 3), maintaining similar ID-ID interactions in both cofilactin and bare actin filaments and keeping the axial distance between subunits identical in both states. This suggests that the ID is unlikely to undergo significant structural changes, even with fluctuations in the filament's twist, so that the ID-ID interactions and the axial distances are preserved. The broad range of MAD values reported is therefore difficult to explain. A careful reassessment of the MAD analysis is recommended to ensure accuracy.

      The central challenge in studying “Protein Dynamics” in real time lies in bridging the gap in time scales: HS-AFM captures dynamics of proteins within the milliseconds to seconds range, whereas molecular dynamics (MD) simulations typically operate within the femtoseconds to microseconds domain. Protein dynamics encompass a spectrum of temporal scales, from atomic vibrations to molecular tumbling and collective motions in simulations. HS-AFM stands out as a potent technique for delving into protein dynamics, including processes like protein folding and conformational changes triggered by drugs or protein interactions. Additionally, a significant limitation of MD simulation is the spatial modeling constraint (~50 x 50 nm unit), which restricts the study of large complex biological systems. However, utilizing HS-AFM enables the construction of intricate protein models, facilitating the real-time imaging of their structures and dynamics during functional activity.

      Regarding the suggestion about ID-ID interactions in both cofilactin and bare actin filaments, maintaining identical axial distances (ADs) between subunits in both states, our HS-AFM cannot provide atomic-level structural insights to address this issue. However, we demonstrate that the variability of OD twists in actin protomers could potentially lead to globally shorter half helical pitches (HHPs) and fewer protomer pairs per HHP (Figure 2, Figure supplement 2) (see lines 218-222). The fluctuation in filament’s twist is further supported by currently available experimental data, including our findings (Figure 3C) in this study (see our Discussion in lines 555-560).

      The minimal change in local ID-ID interactions results in an unchanged global length of actin filaments in both cofilin-bound and unbound cases (Figure supplement 2). However, filament’s twists, as experimentally detected by EM, high-resolution interferometric scattering microscopy (iSCAT), HS-AFM, and in pseudo AFM, are changeable (see lines 555-560).

      We have additionally reassessed the fluctuation and dynamics of MAD in F-ADP-actin and F-ADP.Pi-actin over time at high temporal resolution (Figure supplement 3, Video 3, Table supplement 5). These data are further explained in the Results section (lines 264-270).

      Furthermore, we reassessed the broad range of MAD values in F-ADP-actin segments on both sides of large cofilin clusters over time (Figure supplement 8, Video 5). These findings are explained in the Results section (lines 333-337) and further discussed in the new results (lines 555-560).

      In determining axial distances, the authors extracted measurements from filament line profiles. It is advised to account for potential anomalies such as missing peaks or pseudo peaks, which could arise from noise interference. An example includes the observation of three peaks in HHP6 of Figure Supplement 5C, corresponding to 4.5 pairs. Peak intervals measured from the graph were 5, 11.8, 8.7, and 5.7 nm. The second region (11.8 nm) appears excessively long. If one peak is hidden in the second region, the MAD becomes 5.5 nm.

      We acknowledge the difficulty in identifying peaks within the regions of bare actin segments adjacent to cofilin clusters or within the cofilactin region. In the revised Figure supplement 6C (originally Figure supplement 5C), we did not assess peak intervals as suggested by Reviewer 1. The measurement of axial distance (AD) and the number of peaks within a HHP to calculate the correct MAD is further detailed in the Methods section (see HS-AFM data analysis and processing, highlighted in purple).

      Additionally, the purpose of presenting Figure supplements 6-7 is to directly compare the half helices and the number of protomer pairs per HHP between bare actin filaments and actin segments near the boundary between cofilactin and bare actin segments on the PE side in the same AFM images. In the original version of this paper, we avoided including the MAD values measured in the cofilactin region (HHP6, HHP7) in Figure Supplement 7E, to mitigate measurement errors.

      Compiling histograms of axial distances (ADs) rather than focusing solely on MAD may provide deeper insights. If the AD is too long or too short, the authors should suspect the presence of missing peaks or pseudo-peaks due to noise. If 4.4 or 5.5 pairs/HHP regions tend to contain missing peaks and 7.5-8.5 pairs/HHP regions tend to contain pseudo peaks, this may explain the MAD dependency on the helical twist.
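
      The screening the reviewer describes can be sketched as follows. This is only an illustration: the function name `flag_intervals` and the 0.5x and 1.5x median thresholds are assumptions chosen for the example, not values from the manuscript or the review. The input is the set of peak intervals the reviewer read off the figure (5, 11.8, 8.7, and 5.7 nm).

```python
from statistics import median


def flag_intervals(intervals_nm, low=0.5, high=1.5):
    """Label each axial interval relative to the median interval.

    'long'  : interval > high * median (a peak may have been missed)
    'short' : interval < low * median  (a pseudo-peak may be present)
    'ok'    : everything else
    The thresholds are hypothetical and would need tuning on real data.
    """
    m = median(intervals_nm)
    labels = []
    for d in intervals_nm:
        if d > high * m:
            labels.append("long")
        elif d < low * m:
            labels.append("short")
        else:
            labels.append("ok")
    return labels


# Peak intervals quoted by the reviewer for HHP6, in nm.
intervals = [5.0, 11.8, 8.7, 5.7]
labels = flag_intervals(intervals)
```

      With these assumed thresholds, only the 11.8 nm interval is labelled as suspiciously long, which matches the interval the reviewer singled out.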

      The measurement of axial distance (AD) and the number of peaks within a HHP to calculate the correct MAD is further detailed in the Methods section (see Analyses of pseudo AFM images of F-actin and C-actin structures constructed from existing PDB structures (e.g., Figure supplement 2); and HS-AFM data analysis and processing, highlighted in purple).

      We disagree with Reviewer 1’s suggestion that compiling histograms of ADs, rather than focusing solely on MAD, may provide deeper insights. AFM imaging provides only a 2-dimensional (2D) surface structure, unlike the 3-dimensional (3D) structure offered by Cryo-EM. In AFM imaging, we cannot capture the object from different angles as Cryo-EM does. Therefore, AD values measured in 2D AFM images do not accurately represent the axial distance between two adjacent protomers along the same actin filament. Consequently, we relied on MAD values. Our results, including the fluctuation in the number of protomer pairs per HHP, are further supported by other studies (see our Discussion in lines 555-560).

      Additionally, Figure 3E indicates a first decay constant of 0.14 seconds, substantially shorter than the frame rate (0.5 sec/frame). This suggests significant variations in line profiles between frames, attributable either to overly rapid dynamics or a low signal-to-noise ratio. Implementing running frame averages (of 2-3 frames) is recommended to distinguish between these scenarios. If the dynamics are indeed fast, the averaged frame's line profile may degrade, complicating peak identification. Conversely, if poor signal-to-noise ratio is the cause, averaging frames could facilitate peak detection. In the latter case, the authors can find the optimal number of frame averages and obtain better line profiles with fewer missing and pseudo-peaks.
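
      A running frame average of the kind recommended here might look like the sketch below. It assumes each frame has already been reduced to a line profile sampled at the same positions along the filament; the function name `running_frame_average` and the toy height values are hypothetical.

```python
def running_frame_average(profiles, window=3):
    """Average line profiles over a sliding window of consecutive frames.

    profiles: list of frames, each a list of heights sampled at the
              same positions along the filament.
    Returns len(profiles) - window + 1 averaged profiles.
    """
    if not 1 <= window <= len(profiles):
        raise ValueError("window must be between 1 and the number of frames")
    n_points = len(profiles[0])
    averaged = []
    for start in range(len(profiles) - window + 1):
        chunk = profiles[start:start + window]
        averaged.append(
            [sum(frame[i] for frame in chunk) / window for i in range(n_points)]
        )
    return averaged


# Toy height profiles (arbitrary units), four frames of two points each.
frames = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
smoothed = running_frame_average(frames, window=3)
```

      Comparing peak detection on the raw and averaged profiles for several window sizes is one way to separate the two scenarios the reviewer describes.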

      We utilized state-of-the-art HS-AFM with high temporal and spatial resolution to capture the dynamic structures of F-ADP-actin and F-ADP.Pi-actin segments at higher frame rates of 0.2 sec/frame and 0.1 sec/frame, respectively (Figure supplement 3). As suggested, we implemented running frame averages (3 frames) in the ACF analyses. Consistently, our results indicate that the first time constant (t1) remains around 0.1-0.4 seconds, independent of the imaging rate (0.1 – 0.5 sec/frame), for the AD between two adjacent actin protomers in F-actin bound with ADP or ADP.Pi (Table Supplement 5), which is in a similar range to the t1 shown in Figure 3E. These significant experimental results support the notion that helical twists, the number of actin protomers per HHP, and MAD in bare F-actin segments are intrinsically dynamic and fluctuate around the mean values over time (see further in lines 264-270; 333-337; and 555-560). It should be noted that our original ACF analyses did not include the averaging of running frames, thus eliminating the possibility of low signal/noise ratio in our analysis, as shown in Figure 3E-F.
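
      The response does not state which autocorrelation estimator was used. As a minimal sketch, assuming a mean-subtracted, biased estimator normalized to 1 at lag 0, an ACF of an axial-distance time series could be computed as below; the function name `autocorrelation` and the toy series are hypothetical. Fitting decay constants such as t1 to the resulting curve would be a separate step and is not shown.

```python
def autocorrelation(series, max_lag):
    """Normalized autocorrelation of a 1D time series for lags 0..max_lag.

    Uses the biased estimator: the lag-k sum of products of mean-subtracted
    values divided by the lag-0 sum, so the value at lag 0 is exactly 1.
    """
    n = len(series)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must be between 0 and len(series) - 1")
    mean = sum(series) / n
    centered = [x - mean for x in series]
    variance_sum = sum(c * c for c in centered)
    if variance_sum == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / variance_sum
        for lag in range(max_lag + 1)
    ]


# Toy alternating series (arbitrary units), for illustration only.
acf = autocorrelation([1.0, -1.0, 1.0, -1.0], 2)
```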

      Discussions

      The authors suggest a strong link between the C-form of actin and the formation of a short pitch helix. However, Oda et al. (3) have demonstrated that the C-form is highly unstable in the absence of cofilin binding, casting doubt on the possibility of the C-form propagating without cofilin binding. Moreover, in one strand of the cofilactin, interactions between actin subunits are limited to those between the inner domains (ID-ID interactions), which are quite similar to the interactions observed in bare actin filaments. This similarity implies that ID-ID interactions alone are insufficient to determine the helical parameters, suggesting that the presence of cofilin is essential for the formation of the short pitch helix in the cofilactin filament. Thus, crossover repeats are not necessarily shortened even if the actin form is C-form.

      We have experimentally observed a shortened bare half helix adjacent to cofilin clusters on the PE side at high spatial resolution, comprising fewer protomers than normal half helices. Thus, we hypothesized that crossover repeats are shortened if the actin protomers in the bare half helix neighboring the cofilin cluster on the PE side resemble a C-actin structure. This assumption is further explained by referring to the C-actin structure in Figure 2 and Figure supplement 2. Even though the C-form, as suggested in Oda et al., 2019, is unstable, it intrinsically fluctuates around the mean value over time and adopts various conformations. A single PDB structure resolved by Cryo-EM through the averaging of ensembles of structural images should be referenced as a single atomistic structure, one of many possible conformations, regardless of whether it is resolved by Cryo-EM, X-ray diffraction or crystallography, or NMR (see Figure 1, legend of Figure supplement 1).

      We highlight two main points regarding this issue: (1) The short helical pitch at the global scale is associated with the twisting of the OD at the local scale for individual protomers; (2) Actins in different nucleotide or cofilin bound states exhibit varying ranges, distributions, spectra, and variations of both local OD twist and global helical pitch (Figures 1-2, Figure supplements 1-2). The first point underscores that the twist/untwist of the OD determines the shortness of the helical pitches, rather than the ID-ID interactions. The latter point is more related to the global length of the filament. The minimal change in local ID-ID interactions results in an unchanged global length of actin filaments in both cofilin-bound and unbound cases (see pseudo AFM images in Figure supplement 2 for canonical actin filament and cofilactin segments with the same length, comprising 62 protomers). However, the filament’s twists, as experimentally detected by EM, high-resolution interferometric scattering microscopy (iSCAT), HS-AFM, and in pseudo AFM, are changeable (see lines 555-560) and independent of the ID-ID interactions.

      Narita (4) proposes that the facilitation of cofilin binding may occur through a shortening in the helix pitch, independent of a change to the C-form of actin. Furthermore, the dissociation of the D-loop from an adjacent actin subunit leads directly to the transition of actin to the G-form, which is considered the most stable configuration for the actin molecule (3).

      See also our explanation above. We have incorporated these points in the Discussion section. See lines 497-499; 510-511.

      Furthermore, our PCA analysis indicates that the transition from C-actin to G-actin necessitates the opening of the nucleotide cleft (resulting in a decrease in PC1) and is more readily achieved than the direct transition from F-actin to G-actin (which requires decreases in both PC1 and PC2). Whether this transition is directly triggered by the dissociation of the D-loop remains a topic for our future investigations. Our PCA analysis reveals that the D-loop is deeply buried within the core of the filament (Figure 2). Further experiments will be conducted to elucidate its roles.

      The mechanism by which the shortened pitch propagates remains a critical and unresolved issue. It appears that this propagation is not a result of the C-form's propagation but likely involves an unidentified mechanism. Identifying and understanding this mechanism represents an essential direction for future research.

      It's worth mentioning that our HS-AFM data and spatial ACF analysis lend support to a hypothesis suggesting that 2-4 bare actin protomers adjacent to cofilin clusters on the PE side adopt C-actin-like structures. Additionally, we have proposed several hypotheses aimed at better understanding the mechanisms driving the unidirectional binding and expansion of cofilin clusters toward the PE side. These hypotheses will require further examination in future experiments. Additional information can be found in lines 328-329; 344-351; and 416-430.

      (1) K. X. Ngo et al., Cofilin-induced unidirectional cooperative conformational changes in actin filaments revealed by high-speed atomic force microscopy. eLife 4, (2015).
      (2) K. Tanaka et al., Structural basis for cofilin binding and actin filament disassembly. Nature communications 9, 1860 (2018).
      (3) T. Oda et al., Structural Polymorphism of Actin. Journal of molecular biology 431, 3217-3228 (2019).
      (4) A. Narita, ADF/cofilin regulation from a structural viewpoint. Journal of muscle research and cell motility 41, 141-151 (2020).

      We have cited them accordingly in the paper.

      Reviewer #2 (Public Review):

      Summary:

      This study by Ngo et al. uses mostly high-speed AFM to estimate conformational changes within actin filaments, as they get decorated by cofilin. The authors build on their earlier study (Ngo et al. eLife 2015) where they used the same technique to monitor the expansion of cofilin clusters on actin filaments, and the propagation of the associated conformational changes in the filament (reduction of the helical pitch). Here, they propose a higher-resolution description of the binding of cofilin to actin filaments.

      Strengths:

      The high speed AFM technique used here is quite original to address this question, compared to classical light and electron microscopy techniques. It can certainly bring valuable information as it provides a high spatial resolution while monitoring live events. Also, in this paper, a nice effort was made to make the 3D structures and conformational changes clear and understandable.

      We are grateful for the positive feedback from Reviewer 2.

      Weaknesses:

      The paper also has a number of limitations, which I detail below.

      In addition to AFM, the authors also propose a Principal Component Analysis (PCA) of existing structural data on actin protomers. However, this part seems very similar to another published work by others (Oda et al. JMB 2019), which is not even cited.

      We addressed this issue and explained it in the Methods section, lines 612-621.

      The asymmetrical growth of cofilin clusters has so far only been seen using AFM, by the same authors (Ngo et al. eLife 2015). Using fluorescent microscopy, others have reported a very symmetrical expansion of cofilin clusters (Wioland et al. Curr Biol 2017). This is not mentioned at all, here. It should be discussed, and explanations for this discrepancy could be proposed.

      We have cited this paper (Wioland et al. Curr Biol 2017) in the current manuscript (see lines 361-362). However, we are unable to evaluate the technical distinctions between our methods and theirs. Instead, we have referred to a more recent paper that employed techniques similar to those used by Wioland et al. in Current Biology 2017. Our findings align with those reported by Bibeau JP et al. in the Journal of Molecular Biology 2021 (see their Results on page 7, titled “Cofilin clusters elongate preferentially towards the actin filament pointed end”). At the minimum, we believe this is appropriate.

      Regarding the AFM technique, I have the following concerns.

      The filaments appear densely packed on the surface, and even clearly in register in some images (if not most images, e.g., Figs 3A, 4BC, 5A). Why is that? Isn't there a risk that this could affect the result? This suggests there is some interaction between the filaments.

      In this study, as well as in many similar studies of actin filaments alone or in interaction with other actin binding proteins (ABPs) including cofilin, we have carefully considered the density of filaments when designing experiments. We used highly dense, but not packed, actin filaments to minimize free space between filaments and the surface, which helps maintain stable tip-scanning during AFM imaging. This strategy technically allows us to capture high spatial and temporal resolutions of actin filaments’ structures.

      The actin filaments, which resemble paracrystal structures, appear as densely packed filaments (see our data in Ngo and Kodera et al., eLife 2015, Figure 1C). Thus, the data presented in this paper are technically appropriate and do not risk misinterpretation due to lateral interactions impacting the structures and function of actin filaments and cofilin.

      The properties of the lipid layer and its interaction with the actin filaments are not clear at all. A poor control of these interactions is a problem if one aims to measure conformational changes at high resolution. The strength of the interaction appears tuned by the ratio of lipids put on the surface to change its electrostatic charge. A strong attachment likely does more than suppress torsional motion (as claimed in Fig 8A). It may also hinder cofilin binding in several ways (lower availability of binding sites on the filament facing the surface, electrostatic interactions between cofilin and the surface, etc.)

      We are confident that our lipid membrane bilayer is the optimal choice for immobilizing actin filaments in a controlled manner for HS-AFM experiments, achieved through the variation of positively charged lipids. In this study, we have fine-tuned the surface charge for our specific purposes.

      As an example, to capture high-spatial resolution images of actin structures (Figure 5-6, Figure supplement 5B, 6), we strongly fixed the filaments on DPPC/DPTAP (50/50 wt%) after the binding reaction between actin filaments and cofilin in solution was completed. This experiment yielded valuable information, including: (i) the ability to replicate the conformation of cofilactin and hybrid cofilactin/bare actin segments in solution, akin to the first steps in sample preparation for Cryo-EM techniques; and (ii) the capability to capture these structures, reflecting their solution states, by firmly fixing them on a lipid surface. On the lipid surface, these structures were retained stably during AFM imaging.

      If there is a choice, we advise against using amino-silane and other positively charged polymers typically used for modifying glass surfaces to fix actin filaments in studies using fluorescence microscopy. The strong immobilization by these chemicals can alter the structural dynamics and functions of actin filaments, lead to non-specific binding of cofilin on the modified glass surface, and potentially affect data interpretation.

      On a local scale, the reviewer may argue about the "lower availability of binding sites on the filament facing the surface". However, on a global scale, we maintain that two single strands forming helical twists of long F-actin segments should have an equal chance to bind cofilin even when fixed on a lipid membrane. The evidence shown in Figure 8A and Video 7, which demonstrates that small cofilin clusters associate and dissociate locally without developing into large clusters along the actin filament, supports our conclusion that flexibility and dynamics in helical twists play a crucial role in facilitating the binding and growth of cofilin clusters.

      The lipid surface utilized in our study with actin filaments and cofilin provides an ideal surface, as it is flat and minimizes the nonspecific binding of cofilin to the lipid membrane (see an example of the lipid surface in Video 5).

      How do we know that the variations over time are not mostly experimental noise, i.e. variations between repeats of the same measurement? As shown in Fig 3, correlation is mostly lost from one image to the next, and rather stable after that.

      This question is similar to the above question of Reviewer 1. Please also refer to our response in lines 264-270; 333-337; 555-560, measurement Methods, and Figure supplement 3 and Table supplement 5.

      The identification of cofilactin regions relies on the additional height of the "peaks", due to the presence of cofilin. It thus seems that cofilin is detected every half helical pitch (HHP), but not in between, thereby setting the resolution for the localization of cluster borders to one HHP. It thus seems difficult to claim that there is a change in helicity without cofilin decoration over this distance. In Fig 7, the change in helicity could be due to cofilin decoration that is undetected because cofilins have not yet reached the next peak.

      There are several important criteria to distinguish the "supertwisted half helix" in cofilactin region from the "normal half helix". As illustrated in the pseudo AFM images constructed for normal F-actin and C-actin segments (with and without cofilin decoration) from PDB structures, it is evident that these two structures differ significantly in length and the number of protomer pairs per HHP (see Figure Supplement 2). In both pseudo and experimental AFM images, these parameters can be easily detected by measuring the distance between two cross-over points. Furthermore, the height or thickness difference between the cofilactin and bare actin regions is approximately 10-15 Å, which is well resolved by HS-AFM due to its exceptional z-axis resolution of ~1 Å. Technically, we were able to detect these differences by creating a longitudinal section profile that covered both bare actin and cofilactin areas, as shown in Figure supplement 6.

      We experimentally reveal that a critical cofilin cluster comprising 2-4 molecules (Figures 5-6) or larger cofilin clusters (Figures 7-8, Figure Supplements 6-8) could equally supertwist a bare half helix on the PE side. The observation that a small cofilin cluster (2-4 molecules) can shorten a half helix by reducing the number of protomers per HHP to 9 or 11 (4.5 or 5.5 protomer pairs), which typically requires full decoration by 9-11 cofilin molecules, strongly suggests that supertwisting, or the change in helicity, does not always require complete cofilin decoration. We predicted that 2-4 bare actin protomers neighboring a cofilin cluster on the PE side can adopt the C-actin-like structure. See further in lines 324-329.

      Figure 7 captures a live binding event of cofilin at low spatial resolution, yet (i) the half helical pitches and (ii) the thickness of the cofilactin and bare actin segments can still be clearly distinguished. This demonstrates that changes in helicity within the cofilactin region propagate to an unbound half helix on the PE side, rearranging the helical twist by reducing the number of actin protomers per HHP, prior to recruiting additional cofilin for binding and expanding clusters.
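
      The cross-over-distance measurement described in this response can be illustrated with a minimal sketch. It assumes cross-over positions have already been extracted from a longitudinal section profile; the function names and example positions are hypothetical, and the 30 nm cutoff is borrowed from the manuscript sentence quoted elsewhere in these responses ("HHPs shorter than 30 nm"), not from a stated classification rule.

```python
# Hypothetical sketch: half helical pitches (HHPs) from cross-over positions.
# The 30 nm cutoff is taken from the quoted manuscript sentence about
# "supertwisted half helices with HHPs shorter than 30 nm".
SUPERTWIST_THRESHOLD_NM = 30.0


def half_helical_pitches(crossover_positions_nm):
    """Return the distances (nm) between consecutive cross-over points."""
    ordered = sorted(crossover_positions_nm)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def classify_half_helices(crossover_positions_nm,
                          threshold_nm=SUPERTWIST_THRESHOLD_NM):
    """Label each half helix 'supertwisted' or 'normal' by its length."""
    return [
        (hhp, "supertwisted" if hhp < threshold_nm else "normal")
        for hhp in half_helical_pitches(crossover_positions_nm)
    ]


# Made-up cross-over positions along one filament axis, in nm.
example_positions = [0.0, 36.0, 72.5, 99.5, 126.0]
for hhp, label in classify_half_helices(example_positions):
    print(f"{hhp:.1f} nm -> {label}")
```

      A length cutoff alone would not separate cofilin-decorated from bare supertwisted half helices; the response relies on the additional height difference for that distinction.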

      Reviewer #1 (Recommendations For The Authors):

      I believe C-form and G-form are better than C-actin like structure or G-actin like structure.

      We avoid using terms like "G-form", "F-form", or "C-form", as defined by Cryo-EM (Oda et al., 2019), because they refer to specific nucleotide and cofilin-bound states in other original papers. Instead, we use “G-actin”, “F-actin”, “C-actin”, “G-actin-like”, and “C-actin-like” to emphasize "Structural Dynamics" and "Structural Polymorphism". This highlights that even F-actin structures without cofilin bound can adopt "C-actin-like" conformations with fewer OD twists, resulting in a shorter global helical pitch. ADP-bound F-actins exhibit greater variability in helical twists than ADP-Pi-bound F-actin (Figure 9), indicating that ADP-bound F-actin protomers can adopt more C-actin-like conformations than ADP-Pi-bound F-actin protomers (Figure 1, Figure supplement 1).

      Technical terms describing actin structures do not need to be the same between Cryo-EM and HS-AFM, as the two techniques are fundamentally different. Our work underscores the importance of considering “structural dynamics and heterogeneity” in different nucleotide states of filamentous actin structures, both with and without cofilin, over time.

      Figure 1A

      A very similar analysis has already been performed by Oda et al (1). The authors should describe the relationships with the previous analysis.

      We addressed this issue in Methods – Principal component analysis – in lines 612-621.

      Figure 1B, C

      A very similar analysis has already been performed by Tanaka et al. (2). The authors should describe the relationship with the previous analysis.

      We addressed this issue in Methods – Principal component analysis – in lines 612-621 and legend of Figure 1.

      Lines 397-398

      "However, we noted that in rare instances, cofilin clusters also grew on both sides in the regular bare half helices when ATP or ADP was present."

      I believe other experiments also contain ATP in the solution. I could not catch the meaning of this sentence.

      We addressed this issue in the Results section, line 412. "However, we noted that in rare instances, cofilin clusters also grew on both sides in the regular bare half helices when only ADP was present."

      Additionally, we enhanced the description in the Methods section to avoid any confusion regarding nucleotides in the buffer. Please refer to the Methods section under “HS-AFM imaging”, lines 702-738.

      Lines 427-429

      "Consequently, the proportion of naturally supertwisted half helices with HHPs shorter than 30 nm was 5.8% for F-ADP-actin but only 1.1% and 0.2% for F-ADP.Pi-actin and phalloidin-stabilized F-actin, respectively."

      Similar discussion was made in (3) for the actin filaments with tension. It might be comparable with the current data.

      We cited it accordingly, line 447 for Okura et al., 2023.

      Lines 553-557

      "Nonetheless, it remains plausible that the structural flexibility exhibited by ADP-bound actin protomers could result in subtle variations in the conformations of the DNase binding loop (Dloop) G46-M47-G48-N49, as suggested in (Chou and Pollard, 2019). We suggest that the absence of bound Pi possibly increases the torsional flexibilities during helical twisting of ADP bound actin filaments in contrast to their ADP.Pi-bound counterparts."

      The crystal structure of the F-form (4) showed that Pi in ADP.Pi connects the two large domains of the actin molecule, stabilizing F-form. Pi release largely weakens the connection. This might be useful for the discussion.

      We incorporated this point with the suggested citation in lines 582-584.

      (1) T. Oda et al., Structural Polymorphism of Actin. Journal of molecular biology 431, 3217-3228 (2019).

      (2) K. Tanaka et al., Structural basis for cofilin binding and actin filament disassembly. Nature communications 9, 1860 (2018).

      (3) K. Okura et al., Mechanical Stress Decreases the Amplitude of Twisting and Bending Fluctuations of Actin Filaments. Journal of molecular biology 435, 168295 (2023).

      (4) Y. Kanematsu et al., Structures and mechanisms of actin ATP hydrolysis. Proceedings of the National Academy of Sciences of the United States of America 119, e2122641119 (2022).

      Reviewer #2 (Recommendations For The Authors):

      Line 190: "Noticeably, PCA analysis revealed higher structural flexibility in F-ADP-actin (red dots), exploring a larger space than F-ADP-Pi-actin structures (orange dots) within the F-actin cluster (inset in Figure 1A)". Is there a quantification to support this claim? Visually, things are not so clear.

      We have improved Figure 1 by adding 2 circles to an inset, providing clearer quantification to support our claim.

      In the PCA part: isn't it a bit obvious, or at least expected, that the conformation adopted by actin in the cofilactin structure is the most favorable one for binding cofilin?

      We agree with the reviewer on this point and have added it to the Results section, lines 202-204.

      I found it a bit unclear how the structures in Fig 2 were obtained.

      We further explained it by adding “Zoom-in views of these long filaments are shown in Figure 2” in Methods section, line 661.

      In the AFM images, the authors always seem to know the polarity of the filaments. Unless I missed it, how they know this is not explained. In their earlier work (Ngo et al. 2015) they used a subfragment of myosin II which indicates polarity when bound to F-actin. I found no such explanation here.

      We have addressed this issue in the legend of each figure accordingly.

      For clarity, I suggest writing "C-actin-like structures" (with two hyphens) rather than "C-actin like structures".

      We agree and are currently incorporating this change in the text.

      The term "cluster" in PCA can be confusing because it is used for cofilin clusters throughout the text.

      "Cluster" is a common term used in PCA analysis. To clarify, we revised the legends of Figure 1 and Figure Supplement 1 to use "PCA clusters", distinguishing them from “cofilin clusters” or “F-actin clusters”.

      There are many acronyms. Readability of the figure legends (which can be consulted independently from the main text) would be improved if acronyms were spelled out there as well.

      We have revised some of the acronyms in the legend of each figure accordingly. At a minimum, we believe this is appropriate.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      Key shortcomings include the unusual normalization strategies used for many experiments and the lack of quantification/statistical analyses for several experiments.

      In the updated version of the paper, we have addressed all of this reviewer's criticisms. Most importantly, we have performed several additional experiments to address the concern that unusual normalization strategies were used in our paper and that quantification and statistical analyses were lacking for several experiments. We have now analyzed the full set of release conditions for Shh and engineered proteins from Disp-expressing n.t. control cells and Disp-/- cells both in the presence and absence of Scube2 (Figure 1A'-D', Figure 2E added to the paper, Figure 3B'-D', Figure 5C and Figure S2F-H). Previously, we had only quantified protein release from n.t. controls and Disp-/- cells in the presence but not in the absence of Scube2 under serum-depleted conditions. Quantifications of serum-free protein release and Shh release under conditions ranging from 0.05% FCS to 10% FCS were completely missing from the earlier versions of the manuscript, but have now been added to our paper. In addition, we have reanalyzed all of the data sets in the above figures, as well as Figures 2C and S1B, to address the issue of "unusual normalization strategies": unlike previous assays in which the highest amount of protein detected in the media was set to 100% and all other proteins in that experiment were expressed relative to that value, we now directly compare the relative amounts of cellular and corresponding solubilized proteins as a method to quantify release without the need for data normalization (Figs. 1A'-D', 2C,E, 3B'-D', E, 5C, Fig. S1B, S2F-H).

      We have also repeated the qPCR analyses in C3H10T1/2 cells and now show that the same Shh/C25AShh activities can be observed when using another Shh responsive cell line, NIH3T3 cells (Fig. 4B, 6B, Fig. S5B).

      We would like to point out that if the criticism refers to the presentation of our RP-HPLC and SEC data, the normalization of the strongest eluted protein signal to 100% for all proteins tested is necessary to put their behavior in a clearer relationship. This is because only the relative positions of protein elution, and not their amounts, are important in these experiments.
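
      The difference between the two quantification strategies described above can be sketched as follows. This is only an illustration: the band intensities are invented, the condition labels and function names are hypothetical, and the media/cellular ratio is one assumed way of "directly comparing" cellular and solubilized protein, not the exact formula used in the paper.

```python
def normalize_to_max(media_signals):
    """Earlier strategy: the strongest media signal is set to 100%."""
    top = max(media_signals.values())
    return {cond: 100.0 * value / top for cond, value in media_signals.items()}


def release_ratio(cellular_signals, media_signals):
    """Revised strategy (assumed form): solubilized protein relative to the
    cellular protein of the same condition, with no cross-condition scaling."""
    return {
        cond: media_signals[cond] / cellular_signals[cond]
        for cond in media_signals
    }


# Invented band intensities (arbitrary units) for two conditions.
cellular = {"nt_control": 200.0, "disp_knockout": 400.0}
media = {"nt_control": 100.0, "disp_knockout": 50.0}

print(normalize_to_max(media))         # relative to the strongest media band
print(release_ratio(cellular, media))  # release per unit of cellular protein
```

      In this toy case the max-normalized values differ two-fold while the per-condition ratios differ four-fold, because only the second approach accounts for the amount of cellular precursor available for release.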

      The significance of the data provided is overstated because many of the presented experiments confirm/support previously published work.

      To mitigate the first reviewer's comment that the significance of the data presented is overstated, we now clearly distinguish between our novel results and the known aspect of Hh release on lipoproteins throughout our paper. We now clearly describe what is new and important in our paper: First, contrary to the general perception in the field, Disp and Scube2 are not sufficient to solubilize Shh, casting doubt on the currently accepted model that Scube2 accepts dual-lipidated Shh from Disp and transports it to the receptor Ptch. Second, lipoproteins shift dual Shh processing to N-terminal peptide processing only to generate different soluble Hh forms with different activities (as shown in Figure 4C). Third, and again contrary to popular belief, this new release mode does not inactivate Shh, as we now show in two established cellular assays for Hh biofunction (Figures 4A-C, 5B'', 6B and S5C-G). Fourth, and most importantly, we show that spatiotemporally controlled, Disp-, Scube2- and HDL-mediated Shh release absolutely requires dual lipidation of the membrane-associated Shh precursor prior to its release. This finding (as shown in Figures 1 and S2) changes the interpretation of previously published in vivo data that have long been interpreted as evidence for the requirement of dual Shh lipidation for full receptor binding and activation.

      The study provides a modest advance in our understanding of the complex issue of Shh membrane extraction.

      Although we agree that our results integrate our novel observations into previously established concepts of Hh release and trafficking, we also hope that our data cast well-founded doubt on the current view that the issue of Hh release and trafficking is largely resolved by the model of Disp-mediated Shh hand-over to Scube2 and then to Ptch, which requires interactions with both Shh lipids. Our data show that this is clearly not the case in the presence of lipoproteins. Thus, the significance of our data is that models of Shh lipid-regulated signaling to Ptch obtained using the dual-lipidated Shh precursor prior to its Disp- and Scube2-mediated conversion into a delipidated or monolipidated, HDL-associated soluble ligand are likely to describe a non-physiological interaction. Instead, our work describes a highly bioactive soluble ligand with only one lipid still attached, which has not been described before in the literature. The in vivo endpoint analyses presented in Fig. S8 suggest that this new protein variant is likely to play an important role during development.

      Reviewer #2 (Public Review):

      The precise molecular identity (of the released Shh) remains to be defined.

      We would like to respond that the direct comparison of soluble proteins and their well-defined double-lipidated precursors side-by-side in the same experiment, as shown in our paper, determines all relevant molecular changes in the Shh release process. Most importantly, we show by SDS-PAGE and RP-HPLC that HDL restricts Shh processing to the N-terminus and that the absence of HDL results in double processing of Shh during its release. We also show by SEC that the C-terminus binds the protein to HDL. In addition, the fly experiments confirm the requirement for N-terminal Hh processing, but not for processing of the C-terminal peptide, and suggest that the N-terminal Cardin-Weintraub sequence replaced by the functionally blocking tag represents the physiological cleavage site.

      It would be important to demonstrate key findings in cells that secrete Shh endogenously.

      We now confirm the key findings of our study in Panc1 cells that endogenously produce and secrete Shh: As shown in Fig. S1D, we find that soluble proteins are processed but retain the C-cholesterol, which we now directly confirm by RP-HPLC (Fig. S4F-H). The in vivo analyses shown in Fig. S8 suggest that the key finding - that N-terminal but not C-terminal Hh shedding is required for release - can be supported, at least in the fly: here, Hh variants impaired in their ability to be processed N-terminally strongly repress the endogenous protein, whereas the same protein impaired in its ability to be processed C-terminally does not.

      The authors detect Shh variants that are expressed independently of Disp and Scube2 in secretion assays, but are excluded from interpretation as experimental artifacts.

      We agree with the reviewer's criticism that the amounts of Shh released independently of Disp and Scube2 in secretion assays were not quantified and analyzed statistically to justify their proposed status as not physiologically relevant. We now show that these forms are indeed secretion artifacts (Fig. 3E and Fig. S2F-H show quantification of the lower electrophoretic mobility protein fraction (i.e., the "top" band representing the double-lipidated soluble protein fraction)) because this fraction is released independently of Disp and Scube2.

    2. eLife assessment

      This useful manuscript presents an analysis of different factors that are required for release of the lipid-linked morphogen Shh from cellular membranes. The evidence is still incomplete, as experiments rely on over-expression of Shh in a single cell line and are sometimes of a correlative nature. The study, which otherwise confirms and extends previous findings, will be of interest to developmental biologists who work on Hedgehog signaling.

    3. Reviewer #1 (Public Review):

      This manuscript presents a model in which combined action of the transporter-like protein DISP and the sheddases ADAM10/17 promote shedding of a mono-cholesteroylated Sonic Hedgehog (SHH) species following cleavage of palmitate from the dually lipidated precursor ligand. The authors propose that this leads to transfer of the cholesterol-modified SHH to HDL for solubilization. The minimal requirement for SHH release by this mechanism is proposed to be the covalently linked cholesterol modification because DISP could promote transfer of a cholesteroylated mCherry reporter protein to serum HDL. The authors used an in vitro system to demonstrate dependency on DISP/SCUBE2 for release of the cholesterol modified ligand. These results confirm previously published results from other groups (PMC3387659 and PMC3682496).

      A strength of the work is the use of a bicistronic SHH-Hhat system to consistently generate dually-lipidated ligand to determine the quantity and lipidation status of SHH released into cell culture media.

      Key shortcomings include the unusual normalization strategies used for many experiments and the lack of quantification/statistical analyses for several experiments. Due to these omissions, it is difficult to conclude that the data justify the conclusions. The significance of the data provided is overstated because many of the presented experiments confirm/support previously published work. The study provides a modest advance in understanding of the complex issue of SHH membrane extraction.

    4. Reviewer #2 (Public Review):

      Ehring et al. analyze contributions of Dispatched, Scube2, serum lipoproteins and Sonic Hedgehog lipid modifications to the generation of different Shh release forms. Hedgehog proteins are anchored in cellular membranes by N-terminal palmitate and C-terminal cholesterol modifications, yet spread through tissues and are released into the circulation. How Hedgehog proteins can be released, and in which form, remains controversial. The authors systematically dissect contributions of several previously identified factors, and present evidence that Disp, Scube2 and lipoproteins concertedly act to release a novel Shh variant that is cholesterol-modified but not palmitoylated. The results provide new insights into the function of Disp and Scube2 in Hedgehog release. The findings concerning the function of lipoproteins and cholesterol in Hedgehog release are largely confirmatory (PMID 23554573, 20685986). However, in light of the multitude of competing models for Hedgehog release, the present study is a valuable contribution that provides further insights into the relevance of lipoproteins in this process.

      A novel and surprising finding of the present study is the differential removal of Shh N- or C-terminal lipid anchors depending on the presence of HDL and/or Disp. In particular, the identification of a non-palmitoylated but cholesterol-modified Shh variant that associates with lipoproteins is potentially important. The authors use RP-HPLC and defined controls to assess the properties of processed Shh forms, but their precise molecular identity remains to be defined. A caveat is the strong reliance on over-expression of Shh in a single cell line. The authors detect Shh variants that are released independently of Disp and Scube2 in secretion assays, which however are excluded from interpretation as experimental artifacts. Thus, it would be important to demonstrate key findings in cells that secrete Shh endogenously.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      This manuscript builds upon the authors' previous work on the cross-talk between transcription initiation and post-transcriptional events in yeast gene expression. These prior studies identified an mRNA 'imprinting' phenomenon linked to genes activated by the Rap1 transcription factor (TF), a surprising role for the Sfp1 TF in promoting RNA polymerase II (RNAPII) backtracking, and a role for the non-essential RNAPII subunits Rpb4/7 in the regulation of mRNA decay and translation. Here the authors aimed to extend these observations to provide a more coherent picture of the role of Sfp1 in transcription initiation and subsequent steps in gene expression. They provide evidence for (1) a physical interaction between Sfp1 and Rpb4, (2) Sfp1 binding and stabilization of mRNAs derived from genes whose promoters are bound by both Rap1 and Sfp1 and (3) an effect of Sfp1 on Rpb4 binding or conformation during transcription elongation. 

      Strengths: 

      This study provides evidence that a TF (yeast Sfp1), in addition to stimulating transcription initiation, can at some target genes interact with their mRNA transcripts and promote their stability. Sfp1 thus has a positive effect on two distinct regulatory steps. Furthermore, evidence is presented indicating that strong Sfp1 mRNA association requires both Rap1 and Sfp1 promoter binding and is increased at a sequence motif near the polyA track of many target mRNAs. Finally, they provide compelling evidence that Sfp1-bound mRNAs have higher levels of RNAPII backtracking and altered Rpb4 association or conformation compared to those not bound by Sfp1. 

      Weaknesses: 

      The Sfp1-Rpb4 association is supported only by a two-hybrid assay that is poorly described and lacks an important control. Furthermore, there is no evidence that this interaction is direct, nor are the interaction domains on either protein identified (or mutated to address function). 

      Indeed, our two-hybrid, immunoprecipitation and imaging results do not allow us to conclusively discern whether the interaction between Rpb4 and Sfp1 is direct or indirect. While the interaction holds significance, we consider the direct versus indirect distinction to be of secondary importance in the context of this paper. In the current text we indicated that 'our two hybrid, immunoprecipitation and imaging results do not differentiate between a direct or indirect interactions' (see page 6, sentences highlighted in blue).

      The contention that Sfp1 nuclear export to the cytoplasm is transcription-dependent is not well supported by the experiments shown, which are not properly described in the text and are not accompanied by any primary data. 

      This section has been re-written for better clarity (see page 7). We note that this assay was originally developed and published by Lee, M. S., M. Henry, and P. A. Silver in their 1996 paper in G&D and has since been reported in numerous subsequent studies. Reassuringly, our conclusion is bolstered by the observation that Sfp1 binds to Pol II transcripts co-transcriptionally, suggesting that Sfp1 is exported in the context of the mRNA.

      The presence of Sfp1 in P-bodies is of unclear relevance and the authors do not ask whether Sfp1-bound mRNAs are also present in these condensates. 

      P-bodies consist of both RNA and proteins (reviewed in doi: 10.1021/acs.biochem.7b01162). The significance of this experiment lies in its contribution to further confirming the co-localization of Sfp1 with mRNAs and Rpb4. This observation could also yield valuable insights for future investigations into the role of Sfp1.

      Further analysis of Sfp1-bound mRNAs would be of interest, particularly to address the question of whether those from ribosomal protein genes and other growth-related genes that are known to display Sfp1 binding in their promoters are regulated (either stabilized or destabilized) by Sfp1. 

      Fig. 4A, C and D show that RP mRNAs become destabilized in sfp1Δ cells.

      The authors need to discuss, and ideally address, the apparent paradox that their previous findings showed that Rap1 acts to destabilize its downstream transcripts, i.e. that it has the opposite effect of Sfp1 shown here. 

      We would like to thank Reviewer 1 for this valuable comment. In the revised paper, we discuss our hypothesis that Rap1 is likely responsible for regulating the imprinting of other proteins, such as Rpb4, which in turn lead to the destabilization of mRNAs. See the blue paragraph on page 20.

      Finally, recent studies indicate that the drugs used here to measure mRNA stability induce a strong stress response accompanied by rapid and complex effects on transcription. Their relevance to mRNA stability in unstressed cells is questionable. 

      Half-lives were determined mainly by GRO analysis of optimally proliferating cells. This method does not require any drug or stressful treatment. The results obtained by this method were consistent with those obtained after thiolutin addition. Using both methods, we discovered that disruption of Sfp1 results in substantial mRNA destabilization. Nevertheless, in our revised manuscript, we show results obtained by subjecting cells to a temperature shift to 42°C, a natural method to inhibit transcription. This approach to determining half-lives has been previously reported in our publications, such as Lotan et al. (2005, 2007) and Goler Baron et al. (2008). This may rule out effects of the drug on half-lives. Indeed, this assay determines half-lives under heat stress. Thus, it can demonstrate that, at least during heat shock, Sfp1 stabilizes mRNAs. Since the results are similar to those obtained by the GRO method at 30°C, we concluded that Sfp1 stabilizes mRNA under both optimal and heat-stress conditions.
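
      For readers unfamiliar with transcription shut-off assays, the following sketch shows one common way a half-life could be extracted from such a time course. The response does not state how half-lives were computed; simple first-order decay is assumed here, and the function name and the time-course values are hypothetical.

```python
import math


def half_life_from_decay(times_min, mrna_levels):
    """Estimate a half-life (minutes) assuming first-order decay.

    Fits ln(level) against time by ordinary least squares; the decay
    constant k is the negative slope, and the half-life is ln(2) / k.
    """
    logs = [math.log(level) for level in mrna_levels]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    covariance = sum((t - mean_t) * (y - mean_y)
                     for t, y in zip(times_min, logs))
    variance = sum((t - mean_t) ** 2 for t in times_min)
    decay_constant = -covariance / variance
    return math.log(2) / decay_constant


# Invented time course after transcription shut-off: the level halves
# every 10 minutes, so the estimate should come out close to 10.
times = [0.0, 10.0, 20.0, 30.0]
levels = [100.0, 50.0, 25.0, 12.5]
print(half_life_from_decay(times, levels))
```

      Comparing such estimates between wild-type and sfp1Δ cells would only be meaningful if decay is roughly exponential over the sampled interval, which is itself an assumption of this sketch.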

      Reviewer #2 (Public Review): 

      Summary: 

      The manuscript by Kelbert et al. presents results on the involvement of the yeast transcription factor Sfp1 in the stabilisation of transcripts whose synthesis it stimulates. Sfp1 is known to affect the synthesis of a number of important cellular transcripts, such as many of those that code for ribosomal proteins. The hypothesis that a transcription factor can remain bound to the nascent transcript and affect its cytoplasmic half-life is attractive, but the methods used to demonstrate the half-life effects and the association of Sfp1 with cytoplasmic transcripts remain to be fully validated, as explained in my comments on the results below: 

      Comments on methodology and results: 

      (1) A two-hybrid-based assay for protein-protein interactions identified Sfp1, a transcription factor known for its effects on ribosomal protein gene expression, as interacting with Rpb4, a subunit of RNA polymerase II. Classical two-hybrid experiments depend on the presence of the tested proteins in the nucleus of yeast cells, suggesting that the observed interaction occurs in the nucleus. Unfortunately, the two-hybrid method cannot determine whether the interaction is direct or mediated by nucleic acids. 

      Indeed, our two-hybrid, immunoprecipitation and imaging results do not allow us to conclusively discern whether the interaction between Rpb4 and Sfp1 is direct or indirect. While the interaction holds significance, we consider the direct versus indirect distinction to be of secondary importance in the context of this paper. In the current text we indicated that 'our two hybrid, immunoprecipitation and imaging results do not differentiate between a direct or indirect interactions' (see page 6).

      (2) Inactivation of nup49, a component of the nuclear pore complex, resulted in the redistribution of GFP-Sfp1 into the cytoplasm at the temperature non-permissive for the nup49-313 strain, suggesting that GFP-Sfp1 is a nucleo-cytoplasmic shuttling protein. This observation confirmed the dynamic nature of the nucleo-cytoplasmic distribution of Sfp1. For example, a similar redistribution to the cytoplasm was previously reported following rapamycin treatment and under starvation (Marion et al., PNAS 2004). In conjunction with the observation of an interaction with Rpb4, the authors observed slower nuclear import kinetics for GFP-Sfp1 in the absence of Rpb4 when cells were transferred to a glucose-containing medium after a period of starvation. Since the redistribution of GFP-Sfp1 was abolished in an rpb1-1/nup49-313 double mutant, the authors concluded that Sfp1 localisation to the cytoplasm depends on transcription. The double mutant yeast cells may show a variety of non-specific effects at the restrictive temperature, and whether transcription is required for Sfp1 cytoplasmic localisation remains incompletely demonstrated. 

      We agree with Reviewer 2 that any heat inactivation of a temperature-sensitive (ts) protein can lead to non-specific effects. It is evident that nup49-313 does not prevent Sfp1 export to the cytoplasm. In the case of rpb1-1, these non-specific effects are expected due to transcriptional arrest, which can eventually result in a reduction in protein content. However, this process takes some time, while the impact on export is more rapid. It is worth noting that this assay was developed and previously published by Pam Silver (Henry and Silver G&D 1996) and has been reported in many subsequent papers. Importantly, our conclusion is supported by the observation that Sfp1 binds both nascent RNA (co-transcriptionally) and mature mRNA (cytoplasmic). These observations, along with the reduced mRNA export upon transcription blocking, are consistent with our proposal that Sfp1 is exported in association with mRNA.

      (3) Under starvation conditions, which led to the presence of Sfp1 in the cytoplasm and have previously been correlated with a decrease in the transcription of Sfp1 target genes, the authors observed that a plasmid-based expressed GFP-Sfp1 accumulated in cytoplasmic foci. These foci were also labelled by P-body markers such as Dcp2 and Lsm1. The quality of the microscopic images provided does not allow one to determine whether Rpb4-RFP colocalises with GFP-Sfp1.

      The submitted PDF figure is of low quality. We believe that the high-quality figure in the final submission is convincing.

      (4) To understand to which RNA Sfp1 might bind, the authors used an N-terminally tagged fusion protein in a cross-linking and purification experiment. This method identified 264 transcripts for which the CRAC signal was considered positive and which mostly correspond to abundant mRNAs, including 74 ribosomal protein mRNAs or metabolic enzyme-abundant mRNAs such as PGK1. The authors did not provide evidence for the specificity of the observed CRAC signal, in particular, what would be the background of a similar experiment performed without UV cross-linking. In a validation experiment, the presence of several mRNAs in a purified SFP1 fraction was measured at levels that reflect the relative levels of RNA in a total RNA extract. Negative controls showing that abundant mRNAs not found in the CRAC experiment were clearly depleted from the purified fraction with Sfp1 would be crucial to assessing the specificity of the observed protein-RNA interactions. The CRAC-selected mRNAs were enriched for genes whose expression was previously shown to be upregulated upon Sfp1 overexpression (Albert et al., 2019). The presence of unspliced RPL30 pre-mRNA in the Sfp1 purification was interpreted as a sign of co-transcriptional assembly of Sfp1 into mRNA, but in the absence of valid negative controls, this hypothesis would require further experimental validation.

      We would like to thank Reviewer 2 for bringing this issue up, as it helped us to clarify it in the revised paper.

      First, we emphasized in the Discussion that many CRAC+ genes do not fall into the category of highly transcribed genes. Please see more detailed discussion below.

      Secondly, we examined various features of the 264 genes - classified as CRAC+ - to estimate their specificity and biological significance. Our various experiments revealed that the CRAC+ genes represent a distinct group with many unique features.

      The biological significance of the 264 CRAC+ mRNAs was demonstrated by various experiments; all are inconsistent with technical flaws. In fact, all the experiments and analyses that we have pursued indicate the unique nature of the CRAC+ genes. Some examples are:

      (1) Fig. 2A and B show that most reads of CRAC+ mRNAs were mapped to a specific location, close to the pA sites.

      (2) Fig. 2C shows that most reads of CRAC+ mRNAs were mapped to a specific RNA motif located near the 3’ ends of the mRNAs.

      (3) Most RiBi CRAC+ promoters contain Rap1 binding sites (p = 1.9 × 10⁻²²), while the vast majority of RiBi non-CRAC+ promoters do not (Fig. 3C).

      (4) Fig. 4A shows that RiBi CRAC+ mRNAs become destabilized due to Sfp1 deletion, whereas RiBi non-CRAC+ mRNAs do not. Fig. 4B shows similar results due to Sfp1 depletion.

      (5) Fig. 6B shows that the impact of Sfp1 on backtracking is substantially higher for CRAC+ than for non-CRAC+ genes. This is most clearly visible in RiBi genes.

      (6) Fig. 7A shows that the Sfp1-dependent changes along the transcription units are substantially more pronounced for CRAC+ than for non-CRAC+ genes.

      (7) In Fig. S4B, the chromatin binding profile of Sfp1 is shown to be different for CRAC+ and non-CRAC+ genes.

      Taken together, the many unique features, in fact, any feature that we examined, indicate the specificity and significance of this group, demonstrating that our CRAC results are biologically significant.

      Most importantly, these genes do not all fall into the category of highly transcribed genes.  On the contrary, as depicted in Figure 6A (green dots), it is evident that CRAC+ genes exhibit a diverse range of Rpb3 ChIP and GRO signals. Furthermore, as illustrated in Figure 7A, when comparing CRAC+ to Q1 (the most highly transcribed genes), it becomes evident that the Rpb4/Rpb3 profile of CRAC+ genes behaves differently from the Q1 group. Evidently, despite the heterogeneous transcription of CRAC+ genes (as mentioned above), the Rpb4/Rpb3 profile decreases more substantially than that of the highly transcribed genes (Q1).  Moreover, despite similar expression levels among all RiBi mRNAs, only a portion of them binds Sfp1.

      Thus, all our results indicate that CRAC+ genes represent a biologically significant group, irrespective of the expression of its members. In response to this comment, we included a new paragraph discussing the validity of our conclusions. See page 18, blue paragraph.

      (5) To address the important question of whether co-transcriptional assembly of Sfp1 with transcripts could alter their stability, the authors first used a reporter system in which the RPL30 transcription unit is transferred to vectors under different transcriptional contexts, as previously described by the Choder laboratory (Bregman et al. 2011). While RPL30 expressed under an ACT1 promoter was barely detectable, the highest levels of RNA were observed in the context of the native upstream RPL30 sequence when Rap1 binding sites were also present. Sfp1 showed better association with reporter mRNAs containing Rap1 binding sites in the promoter region. However, removal of the Rap1 binding sites from the reporter vector also led to a drastic decrease in reporter mRNA levels. Whether the fraction of co-purified RNA is nuclear and co-transcriptional or not cannot be inferred from these results.

      The proposed co-transcriptional binding of Sfp1 is based on the findings presented in Figure 5C and Figure S2D, as well as the observed binding of Sfp1 to transcripts containing introns, as shown in Figures 2D and 3B.  The results of Fig. 3 led us to the assertion that the "RNA-binding capacity of Sfp1 is regulated by Rap1-binding sites located at the promoter." We maintain our stance on this conclusion. Indeed, the Rap1 binding site does impact mRNA levels, as highlighted by Reviewer 2. However, "construct E," which possesses a promoter with a Rap1 binding site, exhibits lower transcript levels compared to "construct F," which lacks such a binding site in its promoter. Despite this difference in transcript levels, Sfp1 was able to pull down the former transcript but not the latter, even though expression of the former gene is relatively low. Thus, the results appear to be more reliant on the specific capacity of Sfp1 to interact with the transcript rather than on the transcript's expression level.

      (6) To complement the biochemical data presented in the first part of the manuscript, the authors turned to the deletion or rapid depletion of SFP1 and used labelling experiments to assess changes in the rate of synthesis, abundance, and decay of mRNAs under these conditions. An important observation was that in the absence of Sfp1, mRNAs encoding ribosomal protein genes not only had a reduced synthesis rate but also an increased degradation rate. This important observation needs careful validation, as genomic run-on experiments were used to measure half-lives, and this particular method was found to give results that correlated poorly with other measures of half-life in yeast (e.g. Chappelboim et al., 2022 for a comparison). Similarly, the use of thiolutin to block transcription as a method of assessing mRNA half-life has been reported to be problematic, as thiolutin can specifically inhibit the degradation of ribosomal protein mRNA (Pelechano & Perez-Ortin, 2008). Specific repressible reporters, such as those used by Baudrimont et al. (2017), would need to be tested to validate the effect of Sfp1 on the half-life of specific mRNAs. Also, it would be very difficult to infer from the images presented whether the rate of deadenylation is altered by Sfp1.

      Various methods exist for assessing mRNA half-lives (HLs), and each of them carries its own set of challenges and biases. Consequently, it becomes problematic to directly compare HL values of a specific mRNA when different methods are employed. In our opinion, the superiority of one particular method over the others remains unclear. However, each method is highly reliable when it comes to comparing different strains under identical conditions using that single method.

      Estimating HLs through the GRO approach is a non-invasive method, applied to optimally proliferating cells, which has been employed in numerous publications. While no method is without its limitations, our experience over the years has reassured us that this approach is among the most dependable. Our HL determination using thiolutin to block transcription provided results that were consistent with the values obtained by the GRO approach.

      Nevertheless, in our revised manuscript, we supplemented the HL data obtained with thiolutin with results obtained by subjecting cells to a temperature shift to 42°C, a natural method to block transcription in wild-type (WT) cells. This approach to determine HLs has been previously reported in our publications, such as Lotan et al. (2005, 2007) and Goler Baron et al. (2008). The new results are shown in Fig. S3B. They are consistent with our conclusion that Sfp1 stabilizes mRNAs.

      Using a repressible promoter to determine mRNA HL is, unfortunately, not suitable in this paper because the promoter itself is involved in HL regulation. This observation is supported by Bregman et al. (2011) and depicted in Fig. 3, which illustrates that the promoter is critical for mRNA imprinting, consequently regulating HL.
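      The transcription shut-off approaches discussed in this response (thiolutin or a shift to 42°C) generally rest on the same arithmetic: once synthesis is blocked, the remaining mRNA is followed over time and a decay constant is fitted. The sketch below is only an illustration of that calculation under a simple first-order decay assumption; it is not the authors' analysis pipeline. The function name `estimate_half_life`, the time points, and the synthetic 12-minute half-life are all hypothetical.

```python
import numpy as np

def estimate_half_life(times_min, mrna_levels):
    """Estimate an mRNA half-life from a transcription shut-off time course.

    Assumes first-order decay, level(t) = A0 * exp(-k * t), and fits a
    straight line to ln(level) versus time by least squares.
    Returns (half_life, k) in the units of `times_min`.
    """
    t = np.asarray(times_min, dtype=float)
    log_level = np.log(np.asarray(mrna_levels, dtype=float))
    slope, _intercept = np.polyfit(t, log_level, 1)
    k = -slope  # decay constant; must be positive for a decaying transcript
    if k <= 0:
        raise ValueError("No decay detected in the time course")
    return np.log(2) / k, k

# Synthetic, noise-free example (hypothetical values, not data from the paper)
times = np.array([0.0, 5.0, 10.0, 20.0, 40.0])   # minutes after shut-off
true_half_life = 12.0                             # minutes (assumed)
levels = 100.0 * np.exp(-np.log(2) / true_half_life * times)

half_life, k = estimate_half_life(times, levels)
print(f"estimated half-life: {half_life:.2f} min")
```

      With real measurements the fit would also need replicate handling and a normalisation control, which this sketch leaves out.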

      (7) The effects of SFP1 on transcription were investigated by chromatin purification with Rpb3, a subunit of RNA polymerase, and the results were compared with synthesis rates determined by genomic run-on experiments. The decrease in Pol II presence on transcripts in the absence of SFP1 was not accompanied by a marked decrease in transcript output, suggesting an effect of Sfp1 in ensuring robust transcription and avoiding RNA polymerase backtracking. To further investigate the phenotypes associated with the depletion or absence of Sfp1, the authors examined the presence of Rpb4 along transcription units compared to Rpb3. One effect of sfp1 deficiency was that this ratio, which decreased from the start of transcription towards the end of transcripts, increased slightly. The results presented are largely correlative and could arise from the focus on very specific types of mRNAs, such as those of ribosomal protein genes, which are sensitive to stress and are targeted by very active RNA degradation mechanisms activated, for example, under heat stress (Bresson et al., 2020).

      Figure 7A illustrates a significant reduction in Rpb4/Rpb3 ratios along the transcription unit in WT cells. This reduction is notably more pronounced in CRAC+ genes compared to the highly transcribed quartile (Q1), which includes all ribosomal protein (RP) genes, and it is completely absent in sfp1∆ cells. Furthermore, it's important to highlight that the CRAC+ gene group displays a wide range of transcription rates, as measured by either Rpb3 ChIP or GRO (Figure 6A). Given these observations, we do not think that heightened sensitivity of RP mRNA degradation in response to stress is responsible for the pronounced difference in the configuration of the Pol II elongation complex that is detected in CRAC+ genes, mainly because this experiment was performed under standard (non-stress) culture conditions.

      Correlative studies are particularly informative when a gene mutation eliminates a correlation, and this is precisely the type of study depicted in Figure 7B-C. The correlations shown in these panels are dependent on Sfp1. Indeed, RP genes are sensitive to stress. However, we used non-stressed conditions. Furthermore, CRAC+ genes did not display any apparent unusual destabilization but rather exhibited higher (not lower) mRNA stability compared to non-CRAC+ genes (Figure 7C).

    1. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Kelbert et al. presents results on the involvement of the yeast transcription factor Sfp1 in the stabilisation of transcripts whose synthesis it stimulates. Sfp1 is known to affect the synthesis of a number of important cellular transcripts, such as many of those that code for ribosomal proteins. The hypothesis that a transcription factor can remain bound to the nascent transcript and affect its cytoplasmic half-life is attractive, but the methods used to demonstrate the half-life effects and the association of Sfp1 with cytoplasmic transcripts remain to be fully validated, as explained in my comments on the results below:

      Comments on methodology and results:

      (1) A two-hybrid-based assay for protein-protein interactions identified Sfp1, a transcription factor known for its effects on ribosomal protein gene expression, as interacting with Rpb4, a subunit of RNA polymerase II. Classical two-hybrid experiments depend on the presence of the tested proteins in the nucleus of yeast cells, suggesting that the observed interaction occurs in the nucleus. Unfortunately, the two-hybrid method cannot determine whether the interaction is direct or mediated by nucleic acids.

      (2) Inactivation of nup49, a component of the nuclear pore complex, resulted in the redistribution of GFP-Sfp1 into the cytoplasm at the temperature non-permissive for the nup49-313 strain, suggesting that GFP-Sfp1 is a nucleo-cytoplasmic shuttling protein. This observation confirmed the dynamic nature of the nucleo-cytoplasmic distribution of Sfp1. For example, a similar redistribution to the cytoplasm was previously reported following rapamycin treatment and under starvation (Marion et al., PNAS 2004). In conjunction with the observation of an interaction with Rpb4, the authors observed slower nuclear import kinetics for GFP-Sfp1 in the absence of Rpb4 when cells were transferred to a glucose-containing medium after a period of starvation. Since the redistribution of GFP-Sfp1 was abolished in an rpb1-1/nup49-313 double mutant, the authors concluded that Sfp1 localisation to the cytoplasm depends on transcription. The double mutant yeast cells may show a variety of non-specific effects at the restrictive temperature, and whether transcription is required for Sfp1 cytoplasmic localisation remains incompletely demonstrated.

      (3) Under starvation conditions, which led to the presence of Sfp1 in the cytoplasm and have previously been correlated with a decrease in the transcription of Sfp1 target genes, the authors observed that a plasmid-based expressed GFP-Sfp1 accumulated in cytoplasmic foci. These foci were also labelled by P-body markers such as Dcp2 and Lsm1. The quality of the microscopic images provided does not allow one to determine whether Rpb4-RFP colocalises with GFP-Sfp1.

      (4) To understand to which RNA Sfp1 might bind, the authors used an N-terminally tagged fusion protein in a cross-linking and purification experiment. This method identified 264 transcripts for which the CRAC signal was considered positive and which mostly correspond to abundant mRNAs, including 74 ribosomal protein mRNAs or metabolic enzyme-abundant mRNAs such as PGK1. The authors did not provide evidence for the specificity of the observed CRAC signal, in particular, what would be the background of a similar experiment performed without UV cross-linking. In a validation experiment, the presence of several mRNAs in a purified SFP1 fraction was measured at levels that reflect the relative levels of RNA in a total RNA extract. Negative controls showing that abundant mRNAs not found in the CRAC experiment were clearly depleted from the purified fraction with Sfp1 would be crucial to assessing the specificity of the observed protein-RNA interactions. The CRAC-selected mRNAs were enriched for genes whose expression was previously shown to be upregulated upon Sfp1 overexpression (Albert et al., 2019). The presence of unspliced RPL30 pre-mRNA in the Sfp1 purification was interpreted as a sign of co-transcriptional assembly of Sfp1 into mRNA, but in the absence of valid negative controls, this hypothesis would require further experimental validation.

      (5) To address the important question of whether co-transcriptional assembly of Sfp1 with transcripts could alter their stability, the authors first used a reporter system in which the RPL30 transcription unit is transferred to vectors under different transcriptional contexts, as previously described by the Choder laboratory (Bregman et al. 2011). While RPL30 expressed under an ACT1 promoter was barely detectable, the highest levels of RNA were observed in the context of the native upstream RPL30 sequence when Rap1 binding sites were also present. Sfp1 showed better association with reporter mRNAs containing Rap1 binding sites in the promoter region. However, removal of the Rap1 binding sites from the reporter vector also led to a drastic decrease in reporter mRNA levels. Whether the fraction of co-purified RNA is nuclear and co-transcriptional or not cannot be inferred from these results.

      (6) To complement the biochemical data presented in the first part of the manuscript, the authors turned to the deletion or rapid depletion of SFP1 and used labelling experiments to assess changes in the rate of synthesis, abundance, and decay of mRNAs under these conditions. An important observation was that in the absence of Sfp1, mRNAs encoding ribosomal protein genes not only had a reduced synthesis rate but also an increased degradation rate. This important observation needs careful validation, as genomic run-on experiments were used to measure half-lives, and this particular method was found to give results that correlated poorly with other measures of half-life in yeast (e.g. Chappelboim et al., 2022 for a comparison). Similarly, the use of thiolutin to block transcription as a method of assessing mRNA half-life has been reported to be problematic, as thiolutin can specifically inhibit the degradation of ribosomal protein mRNA (Pelechano & Perez-Ortin, 2008). Specific repressible reporters, such as those used by Baudrimont et al. (2017), would need to be tested to validate the effect of Sfp1 on the half-life of specific mRNAs. Also, it would be very difficult to infer from the images presented whether the rate of deadenylation is altered by Sfp1.

      (7) The effects of SFP1 on transcription were investigated by chromatin purification with Rpb3, a subunit of RNA polymerase, and the results were compared with synthesis rates determined by genomic run-on experiments. The decrease in Pol II presence on transcripts in the absence of SFP1 was not accompanied by a marked decrease in transcript output, suggesting an effect of Sfp1 in ensuring robust transcription and avoiding RNA polymerase backtracking. To further investigate the phenotypes associated with the depletion or absence of Sfp1, the authors examined the presence of Rpb4 along transcription units compared to Rpb3. One effect of sfp1 deficiency was that this ratio, which decreased from the start of transcription towards the end of transcripts, increased slightly. The results presented are largely correlative and could arise from the focus on very specific types of mRNAs, such as those of ribosomal protein genes, which are sensitive to stress and are targeted by very active RNA degradation mechanisms activated, for example, under heat stress (Bresson et al., 2020).

      Strengths:

      - Diversity of experimental approaches used
      - Validation of large-scale results with appropriate reporters

      Weaknesses:

      - Choice of evaluation method to test mRNA half-life
      - Lack of controls for the CRAC results

    1. Reviewer #2 (Public Review):

      In this work, Sarkar et al. investigated the potential ability of adenosine triphosphate (ATP) as a solubilizer of protein aggregates by combining MD simulations and ThT/TEM experiments. They explored how ATP influences the conformational behaviors of Trp-cage and β-amyloid Aβ40 proteins. Currently, there are no experiments in the literature supporting their simulation results of ATP on Trp-cage. The simulation protocol employed for the Aβ40 monomer system is conventional MD simulation, while REMD simulation (an enhanced sampling method) is used for the Aβ monomer + ATP system. It is not clear whether the conformational difference is caused by ATP or by the different simulation methods used. ThT/TEM experiments should be performed on Aβ40 fibrils rather than on Aβ(16-22) aggregates. Moreover, to elucidate their experimental results that ATP can dissolve preformed Aβ fibrils, the authors need to study the influence of ATP on Aβ fibrils instead of on Aβ dimer in their MD simulations. The novelty of this study is limited. The role of ATP in inhibiting Aβ fibril formation and dissolving preformed Aβ fibrils has been reported in previous experimental and computational studies (Journal of Alzheimer's Disease, 2014, 41: 561; Science 2017, 356, 753-756; J. Phys. Chem. B 2019, 123, 9922−9933; Scientific Reports, 2024, 14: 8134). However, most of those papers are not discussed in this manuscript. Additionally, some details of MD simulations and data analysis are missing in the manuscript, including the initial structures of all the simulations, the method for free energy calculation, the dielectric constant used, etc.

    2. eLife assessment

      The authors combined molecular dynamics simulations and experiments to study the role of ATP as a hydrotrope of protein aggregates. The topic is of major current interest and thus the study potentially makes a useful contribution to the community. In the current form, however, the level of evidence from the computation is considered incomplete, due to several issues such as limited convergence test, analysis, and the very high ATP concentration used in the simulation.

    3. Reviewer #1 (Public Review):

      Summary:

      This work combines molecular dynamics (MD) simulations along with experimental elucidation of the efficacy of ATP as a biological hydrotrope. While ATP is broadly known as the energy currency, it has also been suggested to modulate the stability of biomolecules and their aggregation propensity. In the computational part of the work, the authors demonstrate that ATP increases the population of the more expanded conformations (higher radius of gyration) in both a soluble folded mini-protein Trp-cage and an intrinsically disordered protein (IDP) Aβ40. Furthermore, ATP is shown to destabilise the pre-formed fibrillar structures using both simulation and experimental data (ThT assay and TEM images). They have also suggested that the biological hydrotrope ATP has significantly higher efficacy as compared to the commonly used chemical hydrotrope sodium xylene sulfonate (NaXS).

      Strengths:

      This work presents a comprehensive and compelling investigation of the effect of ATP on the conformational population of two types of proteins: globular/folded and IDP. The role of ATP as an "aggregate solubilizer" of pre-formed fibrils has been demonstrated using both simulation and experiments. They also elucidate the mechanism of action of ATP as a multi-purpose solubilizer in a protein-specific manner. Depending on the protein, it can interact through electrostatic interactions (for predominantly charged IDPs like Aβ40), or primarily through van der Waals interactions (for Trp-cage).

      Weaknesses:

      The data presented by the authors are sound and adequately support the conclusions drawn by the authors. However, there are a few points that could be discussed or elucidated further to broaden the scope of the conclusions drawn in this work as discussed below:

      (i) The concentration of ATP used in the simulations is significantly higher (500 mM) as compared to those used in the experiments (6-20 mM) or cellular cytoplasm (~5 mM as mentioned by the authors). Since the authors mention already known concentration dependence of the effect of ATP, it is worth clarifying the possible limitations and implications of the high ATP concentrations in the simulations. It seems ATP can stabilise the proteins at low concentrations, but the current work does not address this possible effect. It would be interesting to see whether the effect of ATP on globular proteins and IDPs remains similar even at lower ATP concentrations.

      (ii) The authors make a somewhat ambitious statement that the role of ATP as a solubilizer of pre-formed fibrils could be used as a therapeutic strategy in protein aggregation-related diseases. However, it is not clear how it would be so since ATP is a promiscuous substrate in several biochemical processes and any additional administration of ATP beyond normal cellular concentration (~5 mM) could be detrimental.

      (iii) A natural question arises about what is so special about ATP as a solubilizer. The authors have also asked this question but in a limited scope of comparing to a commonly used chemical hydrotrope NaXS. However, a bigger question would be what kind of chemical/physical features make ATP special? For example, (i) if the amphiphilic property is important, what about some standard surfactants? (ii) how would ATP compare to other nucleotides like ADP or GTP? It might be useful to explore such questions in the future to further establish the special role of ATP in this regard.

      (iv) In Figure 2F, it seems that in the presence of 0.5 M ATP, the Rg increases (as expected), but the number of native contacts remains almost similar. The reduction in the number of native contacts at higher ATP concentrations is not as dramatic as the increase in Rg. This is somewhat counterintuitive and should be looked into. Normally one would expect a monotonic reduction in the number of native contacts as the protein unfolds (increase in Rg).
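      For readers less familiar with the two observables compared in point (iv), the sketch below shows how a radius of gyration and a fraction of native contacts are commonly computed from a single set of coordinates. It is a minimal illustration, not the analysis used in the manuscript: the function names, the 2.5-unit distance cutoff, and the four-particle toy structure are all assumptions made for the example.

```python
import numpy as np

def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of an (N, 3) coordinate array."""
    coords = np.asarray(coords, dtype=float)
    if masses is None:
        masses = np.ones(len(coords))  # equal weights if no masses are given
    masses = np.asarray(masses, dtype=float)
    center = np.average(coords, axis=0, weights=masses)
    sq_dist = np.sum((coords - center) ** 2, axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=masses)))

def native_contact_fraction(coords, native_pairs, cutoff):
    """Fraction of the listed native pairs whose distance is below `cutoff`."""
    coords = np.asarray(coords, dtype=float)
    i, j = np.asarray(native_pairs).T
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    return float(np.mean(dist < cutoff))

# Toy structure: four particles on the corners of a square of side 2
# (hypothetical coordinates in arbitrary length units)
toy_coords = [[1, 1, 0], [1, -1, 0], [-1, 1, 0], [-1, -1, 0]]
toy_native_pairs = [(0, 1), (0, 3)]  # assumed "native" pairs for illustration

rg = radius_of_gyration(toy_coords)
q = native_contact_fraction(toy_coords, toy_native_pairs, cutoff=2.5)
print(rg, q)
```

      Because Rg is a global measure while the contact fraction depends only on the listed pairs, the two quantities are computed independently and have to be compared frame by frame across a trajectory.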

    4. Reviewer #3 (Public Review):

      Summary:

      Since its first experimental report in 2017 (Patel et al. Science 2017), there have been several studies on the phenomenon in which ATP functions as a biological hydrotrope of protein aggregates. In this manuscript, by conducting molecular dynamics simulations of three different proteins, Trp-cage, Abeta40 monomer, and Abeta40 dimer at a high concentration of ATP (0.1, 0.5 M), Sarkar et al. find that the amphiphilic nature of ATP, arising from its molecular structure consisting of a phosphate group (PG), sugar ring, and aromatic base, enables it to interact with proteins in a protein-specific manner, preventing their aggregation and solubilizing them if they do aggregate. The authors also point out that in comparison with NaXS, which is the traditional chemical hydrotrope, ATP is more efficient in solubilizing protein aggregates because of its amphiphilic nature.

      Trp-cage, featured with a hydrophobic core in its native state, is denatured at high ATP concentration. The authors show that the aromatic base group (purine group) of ATP is responsible for inducing the denaturation of helical motifs in the native state.

      For Abeta40, which can be classified as an IDP with charged residues, it is shown that ATP disrupts the salt bridge (D23-K28) required for the stability of beta-turn formation.

      By showing that ATP can disassemble preformed protein oligomers (Abeta40 dimer), the authors argue that ATP is "potent enough to disassemble existing protein droplets, maintaining proper cellular homeostasis," and enhancing solubility.

      Overall, the message of the paper is clear and straightforward to follow. I did not follow all the literature, but I see in the literature search, that there are several studies on this subject. (J. Am. Chem. Soc. 2021, 143, 31, 11982-11993; J. Phys. Chem. B 2022, 126, 42, 8486-8494; J. Phys. Chem. B 2021, 125, 28, 7717-7731; J. Phys. Chem. B 2020, 124, 1, 210-223).

      If this study is indeed the first one to test using MD simulations whether ATP is a solubilizer of protein aggregates, it may deserve some attention from the community. But, the authors should definitely discuss the content of existing studies, and make it explicit what is new in this study.

      Strengths:

      The authors showed that due to its amphiphilic nature, ATP can interact with different proteins in a protein-specific manner, a finding more general and specific than merely calling ATP a biological hydrotrope.

      Weaknesses:

      (1) My only major concern is that the simulations were performed at unusually high ATP concentrations (100 and 500 mM of ATP), whereas the real cellular concentration of ATP is 1-5 mM. Even if ATP is a good solubilizer of protein aggregates, the actual concentration should matter. I was wondering if there is a previous report on a titration curve of protein aggregates against ATP, and what is the transition mid-point of ATP-induced solubility of protein aggregates.

      For instance, urea or GdmCl have long been known as the non-specific denaturants of proteins, and it has been well experimented that their transition mid-point of protein unfolding is ~(1 - 6) M depending on the proteins.

      (2) The sentence "... a clear shift of relative population of Abeta40 conformational subensemble towards a basin with higher Rg and lower number of contacts in the presence of ATP" is not a precise description of Figures 4A and 4B. It is not clear from the figures whether the Rg of Abeta40 is increased when Abeta40 is subject to ATP. The authors should give a more precise description of what is observed in the result from their simulations or consider a better-order parameter to describe the change in molecular structure. In addition, the disruption of beta-sheet from Figure 4E to 4F is not very clear. The authors may want to use an arrow to indicate the region of the contact map associated with this change.

      Although fully atomistic simulations were carried out, the analyses presented in this study are a bit rudimentary and coarse-grained (e.g., Rg is a rather poor order parameter for discussing protein dynamics). The authors could go further and say more about how ATP interacts with proteins and disrupts the stable configurations.

      (3) Although the amphiphilic character of ATP is highlighted, a similar comment can be made as to GTP. Is GTP, whose cellular concentration is ~0.5 mM, also a good solubilizer of protein aggregates? If not, why? Please comment.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Major comments (Public Reviews)

      Generality of grid cells

      We appreciate the reviewers’ concern regarding the generality of our approach, and in particular for analogies in nonlinear spaces. In that regard, there are at least two potential directions that could be pursued. One is to directly encode nonlinear structures (such as trees, rings, etc.) with grid cells, to which DPP-A could be applied as described in our model. The TEM model [1] suggests that grid cells in the medial entorhinal cortex may form a basis set that captures structural knowledge for such nonlinear spaces, such as social hierarchies and transitive inference when formalized as a connected graph. Another would be to use eigen-decomposition of the successor representation [2], a learnable predictive representation of possible future states that has been shown by Stachenfeld et al. [3] to provide an abstract structured representation of a space that is analogous to the grid cell code. This general-purpose mechanism could be applied to represent analogies in nonlinear spaces [4], for which there may not be a clear factorization in terms of grid cells (i.e., distinct frequencies and multiple phases within each frequency). Since the DPP-A mechanism, as we have described it, requires representations to be factored in this way, it would need to be modified for such a purpose. Either of these approaches, if successful, would allow our model to be extended to domains containing nonlinear forms of structure. To the extent that different coding schemes (i.e., basis sets) are needed for different forms of structure, the question of how these are identified and engaged for use in a given setting is clearly an important one that is not addressed by the current work. We imagine that this is likely subserved by monitoring and selection mechanisms proposed to underlie the capacity for selective attention and cognitive control [5], though the specific computational mechanisms that underlie this function remain an important direction for future research.
We have added a discussion of these issues in Section 6 of the updated manuscript.
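To make the second direction concrete, here is a minimal sketch of the successor representation and its eigen-decomposition for one simple nonlinear structure, a ring. It is an illustration under stated assumptions, not the authors' implementation: the ring size, the discount factor `gamma`, the unbiased random-walk policy, and the function names are all hypothetical. It uses the standard closed form M = (I - γT)⁻¹ for a fixed transition matrix T.

```python
import numpy as np

def ring_transition_matrix(n_states):
    """Unbiased random walk on a ring: step left or right with probability 1/2."""
    T = np.zeros((n_states, n_states))
    for s in range(n_states):
        T[s, (s - 1) % n_states] = 0.5
        T[s, (s + 1) % n_states] = 0.5
    return T

def successor_representation(T, gamma):
    """Closed form M = sum_t gamma^t T^t = (I - gamma * T)^(-1)."""
    return np.linalg.inv(np.eye(T.shape[0]) - gamma * T)

n_states, gamma = 20, 0.9          # illustrative values only
T = ring_transition_matrix(n_states)
M = successor_representation(T, gamma)

# T is symmetric for this walk, so M is symmetric and eigh applies.
# The eigenvectors are periodic functions over the ring's states.
eigvals, eigvecs = np.linalg.eigh(M)
```

The columns of `eigvecs` form the candidate basis set; whether such a basis can be factored into frequencies and phases in the way DPP-A requires is exactly the open question raised above.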

      (1) Whittington, J.C., Muller, T.H., Mark, S., Chen, G., Barry, C., Burgess, N. and Behrens, T.E., 2020. The Tolman-Eichenbaum machine: unifying space and relational memory through generalization in the hippocampal formation. Cell, 183(5), pp.1249-1263.

      (2) Dayan, P., 1993. Improving generalization for temporal difference learning: The successor representation. Neural computation, 5(4), pp.613-624.

      (3) Stachenfeld, K.L., Botvinick, M.M. and Gershman, S.J., 2017. The hippocampus as a predictive map. Nature neuroscience, 20(11), pp.1643-1653.

      (4) Frankland, S., Webb, T.W., Petrov, A.A., O'Reilly, R.C. and Cohen, J., 2019. Extracting and Utilizing Abstract, Structured Representations for Analogy. In CogSci (pp. 1766-1772).

      (5) Shenhav, A., Botvinick, M.M. and Cohen, J.D., 2013. The expected value of control: an integrative theory of anterior cingulate cortex function. Neuron, 79(2), pp.217-240.

      Biological plausibility of DPP-A

      We appreciate the reviewers’ interest in the biological plausibility of our model, and in particular the question of whether and how DPP-A might be implemented in a neural network. In that regard, Bozkurt et al. [1] recently proposed a biologically plausible neural network algorithm using a weighted similarity matrix approach to implement a determinant maximization criterion, which is the core idea underlying the objective function we use for DPP-A, suggesting that the DPP-A mechanism we describe may also be biologically plausible. This could be tested experimentally by exposing individuals (e.g., rodents or humans) to a task that requires consistent exposure to a subregion, and evaluating the distribution of activity over the grid cells. Our model predicts that high frequency grid cells should increase their firing rate more than low frequency cells, since the high frequency grid cells maximize the determinant of the covariance matrix of the grid cell embeddings. It is also worth noting that Frankland et al. [2] have suggested that the use of DPPs may also help explain a mutual exclusivity bias observed in human word learning and reasoning. While this is not direct evidence of biological plausibility, it is consistent with the idea that the human brain selects representations for processing that maximize the volume of the representational space, which can be achieved by maximizing the DPP-A objective function defined in Equation 6. We have added a comment to this effect in Section 6 of the updated manuscript.

      (1) Bozkurt, B., Pehlevan, C. and Erdogan, A., 2022. Biologically-plausible determinant maximization neural networks for blind separation of correlated sources. Advances in Neural Information Processing Systems, 35, pp.13704-13717.

      (2) Frankland, S. and Cohen, J., 2020. Determinantal Point Processes for Memory and Structured Inference. In CogSci.

      Simplicity of analogical problem and comparison to other models using this task

      First, we would like to point out that analogical reasoning is a signature feature of human cognition, which supports flexible and efficient adaptation to novel inputs that remains a challenge for most current neural network architectures. While humans can exhibit complex and sophisticated forms of analogical reasoning [1, 2, 3], here we focused on a relatively simple form that was inspired by Rumelhart’s parallelogram model of analogy [4,5], which has been used to explain traditional human verbal analogies (e.g., “king is to what as man is to woman?”). Our model, like that one, seeks to explain analogical reasoning in terms of the computation of simple Euclidean distances (i.e., A - B = C - D, where A, B, C, D are vectors in 2D space). We have now noted this in Section 2.1.1 of the updated manuscript. It is worth noting that, despite the seeming simplicity of this construction, we show that standard neural network architectures (e.g., LSTMs and transformers) struggle to generalize on such tasks without the use of the DPP-A mechanism.
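The parallelogram relation A - B = C - D can be sketched as a nearest-candidate rule. This is a hand-coded illustration of the task definition only, not the trained inference module described in the response; the function name, the coordinates, and the candidate set are hypothetical.

```python
import numpy as np

def solve_analogy(a, b, c, candidates):
    """Return the index of the candidate D that best satisfies A - B = C - D."""
    target = c - (a - b)  # the D that closes the parallelogram exactly
    distances = np.linalg.norm(candidates - target, axis=1)
    return int(np.argmin(distances))

# Hypothetical problem: A - B = (2, 5), so the exact answer is C - (2, 5) = (8, 4).
a = np.array([3, 7])
b = np.array([1, 2])
c = np.array([10, 9])
candidates = np.array([[8, 4], [9, 5], [12, 14], [8, 5]])

best = solve_analogy(a, b, c, candidates)
```

A rule like this solves the task trivially when the relation is given; the point made above is that networks trained to learn the relation from examples struggle to extrapolate it to unseen regions of the space.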

      Second, we are not aware of any previous work other than Frankland et al. [6] cited in the first paragraph of Section 2.2.1, that has examined the capacity of neural network architectures to perform even this simple form of analogy. The models in that study were hardcoded to perform analogical reasoning, whereas we trained models to learn to perform analogies. That said, clearly a useful line of future work would be to scale our model further to deal with more complex forms of representation and analogical reasoning tasks [1,2,3]. We have noted this in Section 6 of the updated manuscript.

      (1) Holyoak, K.J., 2012. Analogy and relational reasoning. The Oxford handbook of thinking and reasoning, pp.234-259.

      (2) Webb, T., Fu, S., Bihl, T., Holyoak, K.J. and Lu, H., 2023. Zero-shot visual reasoning through probabilistic analogical mapping. Nature Communications, 14(1), p.5144.

      (3) Lu, H., Ichien, N. and Holyoak, K.J., 2022. Probabilistic analogical mapping with semantic relation networks. Psychological review.

      (4) Rumelhart, D.E. and Abrahamson, A.A., 1973. A model for analogical reasoning. Cognitive Psychology, 5(1), pp.1-28.

      (5) Mikolov, T., Chen, K., Corrado, G. and Dean, J., 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.

      (6) Frankland, S., Webb, T.W., Petrov, A.A., O'Reilly, R.C. and Cohen, J., 2019. Extracting and Utilizing Abstract, Structured Representations for Analogy. In CogSci (pp. 1766-1772).

      Clarification of DPP-A attentional modulation

      We would like to clarify several concerns regarding the DPP-A attentional modulation. First, we would like to make it clear that ω is not meant to correspond to synaptic weights, and thank the reviewer for noting the possibility for confusion on this point. It is also distinct from a biasing input, which is often added to the product of the input features and weights. Rather, in our model ω is a vector, and diag(ω) converts it into a matrix with ω on the diagonal and zeros elsewhere. In Equation 6, diag(ω) is matrix multiplied with the covariance matrix V, which results in elementwise multiplication of ω with the column vectors of V, and hence ω acts more like a set of gates. We have noted this in Section 2.2.2 and have changed all instances of “weights (ω)” to “gates (ɡ)” in the updated manuscript. We have also rewritten the definition of Equation 6 and uses of it (as in Algorithm 1) to depict the application of a sigmoid nonlinearity (σ) to the gates, so that the resulting values are always between 0 and 1.
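The gating interpretation can be checked numerically. The sketch below assumes random stand-in embeddings and a hypothetical unconstrained parameter vector `raw`; it shows only that multiplying by the diagonal gate matrix is the same as multiplying each column of V elementwise by the gate vector, and that a sigmoid keeps the gates between 0 and 1. It does not reproduce Equation 6.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_cells = 4
X = rng.normal(size=(50, n_cells))   # stand-in embeddings: 50 locations x 4 cells
V = np.cov(X, rowvar=False)          # covariance matrix over cells, shape (4, 4)

raw = rng.normal(size=n_cells)       # unconstrained parameters (hypothetical name)
g = sigmoid(raw)                     # gates, each strictly between 0 and 1

gated = np.diag(g) @ V               # matrix product with the diagonal gate matrix

# Column j of the result is the gate vector times column j of V, elementwise.
first_column = g * V[:, 0]
```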

      Second, we would like to clarify that we don’t compute the inner product between the gates ɡ and the grid cell embeddings x anywhere in our model. The gates within each frequency were optimized (independent of the task inputs), according to Equation 6, to compute the approximate maximum log determinant of the covariance matrix over the grid cell embeddings individually for each frequency. We then used the grid cell embeddings belonging to the frequency that had the maximum within-frequency log determinant for training the inference module, which always happened to be grid cells within the top three frequencies. Author response image 1 (also added to the Appendix, Section 7.10 of the updated manuscript) shows the approximate maximum log determinant (on the y-axis) for the different frequencies (on the x-axis).
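The selection step described here, choosing the frequency with the largest within-frequency log determinant, can be sketched in a simplified form. This is an assumption-laden illustration: it scores each frequency by the log determinant of the raw covariance of synthetic embeddings and skips the gate optimization of Equation 6, which is not reproduced in this response. The function names, the `eps` regularizer, and the two synthetic "frequencies" are hypothetical.

```python
import numpy as np

def log_det_covariance(embeddings, eps=1e-6):
    """Log determinant of the covariance across cells (columns); eps keeps it finite."""
    cov = np.cov(embeddings, rowvar=False)
    cov = cov + eps * np.eye(cov.shape[0])
    _sign, logdet = np.linalg.slogdet(cov)
    return logdet

def select_frequency(embeddings_by_freq):
    """Return the index of the frequency with the largest log determinant, plus all scores."""
    scores = [log_det_covariance(e) for e in embeddings_by_freq]
    return int(np.argmax(scores)), scores

rng = np.random.default_rng(0)
n_locations, n_phases = 500, 5
shared = rng.normal(size=(n_locations, 1))
embeddings_by_freq = [
    # Strongly correlated cells: every column is mostly the same shared signal.
    shared + 0.1 * rng.normal(size=(n_locations, n_phases)),
    # Weakly correlated cells with roughly unit variance.
    rng.normal(size=(n_locations, n_phases)),
]

best, scores = select_frequency(embeddings_by_freq)
```

In this toy setup the weakly correlated set wins, which mirrors the qualitative pattern reported for the higher grid cell frequencies.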

      Author response image 1.

      Approximate maximum log determinant of the covariance matrix over the grid cell embeddings (y-axis) for each frequency (x-axis), obtained after maximizing Equation 6.

      Third, we would like to clarify our interpretation of why DPP-A identified grid cell embeddings corresponding to the highest spatial frequencies, and why this produced the best OOD generalization (i.e., extrapolation on our analogy tasks). It is because those grid cell embeddings exhibited greater variance over the training data than the lower frequency embeddings, while at the same time the correlations among those grid cell embeddings were lower than the correlations among the lower frequency grid cell embeddings. The determinant of the covariance matrix of the grid cell embeddings is maximized when the variances of the grid cell embeddings are high (they are “expressive”) and the correlation among the grid cell embeddings is low (they “cover the representational space”). As a result, the higher frequency grid cell embeddings more efficiently covered the representational space of the training data, allowing them to efficiently capture the same relational structure across training and test distributions which is required for OOD generalization. We have added some clarification to the second paragraph of Section 2.2.2 in the updated manuscript. Furthermore, to illustrate this graphically, Author response image 2 (added to the Appendix, Section 7.10 of the updated manuscript) shows the results after the summation of the multiplication of the grid cell embeddings over the 2d space of 1000x1000 locations, with their corresponding gates for 3 representative frequencies (left, middle and right panels showing results for the lowest, middle and highest grid cell frequencies, respectively, of the 9 used in the model), obtained after maximizing Equation 6 for each grid cell frequency. The color code indicates the responsiveness of the grid cells to different X and Y locations in the input space (lighter color corresponding to greater responsiveness). 
Note that the dark blue area (denoting regions of least responsiveness to any grid cell) is greatest for the lowest frequency and nearly zero for the highest frequency, illustrating that grid cell embeddings belonging to the highest frequency more efficiently cover the representational space which allows them to capture the same relational structure across training and test distributions as required for OOD generalization.
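For intuition, the two-cell case can be worked by hand. This is a standard identity rather than anything taken from the manuscript; σ₁ and σ₂ denote the standard deviations of two grid cell embeddings and ρ their correlation, all symbols introduced here for illustration.

```latex
\Sigma =
\begin{pmatrix}
\sigma_1^2 & \rho\,\sigma_1\sigma_2 \\
\rho\,\sigma_1\sigma_2 & \sigma_2^2
\end{pmatrix},
\qquad
\det\Sigma = \sigma_1^2\,\sigma_2^2\,(1-\rho^2),
\qquad
\log\det\Sigma = \log\sigma_1^2 + \log\sigma_2^2 + \log(1-\rho^2).
```

The log determinant therefore grows with each variance and shrinks as |ρ| approaches 1, which is the "high variance, low correlation" behaviour described above.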

      Author response image 2.

      Each panel shows the results after summation of the multiplication of the grid cell embeddings over the 2d space of 1000x1000 locations, with their corresponding gates for a particular frequency, obtained after maximizing Equation 6 for each grid cell frequency. The left, middle, and right panels show results for the lowest, middle, and highest grid cell frequencies, respectively, of the 9 used in the model. Lighter color in each panel corresponds to greater responsiveness of grid cells at that particular location in the 2d space.

      Finally, we would like to clarify how the DPP-A attentional mechanism is different from the attentional mechanism in the transformer module, and why both are needed for strong OOD generalization. Use of the standard self-attention mechanism in transformers over the inputs (i.e., A, B, C, and D for the analogy task) in place of DPP-A would lead to weightings of grid cell embeddings over all frequencies and phases. The objective function for the DPP-A represents an inductive bias, that selectively assigns the greatest weight to all grid cell embeddings (i.e., for all phases) of the frequency for which the determinant of the covariance matrix is greatest computed over the training space. The transformer inference module then attends over the inputs with the selected grid cell embeddings based on the DPP-A objective. We have added a discussion of this point in Section 6 of the updated manuscript.

      We would like to thank the reviewers for their recommendations. We have tried our best to incorporate them into our updated manuscript. Below we provide a detailed response to each of the recommendations grouped for each reviewer.

      Reviewer #1 (Recommendations for the authors)

      (1) It would be helpful to see some equations for R in the main text.

      We thank the reviewer for this suggestion. We have now added some equations explaining the working of R in Section 2.2.3 of the updated manuscript.

      (2) Typo: p 11 'alongwith' -> 'along with'

      We have changed all instances of ‘alongwith’ to ‘along with’ in the updated manuscript.

      (3) Presumably, this is related to equivariant ML - it would be helpful to comment on this.

      Yes, this is related to equivariant ML, since the properties of equivariance hold for our model. Specifically, the probability distribution after applying softmax remains the same when the transformation (translation or scaling) is applied to the scores for each of the answer choices obtained from the output of the inference module, and when the same transformation is applied to the stimuli for the task and all the answer choices before presenting as input to the inference module to obtain the scores. We have commented on this in Section 2.2.3 of the updated manuscript.

      Reviewer #2 (Recommendations for the authors)

      (1) Page 2 - "Webb et al." temporal context - they should also cite and compare this to work by Marc Howard on generalization based on multi-scale temporal context.

      While we appreciate the important contributions that have been made by Marc Howard and his colleagues to temporal coding and its role in episodic memory and hippocampal function, we would like to clarify that his temporal context model is unrelated to the temporal context normalization developed by Webb et al. (2020) and mentioned on Page 2. The former (Temporal Context Model) is a computational model that proposes a role for temporal coding in the functions of the medial temporal lobe in support of episodic recall and spatial navigation. The latter (temporal context normalization) is a normalization procedure proposed for use in training a neural network, similar to batch normalization [1], in which tensor normalization is applied over the temporal instead of the batch dimension, which is shown to help with OOD generalization. We apologize for any confusion engendered by the similarity of these terms and for our failure to clarify the difference between them, which we have now attempted to do in a footnote on Page 2.

      (1) Ioffe, S. and Szegedy, C., 2015, June. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International conference on machine learning (pp. 448-456). PMLR.
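The difference between the two normalization schemes comes down to the axis over which the statistics are computed. The sketch below is a simplified illustration that assumes inputs of shape (batch, time, features) and omits any learnable scale and shift parameters; the function names are hypothetical and this is not the code of Webb et al. (2020).

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each feature with statistics computed over the batch axis (axis 0)."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def temporal_context_norm(x, eps=1e-5):
    """Normalize each feature with statistics computed over the time axis (axis 1)."""
    mean = x.mean(axis=1, keepdims=True)
    var = x.var(axis=1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=(8, 4, 3))  # (batch, time, features)

z = temporal_context_norm(x)
z_shifted = temporal_context_norm(x + 100.0)        # same sequences, translated
```

Because each sequence is normalized against its own temporal context, adding a constant offset to the inputs leaves the output unchanged, which gives some intuition for why this kind of normalization can help with translated test distributions.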

      (2) page 3 - "known to be implemented in entorhinal" - It's odd that they seem to avoid citing the actual biology papers on grid cells. They should cite more of the grid cell recording papers when they mention the entorhinal cortex (i.e. Hafting et al., 2005; Barry et al., 2007; Stensola et al., 2012; Giocomo et al., 2011; Brandon et al., 2011).

      We have now cited the references mentioned below, on page 3 after the phrase “known to be implemented in entorhinal cortex”.

      (1) Barry, C., Hayman, R., Burgess, N. and Jeffery, K.J., 2007. Experience-dependent rescaling of entorhinal grids. Nature neuroscience, 10(6), pp.682-684.

      (2) Stensola, H., Stensola, T., Solstad, T., Frøland, K., Moser, M.B. and Moser, E.I., 2012. The entorhinal grid map is discretized. Nature, 492(7427), pp.72-78.

      (3) Giocomo, L.M., Hussaini, S.A., Zheng, F., Kandel, E.R., Moser, M.B. and Moser, E.I., 2011. Grid cells use HCN1 channels for spatial scaling. Cell, 147(5), pp.1159-1170.

      (4) Brandon, M.P., Bogaard, A.R., Libby, C.P., Connerney, M.A., Gupta, K. and Hasselmo, M.E., 2011. Reduction of theta rhythm dissociates grid cell spatial periodicity from directional tuning. Science, 332(6029), pp.595-599.

      (3) To enhance the connection to biological systems, they should cite more of the experimental and modeling work on grid cell coding (for example on page 2 where they mention relational coding by grid cells). Currently, they tend to cite studies of grid cell relational representations that are very indirect in their relationship to grid cell recordings (i.e. indirect fMRI measures by Constantinescu et al., 2016 or the very abstract models by Whittington et al., 2020). They should cite more papers on actual neurophysiological recordings of grid cells that suggest relational/metric representations, and they should cite more of the previous modeling papers that have addressed relational representations. This could include work on using grid cell relational coding to guide spatial behavior (e.g. Erdem and Hasselmo, 2014; Bush, Barry, Manson, Burgess, 2015). This could also include other papers on the grid cell code beyond the paper by Wei et al., 2015 - they could also cite work on the efficiency of coding by Sreenivasan and Fiete and by Mathis, Herz, and Stemmler.

      We thank the reviewer for bringing the additional references to our attention. We have cited the references mentioned below on page 2 of the updated manuscript.

      (1) Erdem, U.M. and Hasselmo, M.E., 2014. A biologically inspired hierarchical goal directed navigation model. Journal of Physiology-Paris, 108(1), pp.28-37.

      (2) Sreenivasan, S. and Fiete, I., 2011. Grid cells generate an analog error-correcting code for singularly precise neural computation. Nature neuroscience, 14(10), pp.1330-1337.

      (3) Mathis, A., Herz, A.V. and Stemmler, M., 2012. Optimal population codes for space: grid cells outperform place cells. Neural computation, 24(9), pp.2280-2317.

      (4) Bush, D., Barry, C., Manson, D. and Burgess, N., 2015. Using grid cells for navigation. Neuron, 87(3), pp.507-520.

      (4) Page 3 - "Determinantal Point Processes (DPPs)" - it is rather annoying that DPP is defined after DPP-A is defined. There ought to be a spot where the definition of DPP-A is clearly stated in a single location.

      We agree it makes more sense to define Determinantal Point Process (DPP) before DPP-A. We have now rephrased the sentences accordingly. In the “Abstract”, the sentence now reads “Second, we propose an attentional mechanism that operates over the grid cell code using Determinantal Point Process (DPP), which we call DPP attention (DPP-A) - a transformation that ensures maximum sparseness in the coverage of that space.” We have also modified the second paragraph of the “Introduction”. The modified portion now reads “b) an attentional objective inspired from Determinantal Point Processes (DPPs), which are probabilistic models of repulsion arising in quantum physics [1], to attend to abstract representations that have maximum variance and minimum correlation among them, over the training data. We refer to this as DPP attention or DPP-A.” Due to this change, we removed the last sentence of the fifth paragraph of the “Introduction”.

      (1) Macchi, O., 1975. The coincidence approach to stochastic point processes. Advances in Applied Probability, 7(1), pp.83-122.

      (5) Page 3 - "the inference module R" - there should be some discussion about how this component using LSTM or transformers could relate to the function of actual brain regions interacting with entorhinal cortex. Or if there is no biological connection, they should state that this is not seen as a biological model and that only the grid cell code is considered biological.

      While we agree that the model is not intended to be as specific about the implementation of the R module, we assume that, as a standard deep learning component, it is likely to map onto neocortical structures that interact with the entorhinal cortex and, in particular, regions of the prefrontal-posterior parietal network widely believed to be involved in abstract relational processes [1,2,3,4]. In particular, the role of the prefrontal cortex in the encoding and active maintenance of abstract information needed for task performance (such as rules and relations) has often been modeled using gated recurrent networks, such as LSTMs [5,6], and the posterior parietal cortex has long been known to support “maps” that may provide an important substrate for computing complex relations [4]. We have added some discussion about this in Section 2.2.3 of the updated manuscript.

      (1) Waltz, J.A., Knowlton, B.J., Holyoak, K.J., Boone, K.B., Mishkin, F.S., de Menezes Santos, M., Thomas, C.R. and Miller, B.L., 1999. A system for relational reasoning in human prefrontal cortex. Psychological science, 10(2), pp.119-125.

      (2) Christoff, K., Prabhakaran, V., Dorfman, J., Zhao, Z., Kroger, J.K., Holyoak, K.J. and Gabrieli, J.D., 2001. Rostrolateral prefrontal cortex involvement in relational integration during reasoning. Neuroimage, 14(5), pp.1136-1149.

      (3) Knowlton, B.J., Morrison, R.G., Hummel, J.E. and Holyoak, K.J., 2012. A neurocomputational system for relational reasoning. Trends in cognitive sciences, 16(7), pp.373-381.

      (4) Summerfield, C., Luyckx, F. and Sheahan, H., 2020. Structure learning and the posterior parietal cortex. Progress in neurobiology, 184, p.101717.

      (5) Frank, M.J., Loughry, B. and O’Reilly, R.C., 2001. Interactions between frontal cortex and basal ganglia in working memory: a computational model. Cognitive, Affective, & Behavioral Neuroscience, 1, pp.137-160.

      (6) Braver, T.S. and Cohen, J.D., 2000. On the control of control: The role of dopamine in regulating prefrontal function and working memory. Control of cognitive processes: Attention and performance XVIII.

      (6) Page 4 - "Learned weighting w" - it is somewhat confusing to use "w" as that is commonly used for synaptic weights, whereas I understand this to be an attentional modulation vector with the same dimensionality as the grid cell code. It seems more similar to a neural network bias input than a weight matrix.

      We refer to the first paragraph of our response above to the topic “Clarification of DPP-A attentional modulation” under “Major comments (Public Reviews)”, which contains our response to this issue.

      (7) Page 4 - "parameterization of w... by two loss functions over the training set." - I realize that this has been stated here, but to emphasize the significance to a naïve reader, I think they should emphasize that the learning is entirely focused on the initial training space, and there is NO training done in the test spaces. It's very impressive that the parameterization is allowing generalization to translated or scaled spaces without requiring ANY training on the translated or scaled spaces.

      We have added the sentence “Note that learning of parameter occurs only over the training space and is not further modified during testing (i.e. over the test spaces)” to the updated manuscript.

      (8) Page 4 - "The first," - This should be specific - "The first loss function"

      We have changed it to “The first loss function” in the updated manuscript.

      (9) Page 4 - The analogy task seems rather simplistic when first presented (i.e. just a spatial translation to different parts of a space, which has already been shown to work in simulations of spatial behavior such as Erdem and Hasselmo, 2014 or Bush, Barry, Manson, Burgess, 2015). To make the connection to analogy, they might provide a brief mention of how this relates to the analogy space created by word2vec applied to traditional human verbal analogies (i.e. king-man+woman=queen).

      We agree that the analogy task is simple, and recognize that grid cells can be used to navigate to different parts of space over which the test analogies are defined when those are explicitly specified, as shown by Erdem and Hasselmo (2014) and Bush, Barry, Manson, and Burgess (2015). However, for the analogy task, the appropriate set of grid cell embeddings must be identified that capture the same relational structure between training and test analogies to demonstrate strong OOD generalization, and that is achieved by the attentional mechanism DPP-A. As suggested by the reviewer’s comment, our analogy task is inspired by Rumelhart’s parallelogram model of analogy [1,2] (and therefore similar to traditional human verbal analogies) inasmuch as it involves differences (i.e., A - B = C - D, where A, B, C, D are vectors in 2D space). We have now noted this in Section 2.1.1 of the updated manuscript.

      (1) Rumelhart, D.E. and Abrahamson, A.A., 1973. A model for analogical reasoning. Cognitive Psychology, 5(1), pp.1-28.

      (2) Mikolov, T., Chen, K., Corrado, G. and Dean, J., 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.

      (10) Page 5 - The variable "KM" is a bit confusing when it first appears. It would be good to re-iterate that K and M are separate points and KM is the vector between these points.

      We apologize for the confusion on this point. KM is meant to refer to an integer value, obtained by multiplying K and M, which is added to both dimensions of A, B, C and D, which are points in ℤ², to translate them to a different region of the space. K is an integer value ranging from 1 to 9 and M is also an integer value denoting the size of the training region, which in our implementation is 100. We have clarified this in Section 2.1.1 of the updated manuscript.
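The translation by KM can be written out directly. The values of M and the range of K below come from the response; the example coordinates and the function name are hypothetical, chosen only so that A - B = C - D holds inside the training region.

```python
import numpy as np

M = 100                    # size of the training region (from the response)
K_VALUES = range(1, 10)    # K ranges over the integers 1 to 9

def translate_problem(points, k, m=M):
    """Add the integer offset k * m to both coordinates of every point."""
    return points + k * m

# Hypothetical training analogy; rows are A, B, C, D, all inside [0, M).
problem = np.array([[30, 70], [10, 20], [60, 90], [40, 40]])

translated = {k: translate_problem(problem, k) for k in K_VALUES}
```

Since the same offset is added to every point, the difference vectors are unchanged: A - B still equals C - D in each translated region, so only the location of the problem in the space differs between training and test.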

      (11) Page 5 - "two continuous dimensions (Constantinescu et al.)" - this ought to give credit to the original study showing the abstract six-fold rotational symmetry for spatial coding (Doeller, Barry and Burgess).

      We have now cited the original work by Doeller et al. [1] along with Constantinescu et al. (2016) in the updated manuscript after the phrase “two continuous dimensions” on page 5.

      (1) Doeller, C.F., Barry, C. and Burgess, N., 2010. Evidence for grid cells in a human memory network. Nature, 463(7281), pp.657-661.

      (12) Page 6 - Np=100. This is done later, but it would be clearer if they right away stated that Np*Nf=900 in this first presentation.

      We have now added this sentence after Np=100. “Hence Np*Nf=900, which denotes the number of grid cells.”

      (13) Page 6 - They provide theorem 2.1 on the determinant of the covariance matrix of the grid code, but they ought to cite this the first time this is mentioned.

      We have cited Gillenwater et al. (2012) before mentioning theorem 2.1. The sentence just before that reads “We use the following theorem from Gillenwater et al. (2012) to construct :”

      (14) Page 6 - It would greatly enhance the impact of the paper if they could give neuroscientists some sense of how the maximization of the determinant of the covariance matrix of the grid cell code could be implemented by a biological circuit. OR at least to show an example of the output of this algorithm when it is used as an inner product with the grid cell code. This would require plotting the grid cell code in the spatial domain rather than the 900 element vector.

      We refer to our response above to the topic “Biological plausibility of DPP-A” and second, third, and fourth paragraphs of our response above to the topic “Clarification of DPP-A attentional modulation” under “Major comments (Public Reviews)”, which contain our responses to this issue.

      (15) Page 6 - "That encode higher spatial frequencies..." This seems intuitive, but it would be nice to give a more intuitive description of how this is related to the determinant of the covariance matrix.

      We refer to the third paragraph of our response above to the topic “Clarification of DPP-A attentional modulation” under “Major comments (Public Reviews)”, which contains our response to this issue.

      (16) Page 7 - log of both sides... Nf is number of frequencies... Would be good to mention here that they are referring to equation 6 which is only mentioned later in the paragraph.

      As suggested, we now refer to Equation 6 in the updated manuscript. The sentence now reads “This is achieved by maximizing the determinant of the covariance matrix over the within frequency grid cell embeddings of the training data, and Equation 6 is obtained by applying the log on both sides of Theorem 2.1, and in our case where refers to grid cells of a particular frequency.”

      (17) Page 7 - Equation 6 - They should discuss how this is proposed to be implemented in brain circuits.

      We refer to our response above to the topic “Biological plausibility of DPP-A” under “Major comments (Public Reviews)”, which contains our response to this issue.

      (18) Page 9 - "egeneralize" - presumably this is a typo?

      Yes. We have corrected it to “generalize” in the updated manuscript.

      (19) Page 9 - "biologically plausible encoding scheme" - This is valid for the grid cell code, but they should be clear that this is not valid for other parts of the model, or specify how other parts of the model such as DPP-A could be biologically plausible.

      We refer to our response above to the topic “Biological plausibility of DPP-A” under “Major comments (Public Reviews)”, which contains our response to this issue.

      (20) Page 12 - Figure 7 - comparison to one-hots or smoothed one-hots. The text should indicate whether the smoothed one-hots are similar to place cell coding. This is the most relevant comparison of coding for those knowledgeable about biological coding schemes.

      Yes, smoothed one-hots are similar to place cell coding. We now mention this in Section 5.3 of the updated manuscript.

      (21) Page 12 - They could compare to a broader range of potential biological coding schemes for the overall space. This could include using coding based on the boundary vector cell coding of the space, band cell coding (one dimensional input to grid cells), or egocentric boundary cell coding.

      We appreciate these useful suggestions, which we now mention as potentially valuable directions for future work in the second paragraph of Section 6 of the updated manuscript.

      (22) Page 13 - "transformers are particularly instructive" - They mention this as a useful comparison, but they might discuss further why a much better function is obtained when attention is applied to the system twice (once by DPP-A and then by a transformer in the inference module).

      We refer to the last paragraph of our response above to the topic “Clarification of DPP-A attentional modulation” under “Major comments (Public Reviews)”, which contains our response to this issue.

      (23) Page 13 - "Section 5.1 for analogy and Section 5.2 for arithmetic" - it would be clearer if they perhaps also mentioned the specific figures (Figure 4 and Figure 6) presenting the results for the transformer rather than the LSTM.

      We have now rephrased to also refer to the figures in the updated manuscript. The phrase now reads “a transformer (Figure 4 in Section 5.1 for analogy and Figure 6 in Section 5.2 for arithmetic tasks) failed to achieve the same level of OOD generalization as the network that used DPP-A.”

      (24) Page 14 - "statistics of the training data" - The most exciting feature of this paper is that learning during the training space analogies can so effectively generalize to other spaces based on the right attention DPP-A, but this is not really made intuitive. Again, they should illustrate the result of the xT w inner product to demonstrate why this works so effectively!

      We refer to the second, third, and fourth paragraphs of our response above to the topic “Clarification of DPP-A attentional modulation” under “Major comments (Public Reviews)”, which contains our response to this issue.

      (25) Bibliography - Silver et al., go paper - journal name "nature" should be capitalized. There are other journal titles that should be capitalized. Also, I believe eLife lists family names first.

      We have made the changes to the bibliography of the updated manuscript suggested by the reviewer.

    1. eLife assessment

      This important modeling work demonstrates out-of-distribution generalization using a grid cell coding scheme combined with Determinantal Point Process Attention. The simulations provide convincing evidence that the model improves generalization performance across several tasks. The generality of the approach is unclear, however, and there is limited comparison to relevant prior work.

    2. Reviewer #1 (Public Review):

      Summary:<br /> This paper presents a cognitive model of out-of-distribution generalisation, where the representational basis is grid-cell codes. In particular, the authors consider the tasks of analogies, addition, and multiplication, and the out-of-distribution tests are shifting or scaling the input domain. The authors utilise grid cell codes, which are multi-scale as well as translationally invariant due to their periodicity. To allow for domain adaptation, the authors use DPP-A which is, in this context, a mechanism of adapting to input scale changes. The authors present simulation results demonstrating that this model can perform out-of-distribution generalisation to input translations and re-scaling, whereas other models fail.
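      The periodicity and multi-scale structure mentioned in this summary can be sketched with a toy one-dimensional code. This is only an illustration under stated assumptions: the function `grid_code` and the periods (3, 4, 5) are hypothetical and are not the grid code used in the manuscript.

```python
import math


def grid_code(x, periods=(3, 4, 5)):
    """Encode position x as (cos, sin) of its phase at each spatial period."""
    code = []
    for p in periods:
        angle = 2.0 * math.pi * x / p
        code.extend((math.cos(angle), math.sin(angle)))
    return code


def codes_match(a, b, tol=1e-9):
    """True if two codes agree element-wise up to floating-point error."""
    return all(math.isclose(u, v, abs_tol=tol) for u, v in zip(a, b))


# 60 is the least common multiple of the periods, so a shift by 60
# returns every module to the same phase.
same_after_shift = codes_match(grid_code(7), grid_code(7 + 60))
# A shift by 1 changes the phase in every module.
same_after_unit_step = codes_match(grid_code(7), grid_code(8))
```

      Here a shift by the common period leaves the code unchanged while a unit step does not, which is the property the reviewer refers to when describing the code as periodic across scales.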

      Strengths:<br /> This paper makes the point it sets out to - that there are some underlying representational bases, like grid cells, that when combined with a domain adaptation mechanism, like DPP-A, can facilitate out-of-distribution generalisation. I don't have any issues with the technical details.

      Weaknesses:<br /> The paper does leave open the bigger questions of 1) how one learns a suitable representation basis in the first place, 2) how to have a domain adaptation mechanism that works in more general settings other than adapting to scale. Overall, I'm left wondering whether this model is really quite bespoke or whether there is something really general here. My comments below are trying to understand how general this approach is.

      COMMENTS<br /> This work relies on being able to map inputs into an appropriate representational space. The inputs were integers so it's easy enough to map them to grid locations. But how does this transfer to making analogies in other spaces? Do the inputs need to be mapped (potentially non-linearly) into a space where everything is linear? In general, what are the properties of the embedding space that allows the grid code to be suitable? It would be helpful to know just how much leg work an embedding model would have to do.

      It's natural that grid cells are great for domain shifts of translation, rescaling, and rotation, because they themselves are multi-scaled and are invariant to translations and rotations. But grid codes aren't going to be great for other types of domain shifts. Are the authors saying that to make analogies grid cells are all you need? If not then what else? And how does this representation get learned? Are there lots of these invariant codes hanging around? And if so how does the appropriate one get chosen for each situation? Some discussion of the points is necessary as otherwise, the model seems somewhat narrow in scope.

      For effective adaptation of scale, the authors needed to use DPP-A. Being that they are relating to brains using grid codes, what processes are implementing DPP-A? Presumably, a computational module that serves the role of DPP-A could be meta-learned? I.e. if they change their task set-up so it gets to see domain shifts in its training data an LSTM or transformer could learn to do this. The presented model comparisons feel a bit of a straw man.

      I couldn't see it explained exactly how R works.

    3. Reviewer #2 (Public Review):

      Summary:<br /> This paper presents a model of out-of-distribution (OOD) generalization that focuses on modeling an analogy task, in which translation or scaling is tested with training in one part of the space and testing in other areas of the space progressively more distant from the training location. Similar tests were performed on arithmetic including addition and multiplication, and similarly impressive results appear for addition but not multiplication. The authors show that a grid cell coding scheme helps performance on these analogy and arithmetic tasks, but the most dramatic increase in performance is provided by a complex algorithm for determinantal point process attention (DPP-A) based on maximizing the determinant of the covariance matrix of the grid embeddings.

      Strengths:<br /> The results appear quite impressive. The results for generalization appear quite dramatic when compared to other coding schemes (i.e. one-hot) or when compared to the performance when ablating the DPP-A component but retaining the same inference modules using LSTM or transformers. This appears to be an important result in terms of generalization of results in an analogy space.

      Weaknesses:<br /> There are a number of ways that its impact and connection to grid cells could be enhanced. From the neuroscience perspective, the major comments concern making a clearer and stronger connection to the actual literature on grid cells and grid cell modeling, and discussing the relationship of the complex DPP-A algorithm to biological circuits.

      Major comments:<br /> 1. They should provide more citations to other groups that have explored analogy using this type of task. Currently, they only cite one paper (Webb et al., 2020) by their own group in their footnote 1 which used the same representation of behavioral tasks for generalization of analogy. It would be useful if they could cite other papers using this simplified representation of analogy and also show the best performance of other algorithms from other groups in their figures, so that there is a sense of how their results compare to the best previous algorithm by other groups in the field (or they can identify which of their comparison algorithms corresponds to the best of previously published work).

      2. While the grid code they use is very standard and based on grid cell researchers (Bicanski and Burgess, 2019), the rest of the algorithm doesn't have a clear claim on biological plausibility. It has become somewhat standard in the field to ignore the problem of how the brain could biologically implement the latest complex algorithm, but it would be useful if they at least mention the problem (or difficulty) of implementing DPP-A in a biological network. In particular, does maximizing the determinant of the covariance matrix of the grid code correspond to something that could be tested experimentally?

      3. Related to major comment 2., it would be very exciting if they could show what the grid code looks like after the attentional modulation inner product xT w has been implemented. This could be highly useful for experimental researchers trying to connect these theoretical simulation results to data. This would be most intuitive to grid cell researchers if it is plotted in the same format as actual biological experimental data - specifically which grid cell codes get strengthened the most (beyond just the highest frequencies).

      4. To enhance the connection to biological systems, they should cite more of the experimental and modeling work on grid cell coding (for example on page 2 where they mention relational coding by grid cells). Currently, they tend to cite studies of grid cell relational representations that are very indirect in their relationship to grid cell recordings (i.e. indirect fMRI measures by Constantinescu et al., 2016 or the very abstract models by Whittington et al., 2020). They should cite more papers on actual neurophysiological recordings of grid cells that suggest relational/metric representations, and they should cite more of the previous modeling papers that have addressed relational representations. This could include work on using grid cell relational coding to guide spatial behavior (e.g. Erdem and Hasselmo, 2014; Bush, Barry, Manson, Burgess, 2015). This could also include other papers on the grid cell code beyond the paper by Wei et al., 2015 - they could also cite work on the efficiency of coding by Sreenivasan and Fiete and by Mathis, Herz, and Stemmler.

    1. Reviewer #4 (Public Review):

      In this manuscript, Sha et al. test the role of TNFα in modulating tumor regression/recurrence under therapeutic pressure from castration (or enzalutamide) in both in vitro and in vivo models of prostate cancer. Using the PTEN-null genetic mouse model, they compare the effect of a TNFα ligand trap, etanercept, at various points pre- and post-castration. Their most interesting findings from this experiment were that etanercept given 3 days prior to castration prevented tumor regression, which is a common phenotype seen in these models after castration, but etanercept given 1 day prior to castration prevented prostate cancer recurrence after castration. They go on to perform RNA sequencing on tumors isolated from either sham or castrate mice from two time points post-castration to study acute and delayed transcriptional responses to androgen deprivation. They found enrichment of gene sets containing TNF-targets which initially decrease post-castration but are elevated by 35 days, the time at which tumors recur. The authors conduct a similar set of experiments using human prostate cancer cell lines treated with the androgen receptor inhibitor enzalutamide and observe that drug treatment leads to cells with basal stem-like features that express high levels of TNF. They noticed that CCL2 levels correlate with changes in TNF levels, raising the possibility that CCL2 might be a critical downstream effector for disease recurrence. To this end, they treated PTEN-null and hi-MYC castrated mice with a CCR2-antagonist (CCR2a), because CCR2 is one receptor of CCL2, and monitored tumor growth dynamics. Interestingly, upon treatment with CCR2a, tumors did not recur according to their measurements. They go on to demonstrate that the tumors pre-treated with CCR2a had reduced levels of putative TAMs and increased CTLs in the context of TNF or CCR2 inhibition, providing a cellular context associated with disease regression. Lastly, they perform single-cell RNA sequencing to further characterize the tumor microenvironment post-castration and report that the ratio of CTLs to TAMs is lower in a recurrent tumor.

      While the concepts behind the study have merit, the data are incomplete and do not fully support the authors' conclusions. The authors' definition of recurrence is subjective given that the amount of disease regression after castration is both variable (Figure 8) and relatively limited, particularly in the PTEN loss model. Critical controls are missing. For example, both drug experiments were completed without treating non-castrate plus drug controls, which raises the question of how specific these findings are to castration resistance. No validation was performed to ensure that either the TNF ligand trap or the CCR2 antagonist was acting on target. The single-cell sequencing experiments were done without replicates, which raises concern about their interpretation. At a conceptual level, the authors say that a major cause of disease recurrence is the immunosuppressive TME, but provide little functional data that macrophages and T cells are directly responsible for this phenotype. Statistical analyses were performed on only select experiments. In summary, further work is recommended to support the conclusions of this story.

    2. eLife assessment

      This study presents a potentially valuable finding regarding the role of cytokine signaling in the mechanism of response and resistance to castration therapy in prostate cancer. The evidence, although solid for some aspects of the work, is incomplete and only partially supports the main claims.

    3. Reviewer #1 (Public Review):

      Summary:<br /> Sha K et al aimed to identify the mechanism of response and resistance to castration in the Pten knockout GEM model. They found elevated TNF levels in castrated tumors, associated with an expansion of basal-like stem cells during recurrence, which they also show occurs in prostate cancer cells in culture upon enzalutamide treatment. Further, the authors carry out a time-dependent analysis of the role of TNF in regression and recurrence to show that TNF regulates both processes. Similarly, CCL2, which the authors had proposed as a chemokine secreted upon TNF induction following enzalutamide treatment, is also shown to be elevated during recurrence and associated with the remodeling of an immunosuppressive microenvironment through depletion of T cells and recruitment of TAMs.

      Strengths:

      The paper exploits a well-established GEM model to interrogate mechanisms of response to standard-of-care treatment. This is of utmost importance since prostate cancer recurrence after ADT or ARSi marks the onset of an incurable disease stage for which limited treatments exist. The work is relevant in the confirmation that recurrent prostate cancer is mostly an immunologically "cold" tumor with an immunosuppressive immune microenvironment.

      Weaknesses:

      While the data is consistent and the conclusions are mostly supported and justified, the findings overall are incremental and of limited novelty. The role of TNF and NF-kB signaling in tumor progression and the role of the CCL2-CCR2 axis in shaping the immunosuppressive microenvironment are well established.

      On the other hand, it is unclear why the authors decided to focus on the basal compartment when there is a wealth of literature suggesting that luminal cells are, if not exclusively, surely one of the cells of origin of prostate cancer and responsible for recurrence upon antiandrogen treatment. As a result, most of the data shown later has to be taken with caution, as it is not known if the same phenomena occur in the luminal compartment.

    4. Reviewer #2 (Public Review):

      Summary:

      In this study, Sha and Zhang et al. reported that androgen deprivation therapy (ADT) induces a switch to a basal-stemness status, driven by the TNF-CCL2-CCR2 axis. Their results also reveal that enhanced CCL2 coincides with increased macrophages and decreased CD8 T cells, suggesting that ADT resistance may be related to the TNF/CCL2/CCR2-dependent immunosuppressive tumor microenvironment (TME). Overall, this is a very interesting study with a significant amount of data.

      Strengths:

      The strengths of the study include various clinically relevant models, cutting-edge technology (such as single-cell RNA-seq), translational potential (TNF and CCR2 inhibitors), and novel insights connecting stemness lineage switch to an immunosuppressive TME. Thus, I believe this work would be of significant interest to the field of prostate cancer and journal readership.

      Weaknesses:

      (1) One of the key conclusions/findings of this study is the ADT-induced basal-stemness lineage switch driving ADT resistance. However, most of the presented evidence supporting this conclusion only selects a couple of marker genes. What exacerbates this issue is that different basal-stemness markers were often selected with different results. For example, Figure S1A uses CD166/EZH2 as markers, while Figure S1B uses ITGb1/EZH2. In contrast, Figure 1D uses Sca1/CD49, and Figure 2B-C uses CD49/CD166. Since many basal-stemness lineage gene signatures have been previously established, the study should examine various basal-stemness gene signatures rather than a couple of selected markers. Moreover, why were none of the stemness/basal-gene signatures significantly changed in the GO enrichment analysis in Figure 6A/B?

      (2) A related weakness is the lack of functional results supporting the stemness lineage switch. Although the authors present colony formation assay results, these could be influenced simply by promoted cell proliferation, which is not a convincing indicator of stemness. To support this key conclusion, widely accepted stemness assays, such as the prostasphere formation assay (in vitro) and Extreme Limiting Dilution Analysis (ELDA) xenograft assay (in vivo), should be carried out.

      (3) Another significant concern is that this study uses co-occurrence to argue for a causal relationship in many key results, although the two are entirely different. For example, Figure S4A and S4B only show increased CCL2 and TNF secretion simultaneously, which cannot support that CCL2 is dependent on TNF. Similarly, Figure 5A only shows that CCL2 increased coincidently with a rise in TNF, which cannot support a causal relationship. To support the causal relationship of this conclusion, it is necessary to show that TNF-KO/KD would abolish the increased CCL2 secretion.

      (4) Some of the selective data presentations are not explained and are difficult to understand. For example, why does CD49 staining in Figure S3A have data for all four time points, while CD166 in Figure S3D only has data for the last time point (day 21)? Similarly, although several TNF_UP gene signatures were highlighted in Figure 4B, several TNF_DN signatures were also enriched in the same table, such as RUAN_RESPONSE_TO_TNF_DN. What is the explanation for these contrasting results?

    5. Reviewer #3 (Public Review):

      Summary:

      The current manuscript evaluates the role of TNF in promoting AR-targeted therapy regression and subsequent resistance through CCL2 and TAMs. The current evidence supports a correlative role for TNF in promoting cancer cell progression following AR inhibition. Weaknesses include a lack of descriptive methodology for the pre-clinical GEM model experiments, and it is not well defined which cell types are impacted in this pre-clinical model, which will be quite heterogeneous with regard to cancer, normal, and microenvironment cells.

      Strengths:

      (1) Appropriate use of pre-clinical models and GEM models to address the scientific questions.

      (2) Novel finding of TNF and interplay of TAMs in promoting cancer cell progression following AR inhibition.

      (3) Potential for developing novel therapeutic strategies to overcome resistance to AR blockade.

      Weaknesses:

      (1) There is a lack of description regarding the GEM model experiments - the age at which mice experiments are started.

      (2) Tumor volume measurements are provided but in this context, there is no discussion on how the mixed cancer and normal epithelial and microenvironment is impacted by AR therapy which could lead to the subtle changes in tumor volume.

      (3) There are no readouts for target inhibition across the therapeutic pre-clinical trials or dosing time courses.

      (4) The terminology of regression and resistance appears arbitrary. The data seems to demonstrate a persistence of significant disease that progresses, rather than a robust response with minimal residual disease that recurs within the primary tumor.

      (5) It is unclear if the increase in basal-like stem cells is from normal basal cells or cancer cells with a basal stem-like property.

      (6) In the Hi-MYC model, MYC expression is regulated by AR inhibition and is profoundly ARi responsive at early time points.

    1. eLife assessment

      This important work introduces a method to express fluorogenic DNA aptamers in E. coli, paving the way for genetically encoded fluorescent DNA. The evidence supporting the conclusions is solid, consisting of comparisons of the aptamer's activity in vitro and within bacterial cells. The advancement described in this study is likely to become a standard technique in the DNA aptamer field, and the work will be of interest and utility to researchers in synthetic biology, molecular imaging, and bacterial genetics fields.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors use an interesting expression system called a retron to express single-stranded DNA aptamers. Expressing DNA as a single-stranded sequence is very hard - DNA is naturally double-stranded. However, the successful demonstration by the authors of expressing Lettuce, which is a fluorogenic DNA aptamer, allowed visual demonstration of both expression and folding. This method will likely be the main method for expressing and testing DNA aptamers of all kinds, including fluorogenic aptamers like Lettuce and future variants/alternatives.

      Strengths:

      This has an overall simplicity which will lead to ready adoption. I am very excited about this work. People will be able to express other fluorogenic aptamers or DNA aptamers tagged with Lettuce with this system.

      Weaknesses:

      Several things are not addressed/shown:

      (1) How stable are these DNA in cells? Half-life?

      (2) What concentration do they achieve in cells/copy numbers? This is important since it relates to the total fluorescence output and, if the aptamer is meant to bind a protein, it will reveal if the copy number is sufficient to stoichiometrically bind target proteins. Perhaps the gels could have standards with known amounts in order to get exact amounts of aptamer expression per cell?

      (3) Microscopic images of the fluorescent E. coli - why are these not shown (unless I missed them)? It would be good to see that cells are fluorescent rather than just showing flow sorting data.

      (4) I would appreciate a better Figure 1 to show all the intermediate steps in the RNA processing, the subsequent beginning of the RT step, and then the final production of the ssDNA. I did not understand all the processing steps that lead to the final product, and the role of the 2'OH.

      (5) I would like a better understanding or a protocol for choosing insertion sites into MSD for other aptamers - people will need simple instructions.

      (6) Can the gels be stained with DFHBI/other dyes to see the Lettuce as has been done for fluorogenic RNAs?

      (7) Sometimes FLAPs are called fluorogenic RNA aptamers - it might be good to mention both terms initially since some people use fluorogenic aptamer as their search term.

      (8) What E. coli strains are compatible with this retron system?

      (9) What steps would be needed to use in mammalian cells?

      (10) Is the conjugated RNA stable and does it degrade to leave just the DNA aptamer?

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript explores a DNA fluorescent light-up aptamer (FLAP) with the specific goal of comparing activity in vitro to that in bacterial cells. In order to achieve expression in bacteria, the authors devise an expression strategy based on retrons and test four different constructs with the aptamer inserted at different points in the retron scaffold. They only observe binding for one scaffold in vitro, but achieve fluorescence enhancement for all four scaffolds in bacterial cells. These results demonstrate that aptamer performance can be very different in these two contexts.

      Strengths:

      -Given the importance of FLAPs for use in cellular imaging and the fact that these are typically evolved in vitro, understanding the difference in performance between a buffer and a cellular environment is an important research question.

      -The retron strategy utilized by the authors is thoughtful and well-described.

      -The observation that some aptamers fail to show binding in vitro but do show enhancement in cells is interesting and surprising.

      Weaknesses:

      -This study hints toward an interesting observation, but would benefit from greater depth to more fully understand this phenomenon. Particularly challenging is that FLAP performance is measured in vitro by affinity and in cells by enhancement, and these may not be directly proportional. For example, it may be that some constructs have much lower affinity but a greater enhancement and this is the explanation for the seemingly different performance.

      -The authors only test enhancement at one concentration of fluorophore in cells (and this experimental detail is difficult to find and would be helpful to include in the figure legend). This limits the conclusions that can be drawn from the data and limits utility for other researchers aiming to use these constructs.

      -The FLAP that is used seems to have a relatively low fluorescence enhancement of only 2-3 fold in cells. It would be interesting to know if this is also the case in vitro. This is lower than typical FLAPs and it would be helpful for the authors to comment on what level of enhancement is needed for the FLAP to be of practical use for cellular imaging.

    1. eLife assessment

      The authors have developed a valuable approach that employs cell-free expression to reconstitute ion channels into giant unilamellar vesicles for biophysical characterisation. The work is solid and will be of particular interest to those studying ion channels that primarily occur in organelles and are therefore not amenable to be studied by more traditional methods.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors have developed a valuable method based on a fully cell-free system to express a channel protein and integrate it into a membrane vesicle in order to characterize it biophysically. The study presents a useful alternative to study channels that are not amenable to being studied by more traditional methods.

      Strengths:

      The evidence supporting the claims of the authors is solid and convincing. The method will be of interest to researchers working on ionic channels, allowing them to study a wide range of ion channel functions such as those involved in transport, interaction with lipids, or pharmacology.

      Weaknesses:

      The inclusion of a mechanistic interpretation of how the channel protein folds into a protomer or a tetramer to become functional in the membrane would strengthen the study.

    3. Reviewer #2 (Public Review):

      It is challenging to study the biophysical properties of organelle channels using conventional electrophysiology. The conventional reconstitution methods require multiple steps and can be contaminated by endogenous ionophores from the host cell lines after purification. To overcome this challenge, in this manuscript, Larmore et al. describe a fully synthetic method to assay the functional properties of the TRPP channel family. The TRPP channels are an important organelle ion channel family that natively traffic to primary cilia and ER organelles. The authors utilized cell-free protein expression and reconstitution of the synthetic channel protein into giant unilamellar vesicles (GUVs), so that the single-channel properties can be measured using voltage-clamp electrophysiology. Using this innovative method, the authors characterized the channels' membrane integration, orientation, and conductance, comparing the results to those of endogenous channels. The manuscript is well-written and may be of broad interest to the ion channel community studying organelle ion channels. Because of the challenges of patching native cilia cells, the functional characterization is highly concentrated in very few labs. This method may provide an alternative approach to investigate other channels resistant to biophysical analysis and pharmacological characterization.

    1. eLife assessment

      In this valuable study, Huffer et al posit that non-cold sensing members of the TRPM subfamily of ion channels (e.g., TRPM2, TRPM4, TRPM5) contain a binding pocket for icilin that overlaps with the one found in the cold-activated TRPM8 channel. By examining a body of TRP channel cryo-EM structures to identify the conserved site, this study presents convincing electrophysiological evidence supporting the identification of an icilin binding pocket within TRPM4. This study shows that icilin has modulatory effects on the TRPM4 channel and will be of direct interest to those working in the TRP-channel field, but it also has implications for studies of somatosensation, taste, as well as pharmacological targeting of the TRPM subfamily.

    2. Reviewer #1 (Public Review):

      In this important study, Huffer et al posit that non-cold sensing members of the TRPM subfamily of ion channels (e.g., TRPM2, TRPM4, TRPM5) contain a binding pocket for icilin which overlaps with the one found in the cold-activated TRPM8 channel.

      The authors identify the residues involved in icilin binding by analyzing the existing TRPM8-icilin complex structures and then use their previously published approach of structure-based sequence comparison to compare the icilin binding residues in TRPM8 to other TRPM channels. This approach uncovered that the residues are conserved in a number of TRPM members: TRPM2, TRPM4, and TRPM5. The authors focus on TRPM4, with the rationale that it has the simplest activation properties (a single Ca2+-binding site). Electrophysiological studies show that icilin by itself does not activate TRPM4, but it strongly potentiates the Ca2+ activation of TRPM4, and introducing the A867G mutation (the mutation that renders avian TRPM8 sensitive to icilin) further increases the potentiating effects of the compound. Conversely, the mutation of a residue that likely directly interacts with icilin in the binding pocket, R901H, results in channels whose Ca2+ sensitivity is not potentiated by icilin.

      The data indicate that, just like in TRPV channels, the binding pockets and allosteric networks might be conserved in the TRPM subfamily.

      The data are convincing, and the authors employ good experimental controls.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors set out to study whether the cooling agent binding site in TRPM8, which is located between the S1-S4 and the TRP domain, is conserved within the TRPM family of ion channels. They specifically chose the TRPM4 channel as the model system, which is directly activated by intracellular Ca2+. Using electrophysiology, the authors characterized and compared the Ca2+ sensitivity and the voltage dependence of TRPM4 channels in the absence and presence of the synthetic cooling agonist icilin. They also analyzed the mutational effects of residues (A867G and R901H; equivalent mutations in TRPM8 were shown to be involved in icilin sensitivity) on Ca2+ sensitivity and voltage-dependence of TRPM4 in the absence and presence of Ca2+. Based on the results as well as structure/sequence alignment, the authors concluded that icilin likely binds to the same pocket in TRPM4 and suggested that this cooling agonist binding pocket is conserved in TRPM channels.

      Strengths:

      The authors gave a very thorough introduction to the TRPM channels. They have nicely characterized the Ca2+ sensitivity and the voltage-dependence of TRPM4 channels and demonstrated icilin potentiates the Ca2+ sensitivity and diminishes the outward rectification of TRPM4. These results indicate icilin modulates TRPM4 activation by Ca2+.

      Weaknesses:

      The reviewer has a few concerns. First, icilin alone (at 25 µM) and in the absence of Ca2+ does not activate the TRPM4 channel. Have the authors titrated a wide range of icilin concentrations (without Ca2+ present) for TRPM4 activation? This raises the question of whether icilin is indeed an agonist for the TRPM4 channel. This has not been tested, so it remains unclear. One may argue that icilin needs Ca2+ as a co-factor for channel activation, just as in the TRPM8 channel. This leads to the second concern, which is a complication in the experimental design and data interpretation. TRPM4 itself requires Ca2+ for activation to begin with, so it is hard to dissect whether the current observed here for TRPM4 is activated by Ca2+ alone or by icilin plus its cofactor Ca2+. This is the difference between TRPM8 and TRPM4: TRPM8 itself is not activated by Ca2+, so TRPM8 activation occurs through icilin, with Ca2+ acting as a prerequisite for icilin activation.

      The results presented in this study are only sufficient to show that icilin modulates the Ca2+-dependent activation of TRPM4 and icilin at best may act as an allosteric modulator for TRPM4 function. One cannot conclude from the current work that icilin is an agonist or even specifically a cooling agonist for TRPM4. Icilin is a cooling agonist for TRPM8, but it does not mean that if icilin modulates TRPM4 activity then it serves as a cooling agonist for TRPM4.

      For the mutation data on A867G (Figure 4A-B, left panels), it looks like A867G has stronger Ca2+ sensitivity than the WT in the absence of icilin and that the onset of current activation is faster than in the WT; alternatively, this may simply reflect a difference in the scale of the data figures between A867G and the WT. Overall, the mutagenesis data are too weak to support the conclusion that icilin binds to the S1-S4 pocket. The authors need to mutate more residues that are involved in direct interaction with icilin based on the available structural information, including but not limited to residues equivalent to Y745 and H845 in human TRPM8.

      The authors set out to study the conservation of the cooling agonist binding site in the TRPM family, but only tested a synthetic cooling agonist, icilin, on TRPM4. In order to draw a conclusion as broad as the title and the discussion claim, the authors need to examine more cooling compounds, including the most well-known natural cooling agonist, menthol, and other cooling agonists such as WS-12 and/or C3, and test their effects on several TRPM channels, not just TRPM4. With the current data, the authors need to significantly tone down the claim of a conserved cooling agonist binding pocket in the TRPM family.

      On page 11, the authors suggest based on the current data, that TRPM2 and TRPM5 may also be sensitive to cooling agonists because the key residues are conserved. TRPM2 is the closest homolog to TRPM8 but is menthol-insensitive. There are studies that attempted to convert menthol sensitivity to TRPM2, for example, Bandell 2006 attempted to introduce S2 and TRP domains from TRPM8 into TRPM2 but failed to make TRPM2 a menthol-sensitive channel. The sequence conservation or structural similarity is not sufficient for the authors to suggest a shared cooling agonist sensitivity or even a common binding site in the TRPM2 and TRPM5 channels. Again, as pointed out above, the authors need to establish the actual activation of other TRPM channels by these agonists first, before proceeding to functionally probe whether other TRPM channels adopt a conserved agonist binding site.

      Taken together, this current work presents data to show the modulatory effects of icilin on the Ca2+ dependent activation and voltage dependence of the TRPM4 channel.

    4. Reviewer #3 (Public Review):

      Summary:

      The family of transient receptor potential (TRP) channels comprises tetrameric cation selective channels that are modulated by a variety of stimuli, most notably temperature. In particular, the Transient receptor potential Melastatin subfamily member 8 (TRPM8) is activated by noxious cold and other cooling agents such as menthol and icilin and participates in cold somatosensation in humans. The abundance of TRP channel structural data that has been published in the past decade demonstrates clear architectural conservation within the ion channel family. This suggests the potential for unifying mechanisms of gating despite their varied modes of regulation, which are not yet understood. To address this question, the authors examine the 264 structures of TRP channels determined to date and observe a potential binding pocket for icilin in multiple members of the Melastatin subfamily, TRPM2, TRPM4, and TRPM5. Interestingly, none of the other Melastatin subfamily members had been shown to be sensitive to icilin apart from TRPM8. Each of these channels is activated by intracellular calcium (Ca2+) and a Ca2+ binding site neighbors the predicted pocket for icilin binding in all cryo-EM structures. The authors examined whether icilin could modulate the activation of TRPM4 in the presence of intracellular Ca2+. The addition of icilin enhances Ca2+-dependent activation of TRPM4, promotes channel opening at negative membrane potentials, and improves the kinetics of opening. Furthermore, mutagenesis of TRPM4 residues within the putative icilin binding pocket that were predicted to enhance or diminish TRPM4 activity elicits these behaviors. Overall, this study furthers our understanding of the Melastatin subfamily of TRP channel gating and demonstrates that a conserved binding pocket observed between TRPM4 and TRPM8 channel structures can function similarly to regulate channel gating.

      Strengths:

      This is a simple and elegant study capitalizing on a vast amount of high-resolution structural information from the TRP family of ion channels to identify a conserved binding pocket that was previously unknown in the Melastatin subfamily, which is interrogated by the authors through careful electrophysiology and mutagenesis studies.

      Weaknesses:

      No weaknesses were identified by this reviewer.

    1. eLife assessment

      This fundamental work provides new mechanistic insight into the regulation of PDGF signaling through splicing controls. The evidence is compelling to demonstrate the involvement of Srsf3, an RNA-binding protein, in this new mechanism. The work will be of broad interest to developmental biologists in general and molecular biologists/biochemists in the field of growth factor signaling and RNA splicing.

    2. Reviewer #1 (Public Review):

      In their manuscript "PDGFRa signaling regulates Srsf3 transcript binding to affect PI3K signaling and endosomal trafficking" Forman and colleagues use iMEPM cells to characterize the effects of PDGF signaling on alternative splicing. They first perform RNA-seq using a one-hour stimulation with Pdgf-AA in control and Srsf3 knockdown cells. While Srsf3 manipulation results in a sizeable number of DE genes, PDGF does not. They then turn to examine alternative splicing, due to previous findings from this lab. They find that both PDGF and Srsf3 contribute much more to splicing than transcription. They find that the vast majority of PDGF-mediated alternative splicing depends upon Srsf3 activity and that skipped exons are the most common events with PDGF stimulation typically promoting exon skipping in the presence of Srsf3. They used eCLIP to identify RNA regions bound to Srsf3. Under both PDGF conditions, the majority of peaks were in exons with +PDGF having a substantially greater number of these peaks. Interestingly, they find differential enrichment of sequence motifs and GC content in stimulated versus unstimulated cells. They examine 2 transcripts encoding PI3K pathway (enriched in their GO analysis) members: Becn1 and Wdr81. They then go on to examine PDGFRa and Rab5, an endosomal marker, colocalization. They propose a model in which Srsf3 functions downstream of PDGFRa signaling to, in part, regulate PDGFRa trafficking to the endosome. The findings are novel and shed light on the mechanisms of PDGF signaling and will be broadly of interest. This lab previously identified the importance of PDGF signaling on alternative splicing. The combination of RNA-seq and eCLIP is an exceptional way to comprehensively analyze this effect. The results will be of great utility to those studying PDGF signaling or neural crest biology. There are some concerns that should be considered, however.

      (1) It took some time to make sense of the number of DE genes across the results section and Figure 1. The authors give the total number of DE genes across Srsf3 control and loss conditions as 1,629 with 1,042 of them overlapping across Pdgf treatment. If the authors would add verbiage to the point that this leaves 1,108 unique genes in the dataset, then the numbers in Figure 1D would instantly make sense. The same applies to PDGF in Figure 1F and the Venn diagrams in Figure 2.

      (2) The percentage of skipped exons in the +PSI on the righthand side of Figure 2F is not readable.

      (3) It would be useful to have more information regarding the motif enrichment in Figure 3. What is the extent of enrichment? The authors should also provide a more complete list of enriched motifs, perhaps as a supplement.

      (4) It is unclear what subset of transcripts represent the "overlapping datasets" on lines 280-315. The authors state that there are 149 unique overlapping transcripts, but the Venn diagram shows 270. Also, it seems that the most interesting transcripts are the 233 that show alternative splicing and are bound by Srsf3. Would the results shown in Figure 5 change if the authors focused on these transcripts?

      (5) In general, there is little validation of the sequencing results beyond qPCR on Arhgap12 and Cep55. The authors should additionally validate the PI3K pathway members that they analyze. Relatedly, is Becn1 expression downregulated in the absence of Srsf3, as would be predicted if it is undergoing NMD?

      (6) What is the alternative splicing event for Acap3?

      (7) The insets in Figure 6 C"-H" are useful but difficult to see due to their small size. Perhaps these could be made as their own figure panels.

      (8) In Figure 6A, it is not clear which groups have statistically significant differences. A clearer visualization system should be used.

      (9) Similarly in Figure 6B, is 15 vs 60 minutes in the shSrsf3 group the only significant difference? Is there a difference between scramble and shSrsf3 at 15 minutes? Is there a difference between 0 and 15 minutes for either group?

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript builds upon the work of a previous study published by the group (Dennison, 2021) to further elucidate the coregulatory axis of Srsf3 and PDGFRa on craniofacial development. The authors in this study investigated the molecular mechanisms by which PDGFRa signaling activates the RNA-binding protein Srsf3 to regulate alternative splicing (AS) and gene expression (GE) necessary for craniofacial development. PDGFRa signaling-mediated Srsf3 phosphorylation drives its translocation into the nucleus and affects binding affinity to different proteins and RNA, but the exact molecular mechanisms were not known. The authors performed RNA sequencing on immortalized mouse embryonic mesenchyme (MEPM) cells treated with shRNA targeting 3' UTR of Srsf3 or scramble shRNA (to probe AS and DE events that are Srsf3 dependent) and with and without PDGF-AA ligand treatment (to probe AS and DE events that are PDGFRa signaling dependent). They found that PDGFRa signaling has more effect on AS than on DE. A matching eCLIP-seq experiment was performed to investigate how Srsf3 binding sites change with and without PDGFRa signaling.

      Strengths:

      (1) The work builds well upon the previous data and the authors employ a variety of appropriate techniques to answer their research questions.

      (2) The authors show that Srsf3 binding pattern within the transcript as well as binding motifs change significantly upon PDGFRa signaling, providing a mechanistic explanation for the significant changes in AS.

      (3) By combining RNA-seq and eCLIP datasets together, the authors identified a list of genes that are directly bound by Srsf3 and undergo changes in GE and/or AS. Two examples are Becn1 and Wdr81, which are involved in early endosomal trafficking.

      Weaknesses:

      (1) The authors identify two genes whose AS are directly regulated by Srsf3 and involved in endosomal trafficking; however, they do not validate the differential AS results and whether changes in these genes can affect endosomal trafficking. In Figure 6, they show that PDGFRa signaling is involved in endosome size and Rab5 colocalization, but do not show how Srsf3 and the two genes are involved.

      (2) The proposed model does not account for other proteins mediating the activation of Srsf3 after Akt phosphorylation. How do we know this is a direct effect (and not a secondary or tertiary effect)?

    1. eLife assessment

      This study provides valuable insights into the influence of sex on bile acid metabolism and the risk of hepatocellular carcinoma (HCC). The data to support that there are inter-relationships between sex, bile acids, and HCC in mice are solid, but for the most part, they are descriptive. At this point, there is not enough evidence to determine the clinical significance of the findings, given the differences in bile acid composition between mice and men.

    2. Reviewer #1 (Public Review):

      Summary:

      Liver cancer shows a higher incidence in males than females with incompletely understood causes. This study utilized a mouse model that lacks the bile acid feedback mechanisms (FXR/SHP DKO mice) to study how dysregulation of bile acid homeostasis and a high circulating bile acid may underlie the gender-dependent prevalence and prognosis of HCC. By transcriptomics analysis comparing male and female mice, unique sets of gene signatures were identified and correlated with HCC outcomes in human patients. The study showed that the ovariectomy procedure increased HCC incidence in female FXR/SHP DKO mice that were otherwise resistant to age-dependent HCC development and that removing bile acids by blocking intestine bile acid absorption reduced HCC progression in FXR/SHP DKO mice. Based on these findings, the authors suggest that gender-dependent bile acid metabolism may play a role in the male-dominant HCC incidence, and that reducing bile acid levels and signaling may be beneficial in HCC treatment.

      Strengths:

      (1) Chronic liver diseases often precede the development of liver and bile duct cancer. Advanced chronic liver diseases are often associated with dysregulation of bile acid homeostasis and cholestasis. This study takes advantage of a unique FXR/SHP DKO model that develops high organ bile acid exposure and spontaneous age-dependent HCC development in males but not females to identify unique HCC-associated gene signatures. The study showed that the unique gene signature in female DKO mice that had lower HCC incidence also correlated with lower-grade HCC and better survival in human HCC patients.

      (2) The study also suggests that differentially regulated bile acid signaling or gender-dependent response to altered bile acids may contribute to gender-dependent susceptibility to HCC development and/or progression.

      Weaknesses:

      (1) HCC shows heterogeneity, and it is unclear what tissues (tumor or normal) were used from the DKO mice and human HCC gene expression dataset to obtain the gene signature, and how the authors reconcile these gene signatures with HCC prognosis.

      (2) The authors identified a unique set of gene expression signatures that are linked to HCC patient outcomes, but analysis of these gene sets to understand the causes of cancer promotion is still lacking. The studies of urea cycle metabolism and estrogen signaling were preliminary and inconclusive. These mechanistic aspects may be followed up in revision or future studies.

      (3) While high levels of bile acids are convincingly shown to promote HCC progression, their role in HCC initiation is not established. The DKO model may be limited to conditions of extremely high levels of organ bile acid exposure. The DKO mice do not model the human population of HCC patients with various etiology and shared liver pathology (i.e. cirrhosis). Therefore, high circulating bile acids may not fully explain the male prevalence of HCC incidence.

      (4) The authors showed lower circulating bile acids and increased fecal bile acid excretion in female mice and hypothesized that this may be a mechanism underlying the lower bile acid exposure that contributed to lower HCC incidence in female DKO mice. Additional analysis of organ bile acids within the enterohepatic circulation may be performed because a more accurate interpretation of the circulating bile acids and fecal bile acids can be made in reference to organ bile acids and total bile acid pool changes in these mice.

    3. Reviewer #2 (Public Review):

      Summary:

      The manuscript of Patton et al. shows that in mice in which both FXR and SHP are knocked out, the sex difference in liver cancer risk is recapitulated. Authors show that the protection against tumor development seen in female mice is dependent upon ovarian hormone secretion and higher fecal bile acid excretion in females compared to males. The female liver-specific gene signature correlates with low-grade tumors and better survival in human HCC patients.

      The combination of double knockout mice with ovariectomy in female mice and a bile acid resin in male mice to support their conclusions is strong. However, there are also some shortcomings that should be addressed.

      Strengths:

      (1) Using computational modelling, Patton and colleagues correlate mouse DKO transcriptome data to the clinical outcomes of HCC patients using HCC transcriptome datasets.

      (2) The dependence of female protection on ovarian hormones and increased fecal bile acid excretion is nicely shown by combining ovariectomy and a bile acid resin with the use of double knockout mice.

      Weaknesses:

      (1) The translational value to human HCC is not yet strong. The authors show that there is a correlation between the female-selective gene signature and low-grade tumors and better survival in HCC patients overall. However, these data do not show whether this signature is more highly correlated with female tumor burden and survival. In other words, it remains unclear whether the mechanisms of female protection are similar between humans and mice. In that respect, it would also be good to elaborate on whether women have higher fecal BA excretion and lower serum BA concentration.

      (2) The authors should perform a thorough spelling and grammar check.

      (3) There are quite a few errors and inaccuracies in the Results section, figures, and legends. The authors should correct these.

    1. eLife assessment

      The findings are useful for understanding the disease's pathology and immune dysregulation, but the evidence is still incomplete regarding whether these immune changes are directly caused by copper metabolism alterations or are secondary to liver dysfunction.

    2. Reviewer #1 (Public Review):

      Summary:

      Wilson's Disease (WD) is a rare inherited pathological condition caused by mutations in ATP7B that alter mitochondrial structure and function. Additionally, WD results in dysregulated copper metabolism in patients. These metabolic abnormalities affect the functions of the liver and can result in cholecystitis. Understanding the immune component and its contribution to WD and cholecystitis has been challenging. In this work, the authors have performed single-cell RNA sequencing of mesenchymal tissue from three WD patients and three liver hemangioma patients.

      Strengths:

      The authors describe the transcriptomic alterations in myeloid and lymphoid compartments.

      Weaknesses:

      In brief, this manuscript lacks a clear focus, and the writing needs vast improvement. Figures lack details (or are misrepresented), the results section only catalogs observations, and the discussion needs to focus on their findings' mechanistic and functional relevance. The major weakness of this manuscript is that the authors do not provide a mechanistic link between the absence of ATP7B and NK cells' impaired/altered functions. While the work is of high clinical relevance, there are various areas that could be improved.

    3. Reviewer #2 (Public Review):

      Summary:

      Wilson's disease is a rare genetic disorder caused by mutations in the ATP7B gene. Previous studies have documented that ATP7B mutations can disrupt copper metabolism, affecting brain and liver function. In this paper, the authors performed a retrospective clinical study and found that Wilson's disease has a high incidence of cholecystitis. Single-cell RNA-seq analysis revealed changes in the immune microenvironment, including the activation of immune responses and the exhaustion of natural killer cells.

      Strengths:

      A key finding of this study is that the predominant ATP7B gene mutation in the Chinese population is the 2333G>T (p. R778L) mutation. The authors reported associations between Wilson's disease and cholecystitis, as well as the exhaustion of natural killer cells.

      Weaknesses:

      The underlying mechanisms linking ATP7B mutations to cholecystitis and natural killer cell exhaustion remain unclear. Specifically, it is not yet determined whether copper metabolism alterations directly cause cholecystitis and natural killer cell exhaustion, or if these effects are secondary to liver dysfunction.

    1. eLife assessment

      This study investigates BMP signaling mechanisms in the developing chick cerebellum to better understand germinal layer formation, cellular amplification and neuronal differentiation. The data from human tissue is compelling and lends support to the possible links of these processes to medulloblastoma, although this study does raise exciting questions regarding the generalized role of BMP signaling during normal development and malignant growth. Overall, this is an important study with beautifully presented findings.

    2. Reviewer #1 (Public Review):

      Summary:

      Rook et al examined the role of BMP signaling in cerebellum development, using chick as a model alongside human tissue samples. They first examined p-SMADs and found differences between the species, with human samples retaining high p-SMAD after foliation, while in chick, BMP signaling appears to decrease following foliation. To understand the role of BMP during early development, they then used early chick embryos to modulate BMP, using either a constitutively active BMP regulator to increase BMP signaling or overexpressing the negative intracellular BMP regulator to decrease BMP signaling. After validating the constructs in ovo, the authors then examined GNP morphology and migration. They then determined whether the effects were cell autonomous.

      Strengths:

      The experiments were well-designed and well-controlled. The figures were extremely clear and convincing, and the accompanying drawings help orient the reader to easily understand the experimental set up. These studies also help clarify the role of BMP at different stages of cerebellum development, suggesting early BMP signaling is required for dorsalization, not rhombic lip induction, and that later BMP signaling is needed to regulate the timing of migration and maturation of granule neurons.

      Weaknesses:

      While these studies certainly hint that BMP modulation may affect tumor growth, this was not explicitly tested here. Future studies are required to generalize the functional role of BMP signaling in normal cerebellum development to malignant growth.

    3. Reviewer #2 (Public Review):

      Summary:

      This is a fundamental and elegant study showing the role of BMP signaling in cerebellar development. This is an important question because there are multiple diseases, including aggressive childhood cancers, which involve granule cell precursors. Thus understanding of the factors that govern the formation of the granule cell layer is important both from a basic science and a disease perspective.

      Overall, the manuscript is clear and well-written. The figures are extremely clear, wonderfully informative, and overall quite beautiful.

      Figures 1-3 show the experimental design and report how BMP activity is altered over development in both the chick and the human developing cerebellum. Both datasets are very impressive and convincing.

      They then go on to modulate BMP activity in the developing chick, using a complex electroporation paradigm that allows them to label cells with GFP as well as with cell-specific reporters of BMP activity levels. They bidirectionally modulate BMP levels and then can look at both cell-specific and non-specific alterations in the formation of the external and internal granule cell layer, across different developmental timepoints. These are really elegant and rigorous experiments, as they look at both sagittal and transverse sections to collect this data. This makes the data extremely compelling. With these rigorous techniques, they show that BMP signaling serves more than one function across development: it is involved in the initial tangential migration from the rhombic lip, but at a later time, both up- and down-regulation of BMP activity reduces density of amplifying cells in the external granule cell layer.

      Strengths:

      Overall, I think the paper is interesting and important and the data is strong. The use of both chick and human tissue strengthens the findings. They are extremely rigorous, analyzing data from multiple planes at multiple ages, which also really strengthens their findings. The dual electroporation approach is extremely elegant, providing beautiful visual representations of their findings.

      Weaknesses:

      I find no significant weaknesses.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1.1) I thought the manuscript was very clear. While I realize the authors included the reference to medulloblastoma in the introduction based on previous reviewer comments, I think this speculation is better left in the discussion.

      Whilst we appreciate the reviewer's feedback here, we felt it was important to include a reference to medulloblastoma and developmental disorders associated with the cerebellum to put this work into a broader context.

      We removed the sentence “Medulloblastoma can be a consequence of uncontrolled proliferation of granule cell progenitors, with BMP overexpression being a potential therapeutic avenue to inhibit this proliferation” to limit the speculation in this statement.

      (1.2) line 81: It would be better to cite the 2 original papers (Hendrikes et al 2022, Smith et al 2022) rather than the Phoenix commentary article. I'm not sure the Phoenix article needs to be cited at all within this paper.

      We have cited the two suggested papers and removed the citation to Phoenix et al.

      (1.3) line 102: confusing sentence with the unexpected separation of do and not: "the same conditional deletions of BMP pathway elements that fail to block early granule cell specification at the rhombic lip do result not in a larger cerebellum as might be expected, but either have no affect".

      We thank the reviewer for pointing out this error and have corrected the text to “do not result in a larger cerebellum”.

      (1.4) line 133: inconsistent acronyms (for example, W9 vs pcw9).

      This has been corrected to PCW in all occurrences.

      (1.5) line 139: coronal vs transverse? it seems like you show transverse sectioning but refer to it as coronal in the text.

      We thank the reviewer for highlighting this and have corrected the text to “transverse”.

      (1.6) fig 2C: would it be possible to provide a similar inset as 2D?

      We thank the reviewer for this suggestion and have added the insets in 2C. We agree that this is now clearer and more consistent with the rest of the figure.

      (1.7) line 368/369/435/436 missing arrows.

      The arrows have been re-added; it appears that they did not show up in the uploaded PDF.

      (1.8) line 517 missing word: rhombic-lip-derived.

      This typo has been corrected.

      Reviewer #2 (Public Review):

      (2.1) Fig. 3 M Why are there asterisks both above and below the brackets?

      This was a formatting error that has now been corrected.

      (2.2) Fig. 8. The arrows (BMP up and BMP down) are touching the right ")" in the figure, which makes it hard to read.

      This was also a formatting issue which has been corrected.

      (2.3) Fig. 4 and 8 legends. There are spaces in the text which I believe are for arrows to be inserted "(BMP )", but the arrows have been omitted in the PDF that I read.

      This is the same as Reviewer 1's comment. The arrows have been re-added to the text; their absence appears to have been an issue with the PDF upload.

      (2.4) Fig. 3 legend gets very hard to read at the end, where it seems some punctuation is missing.

      We have re-worded the legend for Fig. 3 to make it easier to read.

      (2.5) The number of significant figures in some of the text is probably too high given the accuracy with which these values can be measured.

      We appreciate the reviewer's concerns here; however, these were added in response to the original reviewer's request to "provide some additional support to otherwise qualitative observations".

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      More details should be provided in terms of inclusion and exclusion criteria for the participants, as well as missing data due to the non-cooperation of newborns during the experimental process. Potential differences between preterm and full-term infants are worth exploring. Several aspects of EEG data analyses and data interpretation should be better clarified.

      Here I have several comments and questions to improve the manuscript.

      (1) It would be wise to know whether there was any missing data due to the non-cooperation of newborns during the experimental process.

      Thank you for the suggestion. While our initial aim was to include 120 neonates in the final data analysis, we actually recruited 198 neonatal participants for this study. The remaining 78 EEG datasets were excluded from the data analysis due to non-cooperation of neonates (n = 75) or technical issues (n = 3). We have incorporated this detailed information in the Subjects subsection (lines 375-383) in the revised manuscript.

      (2) The authors investigated the impact of gestational age on emotional perceptual sensitivity in newborns by grouping infants of varying gestational ages in the experiment. The methods section mentions that the study conducted experiments within 24 hours after the birth of the newborns. When do preterm infants (with a gestational age of 35 and 36 weeks) begin to exhibit emotional discrimination comparable to full-term newborns? 

      This is indeed an intriguing question that merits exploration. However, in our study, we recruited relatively healthy preterm neonates, many of whom were discharged from the hospital with their mothers within 3-5 days after birth. It would have been challenging to arrange for another EEG testing session once these preterm infants reached full-term age, as their parents were unwilling to return to the hospital.

      (3) When analyzing EEG data, excluding artifacts with peak deviations exceeding ±200 μV is a relatively lenient criterion, potentially resulting in the retention of some large-amplitude artifacts or noise. What is the rationale behind the author's choice of this criterion? Or, in other words, what considerations led to this specific selection?

      In our standard practice, we typically employ a stricter threshold of ±100 μV for artifact removal in studies involving healthy adults and an intermediate threshold of ±150 μV for data from adult patients, such as those with schizophrenia. However, when analyzing neonatal data, we often resort to the loosest criterion of ±200 μV. This decision is primarily due to the inherent challenges associated with neonatal EEG recordings, as we cannot expect newborns to cooperate or remain quiet during the recording process. Consequently, neonatal EEG data tend to contain more artifacts than those from healthy adults. Furthermore, the excitability of the newborn brain is notably elevated. This heightened excitability arises from an imbalance in the distribution and function of excitatory and inhibitory neurotransmitter systems. Typically, the expression of excitatory neurotransmitters and their receptors surpasses that of inhibitory neurotransmitters, resulting in increased excitability in the immature brain. This heightened excitability can occasionally lead to the occurrence of paroxysmal electrical activity. As a result, neonatal EEG recordings may at times display large amplitudes, exceeding even 100 μV. In this revision, we have referenced other neonatal/infant EEG studies or technique pipelines that have used the threshold of ±200 μV to support this criterion (lines 483-484).
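
      To make the three thresholds discussed above concrete, the following is a minimal sketch of peak-deviation artifact rejection. It is not the authors' pipeline: the function names, the toy amplitudes, and the assumption that epochs are already baseline-corrected (so that deviation is measured from zero) are all hypothetical.

```python
# Hypothetical sketch of peak-deviation artifact rejection for epoched EEG.
# Assumption: each epoch is baseline-corrected, so deviation is measured from 0.
# An epoch is a list of channels; a channel is a list of samples in microvolts.

def exceeds_threshold(epoch, threshold_uv):
    """Return True if any sample in any channel lies beyond +/- threshold_uv."""
    return any(abs(sample) > threshold_uv for channel in epoch for sample in channel)


def reject_artifacts(epochs, threshold_uv=200.0):
    """Keep only the epochs whose absolute amplitude never exceeds the threshold."""
    return [epoch for epoch in epochs if not exceeds_threshold(epoch, threshold_uv)]


# Toy data invented for illustration: three epochs, two channels each.
epochs = [
    [[12.0, -35.0, 80.0], [5.0, -10.0, 20.0]],    # peak of 80 uV
    [[40.0, 130.0, -60.0], [15.0, -25.0, 30.0]],  # peak of 130 uV
    [[20.0, -250.0, 90.0], [10.0, 5.0, -15.0]],   # peak of 250 uV
]

# Compare how many epochs survive under each criterion named in the response.
for threshold in (100.0, 150.0, 200.0):
    kept = reject_artifacts(epochs, threshold)
    print(f"+/-{threshold:.0f} uV keeps {len(kept)} of {len(epochs)} epochs")
```

      In this toy example the looser criteria retain the 130 μV epoch that the ±100 μV criterion discards, which is the trade-off the reviewer asks about: more retained trials at the cost of admitting larger-amplitude activity.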

      (4) In the Discussion section, the authors mentioned the biomarkers, such as the fusiform gyrus and hippocampus, which have been identified as potential predictors of autism risk. It is suggested that the authors briefly elucidate the crucial role of these biomarkers in processing social information, which would enhance the readability and logicality of this manuscript.

      Thank you for the thoughtful suggestion. We have expanded the discussion concerning the involvement of the fusiform gyrus and hippocampus in social information processing (lines 314-319).

      Reviewer #2 (Public Review):

      First, readers need to see spectrograms that show the 0-4000 Hz in more detail, rather than what is now shown (0-10,000 Hz). The vocal signals in clearer spectrograms will show I believe the initial consonant burst and formant frequencies that are unique to human speech and give rise to the perception of the consonant sounds in the vocal signals like 'dada' and 'tutu' that were tested. The control signals will presumably not show these abrupt acoustic changes at their onset, even though they appear (from the oscillograms) to approximate the amplitude envelope. The primary cue distinguishing the happy and neutral signals in both the vocal and control signals is the pitch of the signals (high vs low), but the burst of energy representing the consonants is only contained in the vocal signals; it has no comparable match in the control signals. It is possible that the presence of a sharp acoustic onset (a unique characteristic of consonants in human speech) is especially alerting to the infants, and that this acoustic cue, in the context of the pitch change, enhances discrimination in the vocal case. One way to test this would be to use only vowel sounds to represent the vocal signals, without consonants.

      Thank you for your expert comments and considerations. We have redrawn Figure 3 using Praat software with a frequency range of 0-5000 Hz, as suggested by Praat’s default parameters. Based on the spectrograms, we acknowledge the potential role of consonants in accounting for differences in stimuli. Consequently, we have included this consideration as one of the limitations of our study in this revised version (lines 325-330).

      Another critical detail that the authors need to include about the signals is an explanation of how the control signals were generated. The text states that the Fo and amplitude envelope of the vocal signals were mimicked in the control signals, but what was the signal used for the controls? Was a pure tone complex modulated, or was pink noise used to generate the control signals? Or were the original vocal signals simply filtered in some way to create the controls, which would preserve the Fo and amplitude envelope? If merely filtered, the control signals still may be perceived as 'vocal' signals, rather than as nonspeech (the Supplement contains the sounds, and some of the control sounds can be perceived, to my ear, as 'vocal' signals).

      We sincerely appreciate your attention to detail regarding the generation of control signals. As our laboratory does not specialize in audio editing, our approach involved filtering the original vocal sounds around the fundamental frequency (f0) and ensuring a balanced mean intensity between vocal and nonvocal stimuli (as now stated in lines 432-437). However, it became evident that certain “vocal” components persisted in the control sounds, and these were particularly noticeable in the sound “tutu”. In this revision, we openly acknowledge this oversight (lines 331-333). We extend our gratitude once again for highlighting the importance of meticulous consideration when generating control sounds for a study.

      Second, there is no information in the manuscript or supplement about the auditory environment of the participants, nor discussion of the fetus' ability to hear in the womb. In the womb, infants are listening to the mothers' bone-conducted speech (which is full of consonant sounds), and we know from published studies that infants can discern differences not only in the prosody of the speech they hear in the womb, but the phonetic characteristics of the mother's speech. The ability at 37 weeks GA or beyond to discriminate the pitch changes in the vocal, but not control signals, could thus be due to additional experience in utero to speech. Another experiential explanation is that the infants born at 37 weeks GA and beyond may be exposed to greater amounts of speech after birth, when compared to those born at 35 and 36 weeks GA, from the attending nurses and from their caregivers, and this speech is also full of consonant sounds. What these infants hear is likely to be 'infant-directed speech,' which is significantly higher in pitch, mirroring the signals tested here. At 37 weeks GA, infants are likely more robust, may sleep less, and are likely more alert. If infants' exposure to speech, either after birth, or their auditory ability to discern differences in speech in utero, is enhanced at 37 weeks GA and beyond, then an 'experience-related' explanation is a viable alternative to a maturational explanation, and should be discussed. Perhaps both are playing a role. As the authors state, many more signals need to be tested to discern how the effect should be interpreted, and other viable interpretations of the current results discussed.

      We acknowledge the importance of considering the auditory environment of participants and the fetus' ability to hear in the womb. In our study, neonates were exposed to a native language environment both before and after birth (as added in lines 385-386), and we made efforts to minimize their exposure to speech stimuli other than those used in the experiment. Specifically, all neonates participated in the experiment and underwent EEG recording within the first 24 hours after birth (lines 386-387). They were promptly transported to a dedicated testing room for EEG recording as soon as their condition stabilized after birth. During recording sessions, they were separated from their mothers to minimize exposure to natural speech (as added in lines 459-461). As a result, we believe that both preterm and term neonates were exposed to comparable amounts of speech after birth and before the experiment. We also ensured that all participants were in a natural sleep state during EEG recording. However, it is possible that term neonates slept less and were more attentive to the limited speech stimuli in their environment before the experiment compared to preterm newborns.

      The debate surrounding nature versus nurture in neonate and infant development persists. We recognize the potential impact of prenatal auditory experiences on neonatal perceptual sensitivity. Therefore, we have added a brief discussion regarding innate- or experience-related explanations for emotional prosodic discrimination in neonates, aiming to shed light on future research directions (lines 343-351).

    2. eLife assessment

      This is an important study on changes in newborns' neural abilities to distinguish auditory signals at 37 weeks of gestation. The evidence of change in neural discrimination as a function of gestational age is convincing, but further analysis of the acoustic signals and control of the infants' language environment is necessary for the results to be used in clinical applications. The work contributes to the field of neurodevelopment.

    3. Reviewer #1 (Public Review):

      Summary:

      This manuscript aimed to investigate the emergence of emotional sensitivity and its relationship with gestational age. Using an oddball paradigm and event-related potentials, the authors conducted an experiment in 120 healthy neonates with a gestational age range of 35 to 40 weeks. A significant developmental milestone was identified at 37 weeks gestational age, marking a crucial juncture in neonatal emotional responsiveness.

      Strengths:

      This study has several strengths, providing profound insights into the early development of social-emotional functioning and unveiling the role of gestational age in shaping neonatal perceptual abilities. The methodology of this study demonstrates rigor and well-controlled experimental design, particularly involving matched control sounds, which enhances the reliability of the research. Their findings not only contribute to the field of neurodevelopment, but also showcase potential clinical applications, especially in the context of autism screening and early intervention for neurodevelopmental disorders.

      Comments on the revised version:

      After reviewing the authors' response letter and the revised manuscript, I believe they have done a commendable job in addressing my comments.

      Additionally, I concur with the concerns raised by Reviewer #2 regarding several potential confounding factors that require better control in their experimental design. These include the differences in physical properties between vocal and nonvocal stimuli, as well as the infant's exposure to the speech/auditory environment. These concerns should be thoroughly and explicitly discussed in the manuscript, ensuring a clearer understanding for the readers.

    4. Reviewer #2 (Public Review):

      This is an important and very interesting report on a change in newborns' neural abilities to distinguish auditory signals as a function of the gestational age (GA) of the infant at birth (from 35 weeks GA to 40 weeks GA). The authors tested neural discrimination of sounds that were labeled 'happy' vs 'neutral' by listeners that represent two categories of sound, either human voices or auditory signals that mimic only certain properties of the human vocal signals. The finding is that a change occurs in neural discrimination of the happy and neutral auditory signals for infants born at or after 37 weeks of gestation, and not prior (at 35 or 36 weeks of gestation), and only for discrimination of the human vocal signals; no change occurs in discrimination of the nonhuman signals over the 35- to 40-week gestational ages tested. The neural evidence of discrimination of the vocal happy-neutral distinction and the absence of the discrimination of the control signals is convincing. The authors interpret this as a 'landmark' in infants' ability to detect changes in emotional vocal signals, and remark on the potential value of the test as a marker of the infants' interest in emotional signals, underscoring the fact that children at risk for autism spectrum disorder may not show the discrimination. Although the finding is novel and interesting, additional discussion is essential so that readers understand two potential caveats affecting this interpretation.

      Comments on the revised version:

      The revised manuscript does discuss the limitations of the control stimuli, as well as the limitations with regard to conclusions that can be drawn from this data set. I therefore expected the authors to temper somewhat their recommendation that this could be a 'screening' signal for autism, because these data are not sufficiently strong to make that recommendation. In the same vein, perhaps the title might be adjusted to suggest less certainty, for example by using the word "change" rather than "milestone"? The data are of interest, but the limitations are genuine limitations.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      It is unclear to us why you did not adjust the title to better reflect the well-supported claims of the paper, i.e., that this is a valuable model for human loss-of-function mutations in IQCH.

      Thanks for the editor’s suggestion. We have changed the title to “Deficiency of IQCH causes male infertility in humans and mice.” Additionally, we have provided the original images of the gels or blots as a zipped folder.

    2. eLife assessment

      This valuable study describes mice with a knock out of the IQ motif-containing H (IQCH) gene, to model a human loss-of-function mutation in IQCH associated with male sterility. While the evidence for interaction between IQCH and potential RNA binding proteins is limited, the human infertility is reproduced in the mouse, making it a compelling model. The paper could be of interest to cell biologists and male reproductive biologists working on the sperm flagellar cytoskeleton and mitochondrial structure.

    3. Reviewer #3 (Public Review):

      In this study, Ruan et al. investigate the role of the IQCH gene in spermatogenesis, focusing on its interaction with calmodulin and its regulation of RNA-binding proteins. The authors examined sperm from a male infertility patient with an inherited IQCH mutation as well as Iqch CRISPR knockout mice. The authors found that both human and mouse sperm exhibited structural and morphogenetic defects in multiple structures, leading to reduced fertility in Iqch-knockout male mice. Molecular analyses such as mass spectrometry and immunoprecipitation indicated that RNA-binding proteins are likely targets of IQCH, with the authors focusing on the RNA-binding protein HNRPAB as a critical regulator of testicular mRNAs. The authors used in vitro cell culture models to demonstrate an interaction between IQCH and calmodulin, in addition to showing that this interaction via the IQ motif of IQCH is required for IQCH's function in promoting HNRPAB expression. In sum, the authors concluded that IQCH promotes male fertility by binding to calmodulin and controlling HNRPAB expression to regulate the expression of essential mRNAs for spermatogenesis. These findings provide new insight into molecular mechanisms underlying spermatogenesis and how important factors for sperm morphogenesis and function are regulated.

      The strengths of the study include the use of mouse and human samples, which demonstrate a likely relevance of the mouse model to humans; the use of multiple biochemical techniques to address the molecular mechanisms involved; the development of a new CRISPR mouse model; ample controls; and clearly displayed results. Assays are done rigorously and in a quantitative manner. Overall, the claims made by the authors in this manuscript are well-supported by the data provided.

    1. eLife assessment

      In this important study, the authors explore ER stress signaling mediated by ATF6 using a genome-wide gene depletion screen. They find that the ER chaperone Calreticulin binds and directly represses ATF6, a new and intriguing function for Calreticulin. The evidence presented is convincing, based on CHO genetics and biochemical analysis.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Tung and colleagues identify Calreticulin as a repressor of ATF6 signaling using a CRISPR screen and characterize the functional interaction between ATF6 and CALR.

      Strengths:

      The manuscript is well written and interesting with an innovative experimental design which provides some new mechanistic insight into ATF6 regulation as well as crosstalk with the IRE1 pathway. The methods used were fit for purpose and reasonable conclusions were drawn from the data presented.

      Comments on latest version:

      The authors did a good job of addressing my comments, even though they found several aspects to exceed the scope of the work. The manuscript is clearer now and the model proposed by the authors is better supported by the data. One point on which I am curious about the authors' opinion is the status of ATF6alpha activation in pathological cells in which CALR is mutated (e.g., myeloproliferative neoplasms), although this challenges neither the conclusions of the manuscript nor my positive opinion of the work.

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The authors explore ER stress signalling mediated by ATF6 using a genome-wide gene depletion screen. They find that the ER chaperone Calreticulin binds and directly represses ATF6; this proposed function for Calreticulin is intriguing and constitutes an important finding. The evidence presented is based on CHO genetic evidence and biochemical results and is convincing. 

      We thank the editors for their favourable assessment of our work.

      Reviewer #1 (Public Review): 

      Summary: 

      In this manuscript, Tung and colleagues identify Calreticulin as a repressor of ATF6 signalling using a CRISPR screen and characterize the functional interaction between ATF6 and CALR. 

      Strengths: 

      The manuscript is well written and interesting with an innovative experimental design that provides some new mechanistic insight into ATF6 regulation as well as crosstalk with the IRE1 pathway. The methods used were fit for purpose and reasonable conclusions were drawn from the data presented. Findings are novel and bring together glycoprotein quality control and activation of one sensor of the UPR. This is a novel perspective on how the integration of ER homeostasis signals could be sensed in the ER. 

      We thank the reviewer for their favourable assessment of our work.

      Weaknesses: 

      Several points remain to be documented to support the authors' model. 

      Major comments 

      (1) It is interesting that BiP, PDIs, and COPII are not identified in the screen. Might this indicate some bias in the system perhaps limiting its sensitivity or pleiotropic effects of the reporter? 

      The reviewer raises a valid concern. Our CRISPR screen aimed to identify genes that selectively modulate ATF6⍺. Therefore, we excluded from consideration genes whose inactivation had effects on the broader ER environment. This would disfavour the selection of genes encoding BiP, PDI and COPII components. Additionally, a positive selection screen inherently removes essential genes like BiP. The absence of COPII components among the hits could be due to essentiality, or because those components are not strong selective modulators of ATF6⍺ activation; stronger ATF6⍺ modulators such as S1P, S2P and transcription factor S2P and NFY were among our top hits. Cell type specificity may also play a role. For example, ERp18, a small PDI previously implicated in ATF6⍺ activation (Oka et al 2019; PMID: 31368601), was not among the hits, despite the presence of sgRNAs targeting hamster ERp18 in the library. Interestingly, depletion of ERp18 in our dual UPR reporter CHO-K1 cell line did not affect the ATF6⍺ and IRE1⍺ UPR branches. This new information has been incorporated into the revised manuscript as Supplemental Figure S6E and the discussion has been edited in line with these comments.

      (2) CALR interacts with ATF6 independently of ATF6 glycans (and cysteines). How do the authors reconcile this observation with the lectin functions of CALR? What is the interaction mode then - if the CALR N (lectin) domain is not involved, is it the P domain that is responsible for the interaction? All the binding experiments are performed in the presence of 1 mM CaCl2; is calcium necessary for CALR to achieve binding?

      These points merit clarification. The Biolayer Interferometry (BLI) assay reported on an interaction between ATF6 and CRT that is independent of ATF6⍺ glycans. However, cell-based experiments revealed a contribution of glycan-dependent interactions to the binding and repression. Therefore, we conclude that the interaction of CRT with ATF6⍺ likely involves both lectin-dependent and lectin-independent interactions (the latter dependent on the P-domain). Indeed, this hybrid model has previously been suggested as the mode of stable interaction of CRT with other substrates, as cited in the discussion section (Wijeyesakere et al., 2013; PMID: 24100026). CRT is a known calcium-dependent protein, and all the in vitro experiments were conducted in the presence of 1 mM CaCl2. We do not have data from experiments without CaCl2.

      (3) Does the introduction of the reporter system affect the normal BiP (or ATF6) protein levels in the cells? 

      To address this question, we have conducted new experiments comparing endogenous BiP protein levels between the reporter-containing cells and the parental CHO-K1 cells using immunoblotting and an anti-BiP antibody. These data indicate that the reporter system does not affect the endogenous BiP protein levels. This new information has been incorporated as revised Supplemental Figure S1C.

      (4) Does the depletion of CRT affect BiP interaction with ATF6? The absence of CRT may lead to misfolding of glycoproteins and titration of BiP away from ATF6 leading to activation. An indicator of ER stress levels that is independent of ATF6 and IRE1 might be useful. 

      To further assess ER stress levels in CRT-depleted cells, we compared expression levels of endogenous ER resident proteins containing a KDEL signal (e.g., P3H1, GRP94, BiP and PDI) in parental CHO-K1 cells, dual UPR reporter cell lines (XC45-6S) and CRT-depleted cells (CRT∆#2P) under basal conditions and during ER stress by immunoblotting. This comparison confirmed the basal elevation in BiP protein level in cells lacking CRT, consistent with previous findings (Figure 2D) and more broadly the integrity of UPR signalling in cells lacking CRT. In the interest of time, we did not extend the analysis to other branches of the UPR. This new information has been incorporated as Supplemental Figure S5 and in the text of the revised manuscript.

      (5) Does CALR depletion alter ATF6 redox status?

      We thank the reviewer for raising this interesting point. In response, we compared ATF6⍺ redox status in parental and CRT-depleted cells using non-reducing SDS-PAGE. Overall, the redox pattern was similar in parental and CRT-depleted cells, with the detection of two redox forms: an inter-chain disulfide-stabilised dimer and the monomer. Under basal conditions, ATF6⍺ predominantly existed as a monomer, while under ER stress, the monomer band decreased with a corresponding increase in a disulfide-stabilised dimer form in parental cells, as previously reported (Oka et al, 2022; PMID: 35286189). However, under ER stress, CRT-depleted cells showed a significantly higher fraction of monomer versus dimer compared to parental cells. Taken together, these data suggest that the loss of CRT may favour the monomeric form of ATF6α, which is proposed to be more efficiently trafficked (Nadanaka et al, 2007; PMID: 17101776), aligning with our observations that CRT depletion is associated with constitutive activation of ATF6α. These new data have been included as Supplemental Figure S7 and are explained in detail in the results section of the revised manuscript.

      (6) Figure 4C would benefit from some immunoblotting against BiP.

      Although we acknowledge the validity of this suggestion and understand the referee's interest in comparing the amount of CRT in pulldown with that of BiP, the necessity of generating additional samples makes this experiment impractical. Consequently, we opted not to include in our conclusion any comparison regarding the retention of ATF6α by BiP relative to CRT.

      (7) Overlooked requirement of cysteines for ATF6 functionality (Figure 5B). 

      We interpret this comment to refer to the inactivity of the cysteine-free allele of ATF6⍺. Whilst this is a reproducible observation of significance to the structure-activity features of ATF6⍺’s luminal domain, it is less informative in terms of understanding trans-active regulators of ATF6⍺ and was therefore not explored further.

      (8) Without a clear definition of the role of CRT in ATF6 folding, one cannot infer that the observed phenotype is not based on defects in ATF6 "folding" and glycosylation considering the possibility of activation of newly synthesised un-glycosylated ATF6. 

      If the main role of CRT were to assist ATF6⍺ folding, one would expect that depletion of CRT would lead to a non-functional ATF6⍺, resulting in ER retention and less activity. However, our data indicate that the loss of CRT correlates with the constitutive activation of the ATF6⍺ fluorescent reporter and increased Golgi trafficking and processing of ATF6⍺. Therefore, these data suggest that in CRT-depleted cells, the majority of ATF6⍺ is likely to fold to a functional state.

      (9) ATF6 was defined in several studies as a natively unstable protein and shows a close relationship with the ERAD machinery, is the role of CALR also involved in a quality control mechanism for natively unfolded ATF6? 

      The reviewer brings up another valid point. Although we have not closely evaluated the role of CRT in the quality control machinery, we observed that the loss of CRT was not associated with increased levels of ATF6⍺ in CRT-depleted cells under basal conditions compared with parental cells (Fig 3B.1, compare line 1 and line 7; Figure 3B.2, compare line 1 and line 5). If ATF6⍺ were degraded by ERAD and loss of CRT compromised ERAD functionality, CRT-depleted cells should exhibit increased levels of endogenous ATF6⍺. The fact that endogenous ATF6⍺ levels are slightly reduced in CRT-depleted cells does not support a role for CRT in the quality control mechanism for natively unfolded ATF6⍺.

      (10) C618 in ATF6 is located within the BiP binding site and in close proximity to an N-glycosylation site. Is this region of particular importance for CALR binding?

      It is an interesting point that we have not explored in this study. Consequently, without experimental data, we cannot infer the possible implications of C618 in CRT binding.

      (11) The authors have mutated all the N glycosylation sites at once; they should be mutated one by one and the impact on ATF6 stability evaluated independently of the CALR status. 

      We agree that analysing each N-glycosylation site individually would provide further insight into their contributions to ATF6⍺ stability/functionality. However, given the scope of the paper in its present form, we have elected not to address this point.

      (12) The relationship between the absence of CALR and IRE1 remains weak. The authors do not exclude the possibility that CALR could have a direct effect on IRE1 itself. This should be either removed or further investigated. 

      We beg to differ. The relationship between the absence of CRT and IRE1 is not weak: loss of CRT in CHO-K1 cells represses IRE1. We readily concede that the relationship is incompletely understood. ATF6⍺ signalling involves crosstalk with the IRE1 pathway, partly mediated by direct heterodimerisation of N-ATF6⍺ with XBP1s (Yamamoto et al., 2007, 2004). Additionally, recent research has shown that ATF6⍺ activity can repress IRE1 signalling (Walter et al., 2018). Therefore, given that our results indicate that the loss of CRT leads to constitutive activation of ATF6⍺, we suggest that a negative feedback loop in which ATF6⍺ represses IRE1 contributes to the observations made here on the relationship between CRT and IRE1. This does not exclude other aspects of the relationship, a point that is now clarified further in the revised manuscript.

      Minor point 

      In the introduction on page 3 it is mentioned that loss of ATF6 impairs survival in cellular and animal models, this is not completely true as ATF6a ko in mice has no clear deleterious phenotype and only the double ko ATF6a/b has some dramatic impact.

      We have modified that sentence in the revised manuscript.

      Reviewer #2 (Public Review): 

      Summary: 

      In this study, the authors set out to use an unbiased CRISPR/Cas9 screen in CHO cells to identify genes encoding proteins that either increase or repress ATF6 signalling in CHO cells. 

      Strengths: 

      The strengths of the paper include the thoroughness of the screens, the use of a novel, double ATF6/IRE1 UPR reporter cell line, and follow-up detailed experiments on two of the findings in the screens, i.e. FURIN and CRT, to test the validity of involvement of each as direct regulators of ATF6 signalling. Additional strengths are the control experiments that validate the ATF6 specificity of the screens, as well as, for CRT, the finding of focus, determining roles for the glycosylation and cysteines in ATF6 as mechanistically involved in how CRT represses ATF6, at least in CHO cells. 

      We thank the reviewer for their favourable assessment of our work.  

      Weaknesses: 

      (1) The weaknesses of the paper are that the authors did not describe why they focused only on the top 100 proteins in each list of ATF6 activators and repressors. 

      We concede that the more genes one studies the better. However, in whole genome CRISPR screens where thousands of hits arise, it is common practice for researchers to prioritise candidates with the greatest significance, as those genes are likely to have a more meaningful impact on the phenotype under investigation. Therefore, our decision to focus on the top 100 genes was based on a desire to identify the most prominent and potentially impactful candidates for further analysis, ensuring a manageable scope for in-depth study while maintaining a measure of relevance and significance. Moreover, setting the threshold at 100 hits to perform GEO enrichment analysis is a practice used by previous researchers (PMID: 30323222; PMID: 37251921). In our case, the top 100 hits included the genes with an adjusted P < 0.005. For interested readers, the full ranked list is accessible in the GEO databank (GSE254745) and as supplemental Table S1.

      (2) Additionally, there were a few methodology items missing, such as the location of the insertion site of the XBP1::mCherry reporter in the CHO cell genome. Since the authors go to great lengths to insert the other reporter for ATF6 activation in a "safe harbor" location, it leads to questions about whether the XBP1::mCherry reporter insertion is truly innocuous.

      We appreciate the opportunity to clarify certain aspects of our experimental procedures. In order to generate a double UPR reporter cell line, we employed the previously established XC45 CHO-K1 clone with an integrated XBP1s::mCherry reporter (Harding et al., 2019; PMID: 31749445). Since the ROSA26 safe harbor locus was available in the XC45 CHO-K1 cell line, we integrated the ATF6⍺ reporter there. To provide further clarity, the revised manuscript includes additional details in the Methods section regarding the creation of the XBP1 reporter.

      (3) An additional weakness is that the evidence for the physical interaction between ATF6LD and CRT is not strong, being dependent mainly on a single IP/IB experiment in Figure 4C that comprises only 1 lane on the gel for each of the test cases. Moreover, while that figure suggests that the interaction between CRT and ATF6 is decreased by mutating out the glycosylation sites in the ATF6LD, the BLI experiment in the same figure, 4B, suggests that there are no differences in the affinities of CRT for ATF6LD WT, deltaGly and deltaCys. 

      We would like to highlight that in the IP/IB experiments (see Figure 4C), where wildtype ATF6 (ATF6⍺_LDWT) and GFP-ATF6_LD∆Gly were transiently transfected, GFP-ATF6_LD∆Gly was expressed at lower levels than ATF6⍺_LDWT. This lower expression level might explain why CRT is more prominently immunoprecipitated with ATF6⍺_LDWT and could account for the differences observed between the in vitro and in vivo assays.

      (4) An additional detail is that I found Figure 6A to be difficult to interpret, and that 6B was required in order for me to best evaluate the points being made by the authors in this figure. 

      We have simplified Figure 6A in the revised manuscript to make it more interpretable by focussing the reader’s attention on the transfected population. 

      Overall, I believe that this work will positively impact the field as it provides a list of potential regulators of ATF6 activation and repression that others will be able to use as a launch point for discovering such interactions in cells and tissues of interest beyond CHO cells. However, I agree with the authors that these findings were in CHO cell lines and that it is possible, if not likely, that some of the interactions they found will be cell type/line specific.

      We accept this point and re-emphasize the qualification that our conclusions cannot be glibly extrapolated to other cell lines.

    1. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews: 

      Reviewer #1 (Public Review): 

      The goal of the current study was to evaluate the effect of neuronal activity on blood-brain barrier permeability in the healthy brain, and to determine whether changes in BBB dynamics play a role in cortical plasticity. The authors used a variety of well-validated approaches to first demonstrate that limb stimulation increases BBB permeability. Using in vivo-electrophysiology and pharmacological approaches, the authors demonstrate that albumin is sufficient to induce cortical potentiation and that BBB transporters are necessary for stimulus-induced potentiation. The authors include a transcriptional analysis and differential expression of genes associated with plasticity, TGF-beta signaling, and extracellular matrix were observed following stimulation. Overall, the results obtained in rodents are compelling and support the authors' conclusions that neuronal activity modulates the BBB in the healthy brain and that mechanisms downstream of BBB permeability changes play a role in stimulus-evoked plasticity. These findings were further supported with fMRI and BBB permeability measurements performed in healthy human subjects performing a simple sensorimotor task. There is literature to suggest that there are sex differences in BBB dysfunction in pathophysiological conditions and the authors have acknowledged the use of only males as a minor limitation of the study that should be addressed in the future. Future studies should also test whether the upregulation of OAT3 plays a role in cortical plasticity observed following stimulation. Overall, this study provides novel insights into how neurovascular coupling, BBB permeability, and plasticity interact in the healthy brain. 

      Reviewer #2 (Public Review): 

      Summary: 

      This study builds upon previous work that demonstrated that brain injury results in leakage of albumin across the blood brain barrier, resulting in activation of TGF-beta in astrocytes. Consequently, this leads to decreased glutamate uptake, reduced buffering of extracellular potassium and hyperexcitability. This study asks whether such a process can play a physiological role in cortical plasticity. They first show that stimulation of a forelimb for 30 minutes in a rat results in leakage of the blood brain barrier and extravasation of albumin on the contralateral but not ipsilateral cortex. The authors propose that the leakage is dependent upon neuronal excitability and is associated with an enhancement of excitatory transmission. Inhibiting the transport of albumin or the activation of TGF-beta prevents the enhancement of excitatory transmission. In addition, gene expression associated with TGF-beta activation, synaptic plasticity and extracellular matrix are enhanced on the "stimulated" hemisphere. That this may translate to humans is demonstrated by a break down in the blood brain barrier following activation of brain areas through a motor task. 

      Strengths: 

      This study is novel and the results are potentially important as they demonstrate an unexpected break down of the blood brain barrier with physiological activity and this may serve a physiological purpose, affecting synaptic plasticity. 

      The strengths of the study are: 

      (1) The use of an in vivo model with multiple methods to investigate the blood brain barrier response to a forelimb stimulation. 

      (2) The determination of a potential functional role for the observed leakage of the blood brain barrier from both a genetic and electrophysiological view point 

      (3) The demonstration that inhibiting different points in the putative pathway, from activation of the cortex to transport of albumin and activation of the TGF-beta pathway, prevented the synaptic enhancement.

      (4) Preliminary experiments demonstrating a similar observation of activity dependent break down of the blood brain barrier in humans.

      Weaknesses: 

      The authors adequately addressed most of my points. A few remain: 

      (1) Although the authors have addressed the possible effects of anaesthesia on neuro-vascular coupling, they have not mentioned or addressed the possible effects of ketamine (an NMDA receptor antagonist) on synaptic plasticity. Indeed, the low percentage of SEP increase following potentiation (10-20%) could perhaps be explained by partial block of NMDA receptors by ketamine.

      We agree and apologize for this oversight. This important issue is now addressed in the Discussion.

      “Notably, the antagonistic effect of ketamine on NMDA receptors might attenuate the magnitude of SEP potentiation recorded in our experiments (Anis et al., 1983; Salt et al., 1988).”

      (2) The experimental paradigms remain unclear to me. Now, it appears that drugs are applied for 50 minutes and that the stimulation occurs during the "washout period". The more conventional approach would be to have the drug application during the stimulation period to determine if the drugs occlude or enhance the effects of stimulation and then washout the drugs. The problem is that drugs variably washout at different rates depending upon their lipid solubility.

      We agree that the more conventional approach would have been to continue applying the drug throughout the experiment and that differential rates of washout may add variability to our experiments. However, despite this limitation, within each treatment group we found that the SEP response at 50 minutes (immediately after the drug application window) does not differ from SEP response at 80 minutes (after 30 minutes of stimulation and washout) [Figure 3H&G]. This suggests that the drug effects were still present despite terminating drug application and performing potentiation-inducing stimulation. Moreover, our analysis showed that animals within each treatment group (except AP5) had similar SEP responses with little intra-group variability.

      (3) It is still not clear to what extent the experimenters and those doing the analysis were blinded to group. If one or both were blind to group, then please put this in the methods.

      Thank you for this comment. We revised the Methods section to clearly confirm that data was collected and analyzed blindly.  

      Reviewer #3 (Public Review): 

      Summary: 

      This study used prolonged stimulation of a limb to examine possible plasticity in somatosensory evoked potentials induced by the stimulation. They also studied the extent that the blood brain barrier (BBB) was opened by the prolonged stimulation and whether that played a role in the plasticity. They found that there was potentiation of the amplitude and area under the curve of the evoked potential after prolonged stimulation and this was long-lasting (>5 hrs). They also implicated extravasation of serum albumin, caveolae-mediated transcytosis, and TGFb signalling, as well as neuronal activity and upregulation of PSD95. Transcriptomics was done and implicated plasticity related genes in the changes after prolonged stimulation, but not proteins associated with the BBB or inflammation. Next, they address the application to humans using a squeeze ball task. They imaged the brain and suggest that the hand activity led to an increased permeability of the vessels, suggesting modulation of the BBB. 

      Strengths: 

      The strengths of the paper are the novelty of the idea that stimulation of the limb can induce cortical plasticity in a normal condition, and it involves opening of the BBB with albumin entry. In addition, there are many datasets and both rat and human data. 

      Weaknesses: 

      The conclusions are not compelling however because of a lack of explanation of methods.

      In the revised paper, we added a section titled ‘study design’ that presents an overview of the experimental approach.

      The explanation of why prolonged stimulation in the rat was considered relevant to normal conditions should be as clear in the paper as it is in the rebuttal.

      We added a new paragraph to the Discussion section explaining this point as we did in the rebuttal:  

      “Our animal experiments show that a 30 min limb stimulation (at 6Hz and 2mA) increases cross-BBB influx, while a 1 min stimulation (of similar frequency and magnitude) does not. We believe that both types of stimulations fall within the physiological range because our continuous electrophysiological recordings showed no signs of epileptiform or otherwise pathological activity. Moreover, the recorded SEP levels were similar to those reported in previous physiological LTP studies in rats (Eckert & Abraham, 2010; Han et al., 2015; Mégevand et al., 2009) and humans (McGregor et al., 2016). In humans, skill acquisition often involves motor training sessions that last ≥30 minutes (Bengtsson et al., 2005; Classen et al., 1998) and result in physiological plasticity of sensory and motor systems (Classen et al., 1998; Draganski et al., 2004; Sagi et al., 2012). Hence, the experimental task in our human study (30 minutes of repetitive squeezing of an elastic stress-ball) is likely to represent physiological activity, with neuronal activation in primarily motor and sensory areas (Halder et al., 2005). Future human and animal studies are needed to explore the BBB modulating effects of additional stimulation protocols – with varying durations, frequencies, and magnitudes. Such studies may also elucidate the temporal and ultrastructural characteristics that differentiate between physiological and pathological BBB modulation.”

      The authors need to ensure other aspects of the rebuttal are as clear in the paper as in the rebuttal too. 

      Thank you for this comment. This was addressed in the revised paper.

      The only remaining concern that is significant is that it is hard to understand the figures. 

      Thank you for this comment. We revised the figures according to the reviewer’s recommendations. We hope that these changes increase the legibility of the figures. 

      Reviewer #3 (Recommendations For The Authors): 

      The manuscript is improved but there are still suggestions that do not appear to have been addressed. More experiments are not involved in addressing these concerns but one wants the paper to be clarified in terms of what was done. 

      Figures. Please use arrows to point to the effect that the reader should see. Please note what the main point is. 

      Major concerns: 

      Please add explanations, exact p values, and other revisions in the rebuttal to the paper. 

      Rebuttal explanations were added to the paper and p values appear in figure legends.

      Fig 1d shows a seizure-like event which the authors don't think is a seizure because it lacks a depolarization shift. This explanation is not convincing because a LFP would not necessarily show a depolarization shift. Another argument or a discussion of the event as a seizure is warranted. Note that expanding the trace might also show it is unlike a seizure. Regarding the idea that 6Hz 2 mA stimuli for 30 min are physiological, the authors make three arguments which are not clear. First, no epileptiform activity was found, but in Fig. 1 it looks like a seizure occurred. Second, memory and skill acquisition in humans often involve a similar training duration - but what about 6Hz 2 mA?

      Rats are known to rhythmically move their whiskers at frequencies ranging between 5 and 15 Hz (Mégevand et al., 2009). We agree that there is no clear way to justify the similarity between the experimental design in humans and rats. However, we believe that both paradigms (paw stimulation in rats and ball squeeze in humans) represent non-pathological input that we found to modulate barrier permeability. This argument was added to the discussion of the paper:

      “We believe that both types of stimulations fall within the physiological range because in rats, activity between 5–15 Hz represents physiological rhythmic whisker movement during environment exploration (Mégevand et al., 2009).”

      Seizures are typically induced in rats via direct tetanic stimulation of the brain (at 50 Hz and 0.3-2.5mA) or maximal electroshock test to the cornea (at 50 Hz and 150 mA) (Swinyard et al., 1952). We, therefore, assert that the activity we observe represents physiological responses and not seizures. This argument is beyond the scope of the current paper. 

      Please note a limitation is that the high level of serum albumin is unlikely to be physiological but may not have been as high in the animal because of the low diffusion rate and degradation (please add the refs in the rebuttal). 

      Thank you, we added the following to the Results section: 

      “The relatively high concentration of albumin was chosen to account for factors that lower its effective tissue concentration such as its low diffusion rate and its likelihood to encounter a degradation site or a cross-BBB efflux transporter (Tao & Nicholson, 1996; Zhang & Pardridge, 2001).”

      Fig. 1. 

      Please consider a box in b to show where the expanded traces in the lower row came from. 

      Thank you for the suggestion. We added lines indicating where the trace excerpts were taken from.

      c. Please use arrows to point to the parts that the authors want the reader to note. In the legend, explain what t is, and delta HbT.

      Thank you. We implemented this suggestion.

      d. It is not clear what the double-sided arrows are meant to show compared to the arrow without two sides. 

      We replaced the two-headed arrow with two single ones.

      e. Please explain what the upward lines at the top signify. What does the red asterisk mean? 

      Thank you. We implemented this suggestion.

      f. Is the reader supposed to note the yellow area? Please make it with an arrow or circle if so. 

      Thank you, we added a white circle to mark the area of tracer accumulation.

      g. Please explain what the permeability index is or reference the part of the paper that does. 

      Further to this suggestion, we added a reference to the appropriate methods section to the legend.

      h. Please use arrows to point to the area of interest. 

      Thank you. We implemented this suggestion.

      m-n. Please mark areas of interest with arrows.  m. the top right two images are unclear. I suggest making them say ipsi inset and contra inset instead of using asterisks. 

      Thank you. We added the ipsi and contra labels to panels in m. The images in panel n represent a phenomenon with no particular region of interest, but rather peri-vascular tracer accumulation along the entire depicted blood vessel. We clarified that panel n represents a separate experiment from panel m: “n. In an animal injected with both EB and NaFlu post stimulation, fluorescence imaging shows extravascular accumulation of both tracers along a cortical small vessel in the stimulated hemisphere.”

      Figure 2. 

      (2) a. Middle. What are the vertical lines at the top? The rebuttal states that was explained in the revised legends but I don't see it. 

      Our apologies. We now included an explanation that “an excerpt of the stimulation trace is shown above the middle LFP trace”.

      c and d are very different field potentials in shape and therefore hard to compare. The rebuttal addresses this but the explanation is not in the revised text. 

      We agree that there is variability in SEP responses between animals. We now added a statement acknowledging this in the methods section: “To overcome potential variability in SEP morphology between animals (Mégevand et al., 2009), each animal’s plasticity measures (max amplitude and AUC of post stimulation SEP) were compared to the same measures at baseline.” 
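      The within-animal comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' analysis code: the function names, the rectification of the trace before integration, the trapezoidal rule, and the sample values are all assumptions.

```python
from typing import Dict, Sequence


def max_amplitude(trace: Sequence[float]) -> float:
    """Largest absolute deflection of an SEP trace (assumed definition)."""
    return max(abs(v) for v in trace)


def area_under_curve(trace: Sequence[float], dt: float) -> float:
    """Trapezoidal area under the rectified trace; dt is the sample interval."""
    rectified = [abs(v) for v in trace]
    return sum((a + b) / 2.0 * dt for a, b in zip(rectified, rectified[1:]))


def percent_of_baseline(
    baseline: Sequence[float], post: Sequence[float], dt: float
) -> Dict[str, float]:
    """Express post-stimulation measures relative to the same animal's baseline."""
    return {
        "max_amplitude_pct": 100.0 * max_amplitude(post) / max_amplitude(baseline),
        "auc_pct": 100.0 * area_under_curve(post, dt) / area_under_curve(baseline, dt),
    }


# Hypothetical traces (arbitrary units), invented for illustration only.
baseline_trace = [0.0, 1.0, 2.0, 1.0, 0.0]
post_trace = [0.0, 1.5, 3.0, 1.5, 0.0]
result = percent_of_baseline(baseline_trace, post_trace, dt=1.0)
# Both measures come out at 150% of baseline for these made-up traces.
```

      Because each animal serves as its own reference, differences in SEP shape between animals cancel out of the ratio, which is the point of the normalization.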

      In d, it is not clear there is potentiation because the traces are not aligned. 

      All panels depicting SEP traces represent raw data with no alignment. The shift observed in panel d exemplifies why we compare post-stimulation parameters of max amplitude and area under curve to baseline in each animal. 

      Exact P values are said to have been added in the rebuttal but they were not. 

      Exact P values appear in Figure legends.

      (3) b. Use arrows to mark the area of interest. 

      Thank you. We added a white circle to mark the area of tracer accumulation similar to Figure 1f.

      d. Why is there an oscillation superimposed on all traces except CNQX? 

      We agree that this is an interesting question. Future studies should determine the source of this SEP pattern.   

      (4) What does the line and the number 2 mean? How were data normalized? What was counted? What area of cortex?

      The number 2 refers to the scale bar: the length of the bar corresponds to a log fold change of 2.

      The plot shows the log fold change against the mean count of each gene in the contralateral somatosensory cortex between 1 and 24 hours after stimulation.

      The x axis title was changed to “mean expression” and the legend was modified to:

      “Scatter plot of gene expression from RNA-seq in the contralateral somatosensory cortex 24 vs. 1 h after 30 min stimulation. The y axis represents the log fold change, and the x axis represents the mean expression levels (see methods, RNA Sequencing & Bioinformatics). Blue dots indicate statistically significant differentially expressed genes (DEGs) by Wald Test (n=8 rats per group).”
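      For readers unfamiliar with this type of plot, the two axes can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name, the pseudocount, the use of log base 2, and the example counts are assumptions, and the sketch does not reproduce the normalization or the Wald test mentioned in the legend.

```python
import math
from typing import Sequence, Tuple


def ma_coordinates(
    counts_1h: Sequence[float],
    counts_24h: Sequence[float],
    pseudocount: float = 1.0,
) -> Tuple[float, float]:
    """Return (mean expression, log2 fold change of 24 h vs. 1 h) for one gene.

    The pseudocount (an assumption) avoids taking the log of zero.
    """
    mean_1h = sum(counts_1h) / len(counts_1h)
    mean_24h = sum(counts_24h) / len(counts_24h)
    mean_expression = (mean_1h + mean_24h) / 2.0  # x axis
    log_fold_change = math.log2((mean_24h + pseudocount) / (mean_1h + pseudocount))  # y axis
    return mean_expression, log_fold_change


# Hypothetical normalized counts for a single gene, invented for illustration.
x, y = ma_coordinates([3.0, 3.0], [15.0, 15.0])
# Here y is 2, i.e. the length of the scale bar described in the response above.
```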

      How were the pericytes, smooth muscle cells, etc. distinguished?

      This was explained under Methods->RNA Sequencing & Bioinformatics: “Analysis of cell-specific and vascular zonation genes was performed as described (Vanlandewijck et al., 2018), using the database provided in (http://betsholtzlab.org/VascularSingleCells/database.html).”

      What were the chi square statistics? If there were cells used instead of rats, please justify. 

      Thank you. The legend was expanded to include the following:

      “The contralateral somatosensory cortex was found to have a significantly higher number of DEGs related to synaptic plasticity, than the ipsilateral side (***p<0.001, Chi-square).”     

      (5) b. what do the icons mean? 

      We agree that the icons were confusing. We simplified this panel to just show when participants were asked to squeeze the ball (black icon). This explanation was added to the Figure legend.

      Abbreviations? 

      Abbreviations of MRI protocols were added to the figure legend for clarity.

      In c-e what are the units of measure? Fold-change? 

      The units represent t-statistics values for each voxel. The label ‘t-statistic’ was added to the figure.  

      What are the white lines, + and - signs?

      The white lines point to voxels of highest activation (t-statistic). This was added to the legend.

      These are not +/- signs; they are voxels with significant activation that only appear similar to such signs.

      f. Please explain f and g for clarity. 

      Thank you. The explanation was modified for added clarity.

      Supplemental Fig. 4. 

      Original question: If ipsilateral and contralateral showed many changes why do the authors think the effects were only contralateral? 

      The authors replied: Our gene analysis was designed to complement our in vivo and histological findings, by assessing the magnitude of change in differentially expressed genes (DEGs). This analysis showed that: (1) the hemisphere contralateral to the stimulus has significantly more DEGs than the ipsilateral hemisphere; and (2) the DEGs were related to synaptic plasticity and TGF-b signaling. These findings strengthen the hypothesis raised by our in vivo and histological experiments. 

      Could the authors clarify the answer to the question in the text? 

      Thank you. This section was added to the Discussion. 

      Papers referenced in this letter:

      Anis, N. A., Berry, S. C., Burton, N. R., & Lodge, D. (1983). The dissociative anaesthetics, ketamine and phencyclidine, selectively reduce excitation of central mammalian neurones by N-methyl-aspartate. British Journal of Pharmacology, 79(2), 565–575. https://doi.org/10.1111/j.1476-5381.1983.tb11031.x

      Bengtsson, S. L., Nagy, Z., Skare, S., Forsman, L., Forssberg, H., & Ullén, F. (2005). Extensive piano practicing has regionally specific effects on white matter development. Nature Neuroscience, 8(9), 1148–1150. https://doi.org/10.1038/nn1516

      Classen, J., Liepert, J., Wise, S. P., Hallett, M., & Cohen, L. G. (1998). Rapid plasticity of human cortical movement representation induced by practice. Journal of Neurophysiology, 79(2), 1117–1123. https://doi.org/10.1152/jn.1998.79.2.1117

      Draganski, B., Gaser, C., Busch, V., Schuierer, G., Bogdahn, U., & May, A. (2004). Changes in grey matter induced by training. Nature, 427(6972), 311–312. https://doi.org/10.1038/427311a

      Eckert, M. J., & Abraham, W. C. (2010). Physiological effects of enriched environment exposure and LTP induction in the hippocampus in vivo do not transfer faithfully to in vitro slices. Learning and Memory, 17(10), 480–484. https://doi.org/10.1101/lm.1822610

      Halder, P., Sterr, A., Brem, S., Bucher, K., Kollias, S., & Brandeis, D. (2005). Electrophysiological evidence for cortical plasticity with movement repetition. European Journal of Neuroscience, 21(8), 2271–2277. https://doi.org/10.1111/j.1460-9568.2005.04045.x

      Han, Y., Huang, M. De, Sun, M. L., Duan, S., & Yu, Y. Q. (2015). Long-term synaptic plasticity in rat barrel cortex. Cerebral Cortex, 25(9), 2741–2751. https://doi.org/10.1093/cercor/bhu071

      McGregor, H. R., Cashaback, J. G. A., & Gribble, P. L. (2016). Functional Plasticity in Somatosensory Cortex Supports Motor Learning by Observing. Current Biology, 26(7), 921–927. https://doi.org/10.1016/j.cub.2016.01.064

      Mégevand, P., Troncoso, E., Quairiaux, C., Muller, D., Michel, C. M., & Kiss, J. Z. (2009). Long-term plasticity in mouse sensorimotor circuits after rhythmic whisker stimulation. Journal of Neuroscience, 29(16), 5326–5335. https://doi.org/10.1523/JNEUROSCI.5965-08.2009

      Sagi, Y., Tavor, I., Hofstetter, S., Tzur-Moryosef, S., Blumenfeld-Katzir, T., & Assaf, Y. (2012). Learning in the Fast Lane: New Insights into Neuroplasticity. Neuron, 73(6), 1195–1203. https://doi.org/10.1016/j.neuron.2012.01.025

      Salt, T. E., Wilson, D. G., & Prasad, S. K. (1988). Antagonism of N-methylaspartate and synaptic responses of neurones in the rat ventrobasal thalamus by ketamine and MK-801. British Journal of Pharmacology, 94(2), 443–448. https://doi.org/10.1111/j.1476-5381.1988.tb11546.x

      Swinyard, E. A., Brown, W. C., & Goodman, L. S. (1952). Comparative assays of antiepileptic drugs in mice and rats. The Journal of Pharmacology and Experimental Therapeutics, 106(3), 319–330. http://jpet.aspetjournals.org/content/106/3/319.abstract

      Tao, L., & Nicholson, C. (1996). Diffusion of albumins in rat cortical slices and relevance to volume transmission. Neuroscience, 75(3), 839–847. https://doi.org/10.1016/0306-4522(96)00303-X

      Vanlandewijck, M., He, L., Mäe, M. A., Andrae, J., Ando, K., Del Gaudio, F., Nahar, K., Lebouvier, T., Laviña, B., Gouveia, L., Sun, Y., Raschperger, E., Räsänen, M., Zarb, Y., Mochizuki, N., Keller, A., Lendahl, U., & Betsholtz, C. (2018). A molecular atlas of cell types and zonation in the brain vasculature. Nature, 554(7693), 475–480. https://doi.org/10.1038/nature25739

      Zhang, Y., & Pardridge, W. M. (2001). Mediated efflux of IgG molecules from brain to blood across the blood–brain barrier. Journal of Neuroimmunology, 114(1–2), 168–172. https://doi.org/10.1016/S0165-5728(01)00242-9

    2. eLife assessment

      This study builds upon previous work which demonstrated that brain injury results in the entry of a protein called albumin into the brain which then causes diverse effects. The present study shows that prolonged stimulation of a forelimb in a rat leads to albumin entry, and is associated with effects that suggest plasticity is enhanced in the stimulated side of the brain. The strength of evidence was convincing and results are important because they suggest a previously-considered pathological process may be relevant to the normal brain and have benefits.

    3. Reviewer #3 (Public Review):

      Summary:

      This study used prolonged stimulation of a limb to examine possible plasticity in somatosensory evoked potentials induced by the stimulation. They also studied the extent that the blood brain barrier (BBB) was opened by the prolonged stimulation and whether that played a role in the plasticity. They found that there was potentiation of the amplitude and area under the curve of the evoked potential after prolonged stimulation and this was long-lasting (>5 hrs). They also implicated extravasation of serum albumin, caveolae-mediated transcytosis, and TGFb signalling, as well as neuronal activity and upregulation of PSD95. Transcriptomics was done and implicated plasticity related genes in the changes after prolonged stimulation, but not proteins associated with the BBB or inflammation. Next, they address the application to humans using a squeeze ball task. They imaged the brain and suggested that the hand activity led to an increased permeability of the vessels, suggesting modulation of the BBB.

      Strengths:

      The strengths of the paper are the novelty of the idea that stimulation of the limb can induce cortical plasticity in a normal condition, and it involves the opening of the BBB with albumin entry. In addition, there are many datasets, both rat and human data.

      Weaknesses:

      The explanation of why prolonged stimulation in the rat was considered relevant to normal conditions is still somewhat weak. The authors argue that the stimulation frequency they used is similar to rhythmic whisker movement. That is a good argument. However, the intensity they used, 2 mA is in the range they say can elicit a seizure if stimulation is 50 Hz. So that weakens the argument.

      The authors made a lot of the requested changes but some questions were not addressed or the explanations were so brief that the confusion remained. Please go over the revisions again and make sure sentences are complete, jargon is explained, and arguments/justifications are clear. It will help the reader greatly.

      The authors responded to the previous comments of Reviewer 2 regarding experimental design and variability of washout periods. It would be useful to incorporate the response into the paper so the readers know why the authors think the variability was not an important factor in the results.

      Comments on the revised version:

      The manuscript is improved.

    1. eLife assessment

      This study provides an important cell type atlas of the gill of the mussel Gigantidas platifrons using a single nucleus RNA-seq dataset, a resource for the community of scientists studying deep sea physiology and metabolism and intracellular host-symbiont relationships. The evidence supporting the conclusions is convincing with high-quality single-nucleus RNA sequencing and transplant experiments. This work will be of broad relevance for scientists interested in host-symbiont relationships across ecosystems.

    2. Reviewer #1 (Public Review):

      Wang, He et al have constructed a comprehensive single nucleus atlas for the gills of the deep sea Bathymodioline mussels, which possess intracellular symbionts that provide a key source of carbon and allow them to live in these extreme environments. They provide annotations of the different cell states within the gills, shedding light on how multiple cell types cooperate to give rise to the emergent functions of the composite tissues and the gills as a whole. They pay special attention to characterizing the bacteriocyte cell populations and identifying sets of genes that may play a role in their interaction with the symbiotes.

      Wang, He et al sample mussels from 3 different environments: animals from their native methane rich environment, animals transplanted to a methane-poor environment to induce starvation and animals that have been starved in the methane-poor environment and then moved back to the methane-rich environment. They demonstrated that starvation had the biggest impact on bacteriocyte transcriptomes. They hypothesize that the up-regulation of genes associated with lysosomal digestion leads to the digestion of the intracellular symbiont during starvation, while the non-starved and reacclimated groups more readily harvest the nutrients from symbiotes without destroying them. Further work exploring the differences in symbiote populations between ecological conditions will further elucidate the dynamic relationship between host and symbiote. This will help disentangle specific changes in transcriptomic state that are due to their changing interactions with the symbiotes from changes associated with other environmental factors.

      This paper makes available a high quality dataset that is of interest to many disciplines of biology. The unique qualities of this non-model organism and collection of conditions sampled make it of special interest to those studying deep sea adaptation, the impact of environmental perturbation on Bathymodioline mussels populations, and intracellular symbiotes. The authors also use a diverse array of tools to explore and validate their data.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review): 

      Wang, He et al have constructed a comprehensive single nucleus atlas for the gills of the deep sea Bathymodioline mussels, which possess intracellular symbionts that provide a key source of carbon and allow them to live in these extreme environments. They provide annotations of the different cell states within the gills, shedding light on how multiple cell types cooperate to give rise to the emergent functions of the composite tissues and the gills as a whole. They pay special attention to characterizing the bacteriocyte cell populations and identifying sets of genes that may play a role in their interaction with the symbiotes.

      Wang, He et al sample mussels from 3 different environments: animals from their native methane rich environment, animals transplanted to a methane-poor environment to induce starvation and animals that have been starved in the methane-poor environment and then moved back to the methane-rich environment. They demonstrated that starvation had the biggest impact on bacteriocyte transcriptomes. They hypothesize that the up-regulation of genes associated with lysosomal digestion leads to the digestion of the intracellular symbiont during starvation, while the non-starved and reacclimated groups more readily harvest the nutrients from symbiotes without destroying them. Further work exploring the differences in symbiote populations between ecological conditions will further elucidate the dynamic relationship between host and symbiote. This will help disentangle specific changes in transcriptomic state that are due to their changing interactions with the symbiotes from changes associated with other environmental factors. 

      This paper makes available a high quality dataset that is of interest to many disciplines of biology. The unique qualities of this non-model organism and collection of conditions sampled make it of special interest to those studying deep sea adaptation, the impact of environmental perturbation on Bathymodioline mussels populations, and intracellular symbiotes. The authors also use a diverse array of tools to explore and validate their data. 

      Reviewer #2 (Public Review): 

      Wang, He et al. shed insight into the molecular mechanisms of deep-sea chemosymbiosis at the single-cell level. They do so by producing a comprehensive cell atlas of the gill of Gigantidas platifrons, a chemosymbiotic mussel that dominates the deep-sea ecosystem. They uncover novel cell types and find that the gene expression of bacteriocytes, the symbiont-hosting cells, supports two hypotheses of host-symbiont interactions: the "farming" pathway, where symbionts are directly digested, and the "milking" pathway, where nutrients released by the symbionts are used by the host. They perform an in situ transplantation experiment in the deep sea and reveal transitional changes in gene expression that support a model where starvation stress induces bacteriocytes to "farm" their symbionts, while recovery leads to the restoration of the "farming" and "milking" pathways. 

      A major strength of this study includes the successful application of advanced single nucleus techniques to a non-model, deep sea organism that remains challenging to sample. I also applaud the authors for performing an in situ transplantation experiment in a deep sea environment. From gene expression profiles, the authors deftly provide a rich functional description of G. platifrons cell types that is well-contextualized within the unique biology of chemosymbiosis. These findings offer significant insight into the molecular mechanisms of deep-sea host-symbiont ecology, and will serve as a valuable resource for future studies into the striking biology of G. platifrons. 

      The authors' conclusions are generally well-supported by their results. However, I recognize that the difficulty of obtaining deep-sea specimens may have impacted experimental design and no replicates were sampled. 

      It is notable that the Fanmao cells were much more sparsely sampled. It appears that fewer cells were sequenced, resulting in the Starvation and Reconstitution conditions having 2-3x more cells after doublet filtering. These discrepancies also are reflected in the proportion of cells that survived QC, suggesting a distinction in quality or approach. However, the authors provide clear and sufficient evidence via bootstrapping that batch effects between the three samples are negligible. While batch effect does not appear to have affected gene expression profiles, the proportion of cell types may remain sensitive to sampling techniques, and thus interpretation of Fig. S12 must be approached with caution. 

      Reviewer #3 (Public Review): 

      Wang et al. explored the unique biology of the deep-sea mussel Gigantidas platifrons to understand fundamental principles of animal-symbiont relationships. They used single-nucleus RNA sequencing, together with validation and visualization of many of the important cellular and molecular players that allow these organisms to survive in the deep sea. They demonstrate a diversity of cell types that support the structure and function of the gill, including bacteriocytes, specialized epithelial cells that host sulfur-oxidizing or methane-oxidizing symbionts, as well as a suite of other cell types including supportive cells, ciliary cells, and smooth muscle cells. By transplanting mussels from a methane-rich habitat to methane-limited environments, the authors showed that starved mussels may consume endosymbionts, whereas mussels in methane-rich environments upregulated genes involved in glutamate synthesis. These data add to the growing body of literature showing that organisms control their endosymbionts in response to environmental change.

      The conclusions of the data are well supported. The authors adapted a technique that would have been technically impossible in their field environment by preserving the tissue and then performing nuclear isolation after the fact. The use of single-nucleus sequencing opens the possibility of new cellular and molecular biology that is not possible to study in the field. Additionally, the in-situ data (both WISH and FISH) are high-quality and easy to interpret. The use of cell-type-specific markers along with a symbiont-specific probe was effective. Finally, the SEM and TEM were used convincingly for specific purposes in the case of showing the cilia that may support water movement. 

      The one particular area for future exploration surrounds the concept of a proliferative progenitor population within the gills. The authors recover molecular markers for these putative populations and additional future work will uncover if these are indeed proliferative cells that contribute to symbiont colonization.

      Overall the significance of this work is identifying the relationship between symbionts and bacteriocytes and how these host bacteriocytes modulate their gene expression in response to environmental change. It will be interesting to see how similar or different these data are across animal phyla. For instance, the work on symbiosis in cnidarians may converge on similar principles, or there may be independent ways in which organisms have been able to solve these problems.

      We extend our sincere gratitude to all the reviewers for their positive comments and kind words. We highly value the substantial efforts they made in helping us improve and enhance our manuscript. Additionally, we thank the reviewers for pointing out the limitations of our current study, which will guide us in improving our future research.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      This study system is so interesting and this is a truly unique and exciting dataset. Most of my suggestions are aimed at improving readability and making it more accessible for a broader audience, since I predict many fields will find it interesting. 

      Line 60: which species of mussel? Is this the same one? 

      We appreciate the comments from the reviewer. The reference here is to deep-sea bathymodiolin mussels, which, in most cases, possess enlarged gill filaments that accommodate symbionts.

      Line 237-230: citation of previous findings missing 

      We appreciate the comments from the reviewer. After carefully reviewing these paragraphs, we believe that all the previous findings have now been properly cited.

      Line 256: it might be a good idea to give a brief description of what slingshot analysis is here 

      We appreciate the comments from the reviewer. We have revised the corresponding part of our manuscript to make it clear.

      This part of the manuscript now reads: “We performed Slingshot analysis, which uses a cluster-based minimum spanning tree (MST) and a smoothed principal curve to determine the developmental path of cell clusters. The result shows that the PEBZCs might be the origin of all gill epithelial cells, including the other two proliferation cells (VEPC and DEPC) and bacteriocytes (Supplementary Fig. S6).” Lines 203-207 of the revised manuscript.
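The cluster-based MST step mentioned in this response can be illustrated with a minimal sketch. It is not the Slingshot implementation (Slingshot is an R package and also fits smoothed principal curves, which are omitted here). The function name `cluster_mst_edges`, the use of plain Euclidean distance between cluster centroids, and the toy coordinates are all assumptions made for illustration.

```python
from math import dist


def cluster_mst_edges(centroids):
    """Return the edges of a minimum spanning tree over cluster centroids.

    `centroids` is a list of coordinate tuples, one per cell cluster
    (hypothetical input; plain Euclidean distance is an assumption).
    Edges are returned as sorted (i, j) index pairs with i < j.
    """
    n = len(centroids)
    if n == 0:
        return []
    in_tree = {0}  # Prim's algorithm, grown from cluster 0
    edges = []
    while len(in_tree) < n:
        # Pick the shortest edge linking the tree to a cluster outside it.
        _, i, j = min(
            (dist(centroids[i], centroids[j]), i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        edges.append((min(i, j), max(i, j)))
        in_tree.add(j)
    return sorted(edges)


# Toy example: three clusters lying along a line.
print(cluster_mst_edges([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]))
```

In a lineage analysis, the tree's edges give candidate paths between clusters; choosing a root cluster (here, the authors propose the PEBZCs) then orients those paths.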

      Line 289: Wording is a bit confusing- what is meant by morphological analysis?

      We acknowledge that our wording might be a bit confusing here. We are referring to the TEM ultrastructural analysis. Therefore, we have changed “morphological analysis” to “ultrastructural analysis.” Line 231 in the revised manuscript.

      Line 351-354: how did you calculate distances? How many dimensions were used? 

      We calculated the centroid coordinates for each cell type in each state on the 2-dimensional UMAP plot (Fig. 6A). Then, for each cell type, we determined the Euclidean distance between the centroid coordinates of each pair of states. We have revised the manuscript with this more detailed description. Lines 292-295 of the revised manuscript.
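The calculation described in this response can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the function name `centroid_shifts`, the input layout (parallel lists of 2-D UMAP coordinates, cell-type labels, and state labels), and the toy values are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import dist


def centroid_shifts(umap_xy, cell_types, states):
    """Euclidean distance between per-state centroids, for each cell type.

    umap_xy    : list of (x, y) UMAP coordinates, one per cell
    cell_types : list of cell-type labels, one per cell
    states     : list of condition labels, one per cell
    Returns {cell_type: {(state_a, state_b): distance}}.
    """
    # Group cell coordinates by (cell type, state).
    groups = defaultdict(list)
    for (x, y), ct, st in zip(umap_xy, cell_types, states):
        groups[(ct, st)].append((x, y))

    # Centroid = mean x and mean y of the cells in each group.
    centroids = defaultdict(dict)
    for (ct, st), pts in groups.items():
        n = len(pts)
        centroids[ct][st] = (sum(p[0] for p in pts) / n,
                             sum(p[1] for p in pts) / n)

    # Distance between the centroids of every pair of states.
    return {
        ct: {(a, b): dist(by_state[a], by_state[b])
             for a, b in combinations(sorted(by_state), 2)}
        for ct, by_state in centroids.items()
    }


# Toy data: four cells of one type, two per state.
shifts = centroid_shifts(
    [(0, 0), (2, 0), (3, 4), (5, 4)],
    ["bacteriocyte"] * 4,
    ["Fanmao", "Fanmao", "Starvation", "Starvation"],
)
print(shifts["bacteriocyte"][("Fanmao", "Starvation")])
```

Because UMAP distances are not strictly metric-preserving, distances computed this way are best read as a relative ranking of how far each cell type moves between states.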

      Line 462: identify -> identified 

      We apologize for our mistake and appreciate the reviewer’s kind assistance with proofreading. The typo has been corrected in the new version. Line 396 of the revised manuscript.

      Line 509: what does the size of the dot represent? 

      In this context, the color and intensity of each dot represent a specific gene’s expression level in the single-cell cluster. The dot size is universal and therefore does not convey a specific meaning.

      Fig 3A: What is the blue cluster highlighted? 

      We apologize for our mistake. The label for the teal box was missing. We have corrected this in the revised manuscript.

      Fig 3K: Wording in key is confusing. 

      We have modified our description of Figure 3K in the figure legends. Now it reads: “Schematic of water flow agitated by different ciliary cell types. The color of arrowheads corresponds to water flow potentially influenced by specific types of cilia, as indicated by their color code in Figure 3A.” Lines 462-464 in the revised manuscript.

      Fig 5B: which population of mussels was used to take these images? 

      Mussels from the “Fanmao” (methane-rich) site were used to take these images. We have revised our Materials and Methods to make this clear. Lines 602-603 of the revised manuscript.

      Fig 5E,5G,5H: panels not referenced in text 

      We apologize for our mistake and appreciate the reviewer’s thorough reading. This error has been corrected in the new version of the manuscript. Line 233 of the revised manuscript.

      Reviewer #2 (Recommendations For The Authors): 

      Minor comments: 

      Fig. 3A - the teal box in the legend lacks a label 

      We apologize for our mistake. The label for the teal box was missed. We have corrected our mistake in the

      Reviewer #3 (Recommendations For The Authors): 

      My enthusiasm for the manuscript remains high and I appreciate the authors care in responding to the various reviewer questions and concerns. 

      In regards to the cell proliferation results, I have modified my public review and look forward to your future work in this area. The data for both pHistone H3 and anti PCNA are compelling! 

      One typo I did catch occurs on line 520. I believe you meant to say "outer" not "otter." 

      We apologize for our mistake and appreciate the reviewer’s kind assistance with proofreading. The typo has been corrected in the new version.

    4. Reviewer #2 (Public Review):

      Wang, He et al. shed insight into the molecular mechanisms of deep-sea chemosymbiosis at the single-cell level. They do so by producing a comprehensive cell atlas of the gill of Gigantidas platifrons, a chemosymbiotic mussel that dominates the deep-sea ecosystem. They uncover novel cell types and find that the gene expression of bacteriocytes, the symbiont-hosting cells, supports two hypotheses of host-symbiont interactions: the "farming" pathway, where symbionts are directly digested, and the "milking" pathway, where nutrients released by the symbionts are used by the host. They perform an in situ transplantation experiment in the deep sea and reveal transitional changes in gene expression that support a model where starvation stress induces bacteriocytes to "farm" their symbionts, while recovery leads to the restoration of the "farming" and "milking" pathways.

      A major strength of this study includes the successful application of advanced single nucleus techniques to a non-model, deep sea organism that remains challenging to sample. I also applaud the authors for performing an in situ transplantation experiment in a deep sea environment. From gene expression profiles, the authors deftly provide a rich functional description of G. platifrons cell types that is well-contextualized within the unique biology of chemosymbiosis. These findings offer significant insight into the molecular mechanisms of deep-sea host-symbiont ecology, and will serve as a valuable resource for future studies into the striking biology of G. platifrons.

      The authors' conclusions are generally well-supported by their results. However, I recognize that the difficulty of obtaining deep-sea specimens may have impacted experimental design and no replicates were sampled.

      It is notable that the Fanmao cells were much more sparsely sampled. It appears that fewer cells were sequenced, resulting in the Starvation and Reconstitution conditions having 2-3x more cells after doublet filtering. These discrepancies also are reflected in the proportion of cells that survived QC, suggesting a distinction in quality or approach. However, the authors provide clear and sufficient evidence via bootstrapping that batch effects between the three samples are negligible. While batch effect does not appear to have affected gene expression profiles, the proportion of cell types may remain sensitive to sampling techniques, and thus interpretation of Fig. S12 must be approached with caution.

    5. Reviewer #3 (Public Review):

      Wang et al. explored the unique biology of the deep-sea mussel Gigantidas platifrons to understand fundamental principles of animal-symbiont relationships. They used single-nucleus RNA sequencing, together with validation and visualization of many of the important cellular and molecular players that allow these organisms to survive in the deep sea. They demonstrate a diversity of cell types that support the structure and function of the gill, including bacteriocytes, specialized epithelial cells that host sulfur-oxidizing or methane-oxidizing symbionts, as well as a suite of other cell types including supportive cells, ciliary cells, and smooth muscle cells. By transplanting mussels from a methane-rich habitat to methane-limited environments, the authors showed that starved mussels may consume endosymbionts, whereas mussels in methane-rich environments upregulated genes involved in glutamate synthesis. These data add to the growing body of literature showing that organisms control their endosymbionts in response to environmental change.

      The conclusions of the data are well supported. The authors adapted a technique that would have been technically impossible in their field environment by preserving the tissue and then performing nuclear isolation after the fact. The use of single-nucleus sequencing opens the possibility of new cellular and molecular biology that is not possible to study in the field. Additionally, the in-situ data (both WISH and FISH) are high-quality and easy to interpret. The use of cell-type-specific markers along with a symbiont-specific probe was effective. Finally, the SEM and TEM were used convincingly for specific purposes in the case of showing the cilia that may support water movement.

      The one particular area for future exploration surrounds the concept of a proliferative progenitor population within the gills. The authors recover molecular markers for these putative populations and additional future work will uncover if these are indeed proliferative cells that contribute to symbiont colonization.

      Overall the significance of this work is identifying the relationship between symbionts and bacteriocytes and how these host bacteriocytes modulate their gene expression in response to environmental change. It will be interesting to see how similar or different these data are across animal phyla. For instance, the work on symbiosis in cnidarians may converge on similar principles, or there may be independent ways in which organisms have been able to solve these problems.

    1. eLife assessment

      This manuscript provides important information on the calcification process, especially the properties and formation of freshly formed tests (the foraminiferan shells), in the miliolid foraminiferan species Pseudolachlanella eburnea. The evidence from the high-quality SEM images is convincing although the fluorescence images only provide indirect support for the calcification process.

    2. Reviewer #1 (Public Review):

      Summary:

      The manuscript by Dubicka and co-workers on calcification in miliolid foraminifera presents an interesting piece of work. The study uses confocal and electron microscopy to show that the traditional picture of calcification in porcelaneous foraminifera is incorrect.

      Strengths:

      The authors present high-quality images and an original approach to a relatively solid (so I thought) model of calcification.

      Weaknesses:

      There are several major shortcomings. Despite the interesting subject and the wonderful images, the conclusions of this manuscript are simply not supported at all by the results. The fluorescent images may not have any relation to the process of calcification and should therefore not be part of this manuscript. The SEM images, however, do point to an outdated idea of miliolid calcification. I think the manuscript would be much stronger with the focus on the SEM images and with the speculation of the physiological processes greatly reduced.

      Comments on revised version:

      I continue to disagree. As the authors acknowledge: 'may be a hint indicating ACC...', but it may also be something else. This is really something else than showing ACC is involved in foraminiferal calcification. I still think the reasoning is shaky and below, I will clarify why the fluorescence may well not be related to ACC and in fact, some or even most of the vesicles may not play the role that the authors suggest. Even if they do, the conclusions are not supported by the data presented here. Unfortunately, I found some of the other answers to my question not satisfactory either.

    3. Reviewer #2 (Public Review):

      Summary:

      Dubicka et al., in their paper entitled "Biocalcification in porcelaneous foraminifera", suggest that, in contrast to the traditionally claimed two different modes of test calcification by rotaliid and porcelaneous miliolid foraminifera, both groups produce calcareous tests via intravesicular mineral precursors (Mg-rich amorphous calcium carbonate). These precursors are proposed to be supplied by endocytosed seawater and deposited in situ as mesocrystals formed at the site of new wall formation within the organic matrix. The authors did not observe the calcification of the needles within the transported vesicles, which challenges the previous model of miliolid mineralization. Although the authors argue that these two groups of foraminifera utilize the same calcification mechanism, they also suggest that these calcification pathways evolved independently in the Paleozoic.

      Comments on the revised version

      In my reply to the author's rebuttal letter, I will focus on one key point. The main observation supporting the author's conclusion, as expressed in the abstract, is:

      "We found that both groups [i.e., rotaliids and miliolids, the latter documented in the reviewed paper] produced calcareous shells via the intravesicular formation of unstable mineral precursors (Mg-rich amorphous calcium carbonates) supplied by endocytosed seawater and deposited at the site of new wall formation within the organic matrix. Precipitation of high-Mg calcitic mesocrystals took place in situ and formed a dense, chaotic meshwork of needle-like crystallites."

      In my review, I pointed out that there is no support for the existence of an intracellular, vesicular intermediate amorphous phase.

      The authors replied:

      "We used laser line 405 nm and multiphoton excitation to detect ACCs. These wavelengths (partly) permeate the shell to excite ACCs autofluorescence. The autofluorescence of the shells is present as well but not clearly visible in movie S4 as the fluorescence of ACCs is stronger. This may be related to the plane/section of the cell which is shown. The laser permeates the shell above the ACCs (short distance) but to excite the shell CaCO3 around foraminifera in the same three-dimensional section where ACCs are shown, the light must pass a thick CaCO3 area due to the three-dimensional structure of the foraminiferan shell. Therefore, the laser light intensity is reduced. In a revised version, a movie/image with reduced threshold is shown."

      This reply does not address the reviewer's concerns. Detection of ACC with 405 nm excitation is not sufficient; many organic components can fluoresce under violet light excitation. For example, Delvene et al. (2002) (https://doi.org/10.18261/let.55.4.7) showed that "the Pleistocene and Jurassic microborings emit in the blue-yellow spectral region (420-600 nm) with a laser excitation of 405 nm, which coincides with the emission due to NADPH [nicotinamide adenine dinucleotide], FAD [flavin adenine dinucleotide], and riboflavin pigments characteristic of some cyanobacteria." Traditionally, in geological or biogenic calcium carbonate samples, Raman spectroscopic characterization of ACC and its magnesium content can be used (e.g., Wang, D., Hamm, L. M., Bodnar, R. J. & Dove, P. M. Raman spectroscopic characterization of the magnesium content in amorphous calcium carbonates. J. Raman Spectrosc. 43, 543-548 (2012); Perrin, J. et al. Raman characterization of synthetic magnesian calcites. Am. Mineral. 101, 2525-2538 (2016)). However, in biological, living-cell systems, Mehta et al. (2022) (doi: 10.1016/j.saa.2022.121262) successfully used FTIR spectroscopy to identify ACC by two characteristic FTIR vibrations at ca. 860 cm-1 and ca. 306 cm-1. Other methods such as STXM analyses at the C K-edge (Monteil et al. 2021, doi: 10.1038/s41396-020-00747-3) are also available. Because the core of the authors' interpretation (i.e., detection of ACC in vesicles) is not supported by hard evidence, the claim that the study represents a "paradigm shift" is far-fetched and the whole model is based on speculations. If the authors are able to unequivocally confirm the presence of ACC within the vesicles and its subsequent transformation into calcitic needles, the other problems noted in the paper will be relatively trivial.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The manuscript by Dubicka and co-workers on calcification in miliolid foraminifera presents an interesting piece of work. The study uses confocal and electron microscopy to show that the traditional picture of calcification in porcelaneous foraminifera is incorrect.

      Strengths:

      The authors present high-quality images and an original approach to a relatively solid (so I thought) model of calcification.

      Weaknesses:

      There are several major shortcomings. Despite the interesting subject and the wonderful images, the conclusions of this manuscript are simply not supported at all by the results. The fluorescent images may not have any relation to the process of calcification and should therefore not be part of this manuscript. The SEM images, however, do point to an outdated idea of miliolid calcification. I think the manuscript would be much stronger with the focus on the SEM images and with the speculation of the physiological processes greatly reduced.

      We agree that the fluorescence studies presented in the paper are not, by themselves, unequivocal proof of the calcification model utilised by the studied Miliolida species. However, the fluorescence data combined with the SEM studies, especially the overlap of the elements that show autofluorescence upon excitation at 405 nm (emission 420–480 nm) with the acidic vesicles marked by pH-sensitive LysoGlow84, may be a hint indicating ACC-bearing vesicles.

      We will tone down the physiological interpretation based on fluorescence studies in the revised version of the manuscript.

      Nevertheless, we think that our fluorescent live-imaging experiments provide important observations on Miliolida, which are scarce in the existing literature, and are therefore worth presenting, as they might be very helpful for a better understanding of the full calcification model in the future.

      Reviewer #2 (Public Review):

      Summary:

      Dubicka et al., in their paper entitled "Biocalcification in porcelaneous foraminifera", suggest that, in contrast to the traditionally claimed two different modes of test calcification by rotaliid and porcelaneous miliolid foraminifera, both groups produce calcareous tests via intravesicular mineral precursors (Mg-rich amorphous calcium carbonate). These precursors are proposed to be supplied by endocytosed seawater and deposited in situ as mesocrystals formed at the site of new wall formation within the organic matrix. The authors did not observe the calcification of the needles within the transported vesicles, which challenges the previous model of miliolid mineralization. Although the authors argue that these two groups of foraminifera utilize the same calcification mechanism, they also suggest that these calcification pathways evolved independently in the Paleozoic.

      We do not argue that Miliolida and Rotaliida utilize exactly the same calcification mechanism, but rather that both groups use less divergent crystallization pathways, in which mesocrystalline chamber walls are created by accumulating and assembling particles of a pre-formed liquid amorphous mineral phase.

      Strengths:

      The authors document various unknown aspects of calcification of Pseudolachlanella eburnea and elucidate some poorly explained phenomena (e.g., translucent properties of the freshly formed test); however, there are several problematic observations/interpretations which in my opinion should be carefully addressed.

      Weaknesses:

      (1) The authors (line 122) suggest that "characteristic autofluorescence indicates the carbonate content of the vesicles (Fig. S2), which are considered to be Mg-ACCs (amorphous MgCaCO3) (Fig. 2, Movies S4 and S5)". Figure S2, which the authors refer to, shows only broken sections of organic sheath at different stages of mineralization. Movie S4 shows that only in a few regions some vesicles exhibit red autofluorescence interpreted as Mg-ACC (S5 is missing but probably the authors were referring to S3). In their previous paper (Dubicka et al 2023: Heliyon), the authors used exactly the same methodology to suggest that these are intracellularly formed Mg-rich amorphous calcium carbonate particles that transform into a stable mineral phase in the rotaliid Amphistegina lessonii. However, in Figure 1D (Dubicka et al 2023) the apparently carbonate-loaded vesicles show the same red autofluorescence as the test, whereas in their current paper, no evidence of autofluorescence of Mg-ACC grains accumulated within the "gel-like" organic matrix is given. The S3 and S4 movies show circulation of various fluorescing components, but no initial phase of test formation is observable (numerous mineral grains embedded within the organic matrix - Figures 3A and B - should be clearly observed also as autofluorescence of the whole layer). Thus the crucial argument supporting the calcification model (Figure 5) is missing.

      It is correct that we did not observe the initial phase of test formation in vivo. Therefore, it is not our crucial argument supporting the novel components of the new calcification model. We suspect that vesicles preparing and transporting Mg-ACC are produced well before their docking and deposition into the new wall, because such seawater vesicles were observed between the chamber formation stages (Goleń and Tyszka, 2024, personal communication based on independent experiments on a closely related miliolid taxon). This means that our in vivo experiments most likely represent a long, dynamic stage of vesicle formation via seawater endocytosis and vesicle modification (incl. Mg-ACC formation) before the stage of exocytosis during new chamber formation. Our crucial arguments supporting the calcification model come from the SEM imaging of the specimens fixed during chamber formation, as well as from the transparency of the new chamber wall during its progressive calcification.

      There is no support for the following interpretation (lines 199-203) "The existence of intracellular, vesicular intermediate amorphous phase (Mg-ACC pools), which supply successive doses of carbonate material to shell production, was supported by autofluorescence (excitation at 405 nm; Fig. 2; Movies S3 and S4; see Dubicka et al., 2023) and a high content of Ca and Mg quantified from the area of cytoplasm by SEM-EDS analysis (Fig. S6)."

      We used the 405 nm laser line and multiphoton excitation to detect ACCs. These wavelengths (partly) permeate the shell to excite ACC autofluorescence. The autofluorescence of the shell is present as well but is not clearly visible in Movie S4 because the fluorescence of the ACCs is stronger. This may be related to the plane/section of the cell that is shown. The laser permeates the shell above the ACCs (short distance), but to excite the shell CaCO3 around the foraminifera in the same three-dimensional section where ACCs are shown, the light must pass through a thick CaCO3 area due to the three-dimensional structure of the foraminiferal shell. Therefore, the laser light intensity is reduced. In the revised version, a movie/image with a reduced threshold is shown.

      Author response image 1.

      Autofluorescence image of studied Miliolida species (exc. 405 nm) showing algal chlorophyll (blue) and CaCO3 (red), both ACC and calcite shell.

      It would be very convenient if it was possible to visualize ACC by illumination with a blacklight, but there are very many organic molecules that have an autofluorescence excited by ~405 nm. One of the examples is NADH (Lee et al., 2015. Kor J Physiol Pharmac 19(4): 373-382), an omnipresent molecule in any cell (couldn't copy the appropriate picture here, but the reference has a figure with the em/exc spectra).

      The paper of Lee et al. 2015 shows that the excitation spectrum of NADH ends close to 400 nm. This means that NADH is not, or only very weakly, excitable at 405 nm, which we used as the excitation laser line.

      (2) The authors suggest that "no organic matter was detected between the needles of the porcelain structures (Figures 3E; 3E; S4C, and S5A)". Such a suggestion, which is highly unusual considering that biogenic minerals almost by definition contain various organic components, was made based only on FE-SEM observation. The authors should either provide clearcut evidence of the lack of organic matter (unlikely) or may suggest that intense calcium carbonate precipitation within organic matrix gel ultimately results in a decrease of the amount of the organic phase (but not its complete elimination), alike the pure calcium carbonate crystals are separated from the remaining liquid with impurities ("mother liquor"). On the other hand, if (249-250) "organic matrix involved in the biomineralization of foraminiferal shells may contain collagen-like networks", such "laminar" organization of the organic matrix may partly explain the arrangement of carbonate fibers parallel to the surface as observed in Fig. 3E1.

      We agree with the reviewer that biogenic minerals should by definition contain some organic components. We only wrote that "no organic matter was detected between the needles of the porcelain structures", which means that we did not detect any organic structures in our FE-SEM observations alone. We will rephrase this part of the text to avoid further confusion.

      (3) The author's observations indeed do not show the formation of individual skeletal crystallites within intracellular vesicles, however, do not explain either what is the structure of individual skeletal crystallites and how they are formed. Especially, what are the structures observed in polarized light (and interpreted as calcite crystallites) by De Nooijer et al. 2009? The author's explanation of the process (lines 213-216) is not particularly convincing "we suspect that the OM was removed from the test wall and recycled by the cell itself".

      Thank you for this comment. We will do our best to supplement our explanations. We are aware of the structures observed in polarized light by De Nooijer et al. (2009). However, Goleń et al. (2022, Protist; + 2 other citations) showed that organic polymers may also exhibit light polarization. Additional experimental studies are needed to separate these types of polarization. We will try to investigate this issue in our future research.

      (4) The following passage (lines 296-304) which deals with the concept of mesocrystals is not supported by the authors' methodology or observations. The authors state that miliolid needles "assembled with calcite nanoparticles, are unique examples of biogenic mesocrystals (see Cölfen and Antonietti, 2005), forming distinct geometric shapes limited by planar crystalline faces" (later in the same passage the authors say that "mesocrystals are common biogenic components in the skeletons of marine organisms" (are they thus unique or are they common)? It is my suggestion to completely eliminate this concept here until various crystallographic details of the miliolid test formation are well documented.

      Our intention was to express that mesocrystals are common biogenic components in the skeletons of marine organisms, whereas miliolid needles forming distinct geometric shapes limited by planar crystalline faces are unique.

      Reviewer #1 (Recommendations For The Authors):

      Below, I have summarized my main criticisms.

      (1) The movies S1-S4 do not indicate what is described. The videos show indeed seawater (S1), cell membranes (S2), and autofluorescence and acidic vesicles (S3 and S4). The presence of all these intracellular structures is not surprising: any eukaryotic cell will have those. The authors, however, claim that they participate in the process of calcification, which is simply not shown. One of the main arguments seems the presence of 'carbonate pools', in the caption these are even claimed to be 'Mg-ACC pools', but this is by no means revealed by an excitation of 405nm/ emission between 420 and 490 nm. It would be very convenient if it was possible to visualize ACC by illumination with a blacklight, but there are very many organic molecules that have an autofluorescence excited by ~405 nm. One of the examples is NADH (Lee et al., 2015. Kor J Physiol Pharmac 19(4): 373-382), an omnipresent molecule in any cell (couldn't copy the appropriate picture here, but the reference has a figure with the em/exc spectra).

      The paper of Lee et al. 2015 shows that the excitation spectrum of NADH ends close to 400 nm. This means that NADH is not, or only very weakly, excitable at 405 nm, which we used as the excitation laser line.

      The fluorescence by this excitation/ emission couple unlikely indicates the vesicles in which these foraminifera calcify. Therefore, most of the interpretation of the authors on what happens with the calcitic needles is not based on results but remains pure speculation.

      Autofluorescence upon excitation at 405 nm (emission 420–480 nm) is typical of CaCO3, both biocalcite and amorphous calcium carbonate, as was proven by laboratory synthesis of amorphous calcium carbonate (Dubicka et al., in preparation).

      (2) The results mention 'granules', which are the supposed Mg-ACC-containing vesicles, but the movies simply don't show any granules. Only fluorescence. Again, the results show a lot of vesicles with autofluorescence, but these are not necessarily related to calcification. Proof could be supplied by showing that the same fluorescent vesicles are 'used up' when the specimens under observation are making a new chamber, but until that is done, the fate of all these vesicles remains uncertain and once more, may not be involved in calcification at all.

      We suspect that vesicles preparing and transporting Mg-ACC are produced well before their docking and deposition into the new wall, because such seawater vesicles were observed between the chamber formation stages (Goleń and Tyszka, 2024, personal communication based on independent experiments on a closely related miliolid taxon). This means that our in vivo experiments most likely represent a long, dynamic stage of vesicle formation via seawater endocytosis and their modification (incl. Mg-ACC formation) before the stage of exocytosis during the new chamber formation. Our crucial arguments supporting the calcification model come from the SEM imaging of the specimens fixed during chamber formation, as well as from the transparency of the new chamber wall during its progressive calcification.

      (3) The Methods are unclear. How long were the foraminifers kept before being placed under the microscope? Were they fed with anything? This is important since the chlorophyll should not be from any food source. I didn't know that this foraminiferal species has photosynthetic symbionts: genera like Quinqueloculina don't. Is there any reference for this? Normally, I wouldn't care that much, but the authors find the presence of (facultative) symbionts important (lines 305-336). I am a bit suspicious about this since the only evidence for the presence of photosynthetic symbionts is because of the autofluorescence. As the authors said, commonly these miliolid species are regarded as symbiont-barren, so additional proof for these symbionts is necessary.

      We agree that additional proof is needed for the presence of photosynthetic symbionts. We rephrased the manuscript accordingly.

      (4) It is also unclear (Methods) at what stage the miliolids were photographed (Figure 3). How did chamber formation proceed, what was the timing of the photographs, etc. These pictures are to me the most interesting finding of this study, but need to be described much better.

      All living foraminifera were fixed at the overall stage of chamber formation. However, every individual presents a complete set of successive steps (substages) of chamber wall calcification fixed at once. Fig. 3A and B present nearly the most proximal (youngest) part of the new chamber with a thick wall of calcite nanograins within a gel-like organic matrix. Fig. 3C and D present a slightly more distal (intermediate) part of the calcified chamber. Fig. 3E shows the most distal part of the new chamber. This part is anchored to the older, underlying solid calcified chamber (not shown in this figure). All these steps are synchronous but represent gradual, successive stages of calcification. The main text and Figs 4 and 5 explain this phenomenon in detail.

      There are many small issues with the text too. These include:

      Line 28/29: in many other groups, calcification is thought to be polyphyletic (e.g. sponges: Chombard et al., 1997. Biol Bull 193: 359-367).

      Corrected

      Line 29/30: there may be even more 'types of shells'. The first author has shown in earlier papers that nodosarids have a unique shell architecture. Spirillinids also seem to have their own way of calcification. It is unclear what is meant here by 'two contrasting models'.

      So far, only two models of foraminiferal calcification are known. Lagenida biocalcification has not been studied.

      Line 33: 'Both groups'? This paper only shows calcification in miliolids.

      However, we refer to a previous study.

      Line 42: Perhaps, but there is no data on the pseudopodial network in this manuscript.

      We refer to the studies of Angell (1980).

      Line 43: Likely, but that is not what this manuscript is showing.

      Line 42-44: The authors should make a choice and be clear. The point of this paper is that miliolids and rotalids calcify in ways that are actually not as different as they seemed previously. Still, they are said to have different 'chamber formation modes'. If they are calcifying in a similar way (which I think is not necessarily supported by the results), isn't calcification in these groups like variations on the same theme? How does this relate to the independent origins of calcification within these two groups?

      Our intention is to show that Miliolida and Rotaliida utilize less divergent calcification pathways, following the recently discovered biomineralization principles.

      Line 49-51: is this a well-established distinction? If so, please add a reference. If not: what is fundamentally different between B and C? Does only the size of the intracellular vesicle matter?

      Rephrased

      Line 60: please include a reference for the intracellular calcification by coccolithophores.

      Added

      Line 67: this is wrong. It is the alignment of the needles at the surface that makes them all reflect light in the same way and gives the shells a porcelaneous appearance. A close-up of the miliolid's shell surface shows this arrangement. Underneath this layer, the orientation of the needles is more random.

      We referred to Johan Hohenegger's papers.

      Line 114: how else?

      Line 114-116: I don't see the relevance here. If seawater is taken up, the vesicle containing this seawater has to have a membrane around it. By definition. The text here ('These vesicles') suggests that Calcein and FM1-43 were combined (which they easily could have), but the methods describe that they are used successively.

      Yes, we used two dyes separately.

      Lines 122-130: I think the interpretation of this autofluorescence signal is wrong. Even if it was true, these lines belong to the Discussion.

      This paragraph has been moved to the Discussion.

      Line 138: What are 'mobile clusters'? I don't see a relation between the location of the symbionts and the other vesicles (Figure 2).

      Line 147-148: How can an SEM image show the absence of organic matter?

      We meant the absence of the gel-like OM visible in the previous stages of chamber formation.

      Line 148: Should be 'Figs. 3E; 3E1; S4C'.

      Corrected

      Lines 143-150: this can be merged with the following paragraph.

      Done

      Lines 151-169: why is there no indication of the time? Figures 3 and 4 link the pictures in time to show the development of the growing chamber wall. However, neither here nor in the methods, is there any recording of the time after the beginning of chamber formation. Now, the images are linked (Figure 4) as if they were taken at regular intervals, but this is not documented.

      Lines 170-184: this should go to the Discussion.

      Done

      Line 193-195: this is likely, but not visible in Figure 1.

      It was visible by optical microscopy and described by Angell, 1980

      Line 199-201: I don't understand this: the fluorescent vesicles were not observed during chamber formation so any link between the SEM and CLSM scans remains pure speculation.

      Line 203-204: needed for what?

      For better documentation of Miliolid ACC-bearing granules

      Line 220: is this shown in any of the images? 

      Angell, 1980

      Line 230: It sounds nice, but I don't think a 'paradigm shift' is appropriate here. However interesting and important foraminiferal biomineralization is, the authors show that the crystals of miliolids are likely formed differently than previously thought. If this is a 'paradigm shift', then most scientific findings are.

      In our opinion this is definitely a shift of paradigm

      Line 231: I don't think anyone suggested miliolids and coccolithophores share 'the same' pathway. They are shown (cocco's) and thought (miliolids) to secrete their calcite intracellularly.

      Changed to similar, intracellular

      Line 258: References should only be to peer-reviewed studies.

      Line 430: Burgers'

      Corrected

      Reviewer #2 (Recommendations For The Authors):

      Please separate clearly the results (observations) from the discussion (interpretations): various interpretational/commentary phrases should be removed from the Results section to Discussion e.g., lines 124-130, 131-135.

      Interpretations have been separated from results, as suggested by the Reviewer.

      [line 49] " living cells have evolved three major skeleton crystallization pathways". I would rather say "organisms" not "cells" as the coordination of the calcification process in multicellular organisms clearly involves processes that are beyond the individual cell activity.

      Corrected

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      Original comment: There is no explanation for how this work could be a breakthrough in simulating gregarious feeding as is stated in the manuscript.

      Reviewer response: I think I understand where the authors are trying to take this next step. If the authors were to follow up on this study with the proposed implementation of inhalant/exhalent velocity profiles (or more preferably velocity/pressure fields), then that study would be a breakthrough in simulating such gregarious feeding. Based on what has been done within the present study, I think the term "breakthrough" is instead overly emphatic. An additional note on this. The authors are correct that incorporating additional models could be used to simulate a population (as has been successfully done for several Ediacaran taxa despite computational limitations), but it's not the only way. The authors might explore using periodic boundary conditions on the external faces of the flow domain. This could require only a single Olivooid model to assess gregarious impacts - see the abundant literature on modeling flow through solar array fields.

      We thank Reviewer 1 for the suggestion. Modeling gregarious feeding via periodic boundary conditions is surely a practical approach when computational resources are limited. Modeling flow through solar array fields can also be an inspiring case. However, to simulate gregarious feeding behavior realistically on an uneven seabed and with an irregular spatial distribution of organisms, periodic boundary conditions alone may not be sufficient (see Author response image 1 for a simple example). We will continue exploring ways of realizing simulations of large-scale gregarious feeding.

      Author response image 1.

      An example of modeling gregarious feeding behavior on an uneven seabed.
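
      For readers unfamiliar with the term, the periodic boundary condition discussed above can be illustrated with a toy one-dimensional example. This is only a hedged sketch: it is not the authors' solver setup, and the function names, grid size, and parameters (`periodic_laplacian`, `diffuse`, `nu`, `dt`) are hypothetical choices made for illustration. It shows that a disturbance leaving one face of a single unit cell re-enters through the opposite face, which is how one modeled individual can stand in for a regular, infinite array.

```python
import numpy as np


def periodic_laplacian(u, dx):
    """Second-difference Laplacian on a 1D grid whose two ends are joined."""
    # np.roll wraps the array, so the last cell acts as the left
    # neighbour of the first cell (and vice versa).
    return (np.roll(u, 1) - 2.0 * u + np.roll(u, -1)) / dx**2


def diffuse(u, dx, dt, nu, steps):
    """Explicit diffusion steps on a periodic domain (toy model only)."""
    u = np.asarray(u, dtype=float).copy()
    for _ in range(steps):
        u = u + dt * nu * periodic_laplacian(u, dx)
    return u


# One unit cell with a single disturbance in its right-most grid cell.
u0 = np.zeros(16)
u0[-1] = 1.0

# After one step, part of the disturbance appears at index 0, i.e. it has
# crossed the right-hand face and re-entered through the left-hand face.
u1 = diffuse(u0, dx=1.0, dt=0.2, nu=1.0, steps=1)
```

      Because every unit cell is identical by construction, this approach cannot represent an uneven seabed or an irregular spacing of individuals, which is the limitation raised in the response above.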

      Original comment: The claim that olivooid-type feeding was most likely a prerequisite transitional form to jet-propelled swimming needs much more support or needs to be tailored to olivooids. This suggests that such behavior is absent (or must be convergent) before olivooids, which is at odds with the increasing quantities of pelagic life (whose modes of swimming are admittedly unconstrained) documented from Cambrian and Neoproterozoic deposits. Even among just medusozoans, ancestral state reconstruction suggests that they would have been swimming during the Neoproterozoic (Kayal et al., 2018; BMC Evolutionary Biology) with no knowledge of the mechanics due to absent preservation.

      Author response: Thanks for your suggestions. Yes, we agree with you that ancestral swimming medusae may have appeared before the early Cambrian, even in the Neoproterozoic. However, discussions on the affinities of Ediacaran cnidarians are severely limited because of the lack of information concerning their soft anatomy. So, it is hard to detect the mechanics due to absent preservation. Olivooids found in the basal Cambrian Kuanchuanpu Formation can be reasonably considered cnidarians based on their radial symmetry, external features, and especially their internal anatomy (Bengtson and Yue 1997; Dong et al. 2013; 2016; Han et al. 2013; 2016; Liu et al. 2014; Wang et al. 2017; 2020; 2022). The valid simulation experiment here was based on the soft tissue preserved in olivooids.

      Reviewer response: This response does not sufficiently address my earlier comment. While the authors are correct that individual Ediacaran affinities are an area of active research and that Olivooids can reasonably be considered cnidarians, this doesn't address the actual critique in my comment. Most (not all) Ediacaran soft-bodied fossils are considered to have been benthic, but pelagic cnidarian life is widely acknowledged to at least be present during later White Sea and Nama assemblages (and earlier depending on molecular clock interpretations). The authors have certainly provided support for the mechanics of this type of feeding being co-opted for eventual jet propulsion swimming in Olivooids. They have not provided sufficient justifications within the manuscript for this to be broadened beyond this group.

      Thank you for your candid commentary. We of course agree that swimming cnidarians may have emerged before the lowermost Cambrian Fortunian Stage. See lines 16-129: “Ediacaran fossil assemblages with complex ecosystems consist of exceptionally preserved soft-bodied eukaryotes of enigmatic morphology, which their affinities are mostly unresolved (Tarhan et al., 2018, Integrative and Comparative Biology, 58 (4), 688–702; Evans et al., 2022, PNAS, 11(46), e220747511).” Olivooids undoubtedly belong to the cnidarians, as characterized by their external and internal biological structures. Limited by the fossil record, we can only speculate on the transition of ancestral cnidarians from a benthic to a swimming lifestyle on the basis of well-preserved fossils such as olivooids. The transition may have required processes such as an increase in body size, thickening of the mesoglea, and degeneration of the periderm. These processes may have evolved independently or in concert. Moreover, the ecological behaviors of the ancestral cnidarians may have evolved independently at different stages from the Ediacaran to the Cambrian. We therefore cannot provide further justification beyond olivooids.

      Original comment: L446: two layers of hexahedral elements is a very low number for meshing boundary layer flow

      Reviewer response: As the authors point out in the main text, these organisms are small (millimeters in scale) and certainly lived within the boundary layer range of the ocean. While the boundary layer is not the main point, it still needs to be accurately resolved as it should certainly affect the flow further towards the far field at this scale. I'm not suggesting the authors need to perfectly resolve the boundary layer or focus on using turbulence models more tailored to boundary layer flows (such as k-w), but the flow field still needs sufficient realism for a boundary bounded flow. The authors really should consider quantitatively assessing the number of hexahedral elements within their mesh refinement study.

      To address this concern, we ran another four simulations based on mesh4 within our mesh refinement study to assess the number of hexahedral elements (five layers and eight layers of hexahedral elements, each with different thicknesses of the boundary layer mesh, controlled by the thickness adjustment factor). The results have been added to Table supplement 2. As shown there, the number of layers of hexahedral elements does not seem to significantly influence the result, but the thickness of the boundary layer mesh can influence the maximum flow velocity of the contraction phase. However, the results of all the simulations were generally consistent, as shown in Author response image 2. A description of these results was added to the section “Mesh sensitivity analysis”.

      Author response image 2.

      Results of mesh refinement study of different boundary layer mesh parameters.
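
      One simple way to express "generally consistent" quantitatively is the percent deviation of a monitored quantity from a reference mesh. The snippet below is a hedged sketch of that bookkeeping only: the function name `relative_deviation`, the run labels, and the numbers are hypothetical placeholders and are not taken from Table supplement 2 or from the authors' workflow.

```python
def relative_deviation(results, reference):
    """Percent deviation of each run's monitored value from the reference run."""
    ref = results[reference]
    if ref == 0:
        raise ValueError("reference value must be non-zero")
    return {name: 100.0 * (value - ref) / ref for name, value in results.items()}


# Placeholder values only (NOT the study's data): a monitored quantity such as
# the maximum flow velocity of the contraction phase for three hypothetical
# boundary-layer mesh settings.
max_velocity = {
    "2_layers": 0.100,
    "5_layers": 0.101,
    "8_layers": 0.102,
}

# Deviation of each setting from the finest hypothetical setting, in percent.
deviation = relative_deviation(max_velocity, reference="8_layers")
```

      Reporting such deviations alongside the raw values makes it easier to judge whether differences between boundary-layer settings are small relative to the effect being interpreted.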

    2. eLife assessment

      This important study advances our understanding of early Cambrian cnidarian paleoecology and suggests that the reconstructed ancestral feeding and respiration mechanisms predate jet-propelled swimming utilized by modern jellyfish. The work combines solid evidence of fluid and structural mechanics modeling, simulating for the first time the feeding and respiratory capacities in a microfossil (Quadrapyrgites), which in turn opens new possibilities using this approach for paleontological research. Assuming that the prior interpretations and assumptions concerning the modeled organism's soft part and skeletal anatomy are correct, the hypotheses that (1) the organism could alternately contract and expand the oral region and (2) such movement increased feeding efficiency seem plausible.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors utilize fluid-structure interaction analyses to simulate fluid flow within and around the Cambrian cnidarian Quadrapyrgites to reconstruct feeding/respiration dynamics. Based on vorticity and velocity flow patterns, the authors suggest that the polyp expansion and contraction ultimately develop vortices around the organism that are like what modern jellyfish employ for movement and feeding. Lastly, the authors suggest that this behavior is likely a prerequisite transitional form to swimming medusae.

      Strengths:

      While fluid-structure-interaction analyses are common in engineering, physics, and biomedical fields, they are underutilized in the biological and paleobiological sciences. Zhang et al. provide a strong approach to integrating active feeding dynamics into fluid flow simulations of ancient life. Based on their data, it is entirely likely the described vortices would have been produced by benthic cnidarians feeding/respiring under similar mechanisms. However, some of the broader conclusions require additional justification.

      Weaknesses:

      (1) The claim that olivooid-type feeding was most likely a prerequisite transitional form to jet-propelled swimming needs much more support or needs to be tailored to olivooids. This suggests that such behavior is absent (or must be convergent) before olivooids, which is at odds with the increasing quantities of pelagic life (whose modes of swimming are admittedly unconstrained) documented from Cambrian and Neoproterozoic deposits. Even among just medusozoans, ancestral state reconstruction suggests that they would have been swimming during the Neoproterozoic (Kayal et al., 2018; BMC Evolutionary Biology) with no knowledge of the mechanics due to absent preservation.

      (2) While the lack of ambient flow made these simulations computationally easier, these organisms likely did not live in stagnant waters even within the benthic boundary layer. The absence of ambient unidirectional laminar current or oscillating current (such as would be found naturally) biases the results.

      (3) There is no explanation for how this work could be a breakthrough in simulating gregarious feeding as is stated in the manuscript.

      Despite these weaknesses, the authors' dynamic fluid simulations convincingly reconstruct the feeding/respiration dynamics of the Cambrian Quadrapyrgites, though the large claims of transitionary stages for this behavior are not adequately justified. Regardless, the approach the authors use will be informative for future studies attempting to simulate similar feeding and respiration dynamics.

    4. Reviewer #2 (Public Review):

      Summary:

      The authors seek to elucidate the early evolution of cnidarians through computer modeling of fluid flow in the oral region of very small, putative medusozoan polyps. They propose that the evolutionary advent of the free-swimming medusoid life stage was preceded by a sessile benthic life stage equipped with circular muscles that originally functioned to facilitate feeding and that later became co-opted for locomotion through jet propulsion.

      Strengths:

      Assumptions of the modeling exercise laid out clearly; interpretations of the results of the model runs in terms of functional morphology plausible. An intriguing investigation that should stimulate further discussion and research.

      Weaknesses:

      Speculation on the origin of the medusoid life stage in cnidarians heavily dependent on prior assumptions concerning the soft part anatomy and material properties of the skeleton of the modeled fossil organism that may be open to alternative interpretations. Logically, of course, the hypothesis that cnidarian medusae originated from benthic polyps must be evaluated along with the alternative hypotheses that the medusa came first and that the ancestral cnidarian exhibited both life stages.

    1. Author response:

      The following is the authors’ response to the original reviews.

      The points raised let us critically rethink our approach, our results, and our conclusions. Furthermore, they gave us the chance to elaborate on some critical aspects that were mentioned. With the help of the reviewers, we made some clarifications in the point-by-point responses and implemented them in the manuscript. Furthermore, we modified the figures as suggested:

      - The colors in Figure 1C, D, G and H have been adapted as suggested

      - We added a Figure2-figure supplement 1, which strengthens our conclusion in Figure 2

      - As asked by reviewer #1 (weaknesses #3), we added the data about neutrophil numbers in the different organs (Figure 6-figure supplement 3C).

      Reviewer #1 (Public Review):

      Summary:

      - Extracellular ATP represents a danger-associated molecular pattern associated with tissue damage and can also act in an autocrine fashion in macrophages to promote proinflammatory responses, as observed in a previous paper by the authors in abdominal sepsis. The present study addresses an important aspect possibly conditioning the outcome of sepsis, namely the release of ATP by bacteria. The authors show that sepsis-associated bacteria do in fact release ATP in a growth-dependent and strain-specific manner. However, whether this bacteria-derived ATP plays a role in the pathogenesis of abdominal sepsis has not been determined. To address this question, a number of mutant strains of E. coli were used, first to correlate bacterial ATP release with growth and then with outer membrane integrity and bacterial death. By using E. coli transformants expressing the ATP-degrading enzyme apyrase in the periplasmic space, the paper nicely shows that abdominal sepsis caused by these transformants results in significantly improved survival. This effect was associated with a reduction of peritoneal macrophages and CX3CR1+ monocytes, and an increase in neutrophils. To separate the function of bacterial ATP from the systemic response to microorganisms, the authors exploited bacterial OMVs, either loaded or not with ATP, to investigate the systemic effects in the absence of living microorganisms. This approach showed that ATP-loaded OMVs induced degranulation of neutrophils after lysosomal uptake, suggesting that this mechanism could contribute to sepsis severity.

      Strengths:

      - A strong part of the study is the analysis of E. coli mutants to address different aspects of bacterial release of ATP that could be relevant during systemic dissemination of bacteria in the host.

      We want to thank the reviewer for recognizing this important aspect of our experimental approach.

      Weaknesses:

      - As pointed out in the limitations of the study whether ATP-loaded OMVs provide a mechanistic proof of the pathogenetic role of bacteria-derived ATP independently of live microorganisms in sepsis is interesting but not definitively convincing. It could be useful to see whether degranulation of neutrophils is differentially induced by apyrase-expressing vs control E. coli transformants.

      We thank the reviewer for raising several important points. In our study, we assessed local and systemic effects of released bacterial ATP. The consequences of local bacterial ATP release were assessed using an apyrase-expressing E. coli transformant. Locally, bacterial ATP resulted in a decrease in neutrophil numbers and we hypothesize that directly released bacterial ATP either leads to neutrophil death (e.g. via P2X7 receptor (Proietti et al., 2019)) or interferes with the recruitment of neutrophils (e.g. via P2Y receptors (Junger, 2011)).

      The systemic consequences were assessed using ATP-loaded and empty OMV. We have shown that degranulation is induced by OMV-derived bacterial ATP. ATP-containing OMV are engulfed by neutrophils, reach their endolysosomal compartment, and might activate purinergic receptors, which then leads to aberrant degranulation. This concept, which needs to be explored in future studies, is fundamentally different from classical purinergic signaling via bacterial ATP released directly into the extracellular space.

      It is possible that neutrophil degranulation is also modulated by directly released bacterial ATP. We agree that this should be assessed in future studies. Also, the role of OMV-derived bacterial ATP should be assessed locally as well as the importance of directly released vs. OMV-mediated bacterial ATP dissected locally. Based on our measurements (Figure 4-figure supplement 1A and Figure 5C), we estimate that the effect of OMV-derived bacterial ATP might be much smaller than the effects of directly released bacterial ATP. Thus, direct ATP release might predominate locally. However, we fully agree that this has to be investigated in a future study to reconcile the different aspects of bacterial ATP signaling. A paragraph will be added to the manuscript, in which we discuss this particular issue.

      - Also, the increase of neutrophils in bacterial ATP-depleted abdominal sepsis, which has better outcomes than "ATP-proficient" sepsis, seems difficult to correlate to the hypothesized tissue damage induced by ATP delivered via non-infectious OMVs.

      We fully acknowledge the mentioned discrepancy. What we propose is that bacterial ATP exhibits different functions that depend on the release mechanism (see above). Locally, in the peritoneal cavity, neutrophil numbers are decreased by directly released bacterial ATP. Remotely, ATP is delivered via OMV and affects neutrophil function. We agree that both effects may play a role, particularly in the peritoneal cavity. However, the impact of directly released bacterial ATP seems to be dominant (see above).

      We propose that neutrophils are decreased locally because of directly released bacterial ATP, which prevents efficient infection control and, therefore, impairs sepsis survival. In addition, these fewer neutrophils might even be dysregulated by the engulfment of bacterial ATP delivered via OMV, which leads to an upregulated and possibly aberrant degranulation process worsening local and remote tissue damage. We agree that in addition to neutrophil numbers, the function of local neutrophils should be assessed with and without the influence of OMV-delivered bacterial ATP. This could be done by RNA sequencing of primary neutrophils from the peritoneal cavity or neutrophil cell lines as well as degranulation assays.

      - Are neutrophil counts affected by ATP delivered via OMVs?

      This is difficult to show in the peritoneal cavity, where both directly released bacterial ATP and OMV-derived bacterial ATP are present. We did, however, assess such a putative difference in the systemic organs and the blood, where we did not find any differences in neutrophil numbers.

      Author response image 1.

      - A comparison of cytokine profiles in the abdominal fluids of E. coli and OMV treated animals could be helpful in defining the different responses induced by OMV-delivered vs bacterial-released ATP. The analyses performed on OMV treated versus E. coli infected mice are not closely related and difficult to combine when trying to draw a hypothesis for bacterial ATP in sepsis.

      We fully agree that there are several open questions that remain to be elucidated, in particular, to differentiate the local role of directly released versus OMV-delivered bacterial ATP. In this study, we laid the foundation for future in vivo research to examine the specific role of bacterial ATP in sepsis. Such future research avenues might be to investigate the local effects of OMV-delivered bacterial ATP, and how neutrophil migration, apoptosis and degranulation are altered. We agree that exploring the local secretory immune response and cytokine profiles is relevant to understanding the different mechanisms by which bacterial ATP alters sepsis. However, such experiments should ideally be performed in systems where the source and the delivery of ATP can be modulated locally.

      - Also it was not clear why lung neutrophils were used for the RNAseq data generation and analysis.

      Thank you for this remark. We have chosen primary lung neutrophils for four reasons:

      (1) Isolation of primary lung neutrophils allowed us to assess an in vivo response that would not have been possible with cell lines.

      (2) The lung and the respiratory system are among the clinically most important organs affected during sepsis, and their involvement is a significant cause of mortality.

      (3) We show in Figure 6C that specifically in the lung, OMV are engulfed by neutrophils, which shows the relevance of the lung also in our study context.

      (4) And finally, lung neutrophils were chosen to examine specifically distant and not local effects.

      Reviewer #2 (Public Review):

      Summary:

      - In their manuscript "Released Bacterial ATP Shapes Local and Systemic Inflammation during Abdominal Sepsis", Daniel Spari et al. explored the dual role of ATP in exacerbating sepsis, revealing that ATP from both host and bacteria significantly impacts immune responses and disease progression.

      Strengths:

      - The study meticulously examines the complex relationship between ATP release and bacterial growth, membrane integrity, and how bacterial ATP potentially dampens inflammatory responses, thereby impairing survival in sepsis models. Additionally, this compelling paper implies a concept that bacterial OMVs act as vehicles for the systemic distribution of ATP, influencing neutrophil activity and exacerbating sepsis severity.

      We thank the reviewer for mentioning these key points and supporting the relevance of our study.

      Weaknesses:

      (1) The researchers extracted and cultivated abdominal fluid on LB agar plates, then randomly picked 25 colonies for analysis. However, they did not conduct 16S rRNA gene amplicon sequencing on the fluid itself. It is worth noting that the bacterial species present may vary depending on the individual patients. It would be beneficial if the authors could specify whether they've verified the existence of unculturable species capable of secreting high levels of Extracellular ATP.

      Most septic complications are caused by a limited spectrum of bacteria, belonging mainly either to the Firmicutes or the Proteobacteria phyla, including E. coli, K. pneumoniae, S. aureus or E. faecalis (Diekema et al., 2019; Mureșan et al., 2018). We validated this well documented existing evidence by randomly assessing 25 colonies. For the planned experiments, it was crucial to work with culturable bacteria; otherwise, ATP measurements, the modulation of ATP generation or loading of OMV would not have been possible. Using such culturable bacteria allowed us to describe mechanisms of ATP release.

      We fully agree that hard-to-culture or unculturable bacteria might contribute significantly to septic complications. This, however, would need to be explored in future studies using extensive culturing methods (Cheng et al., 2022).

      (2) Do mice lacking commensal bacteria show a lack of extracellular ATP following cecal ligation puncture?

      ATP is typically secreted by many cells of the host in active and passive manners in the case of any injury, including cecal ligation and puncture (Burnstock, 2016; Dosch et al., 2018; Eltzschig et al., 2012; Idzko et al., 2014). We hypothesize that bacterial ATP is a potential priming agent at early stages of sepsis, and indeed, at such early time points, a comparison of peritoneal ATP levels between germ-free and colonized mice could support our hypothesis. Future studies addressing this question must, however, correct for the different immune responses between germ-free and colonized mice. This is of utmost importance, especially for the cecal ligation and puncture model, since the cecum of germ-free mice is extremely large, making such experiments hard to control.

      (3) The authors isolated various bacteria from abdominal fluid, encompassing both Gram-negative and Gram-positive types. Nevertheless, their emphasis appeared to be primarily on the Gram-negative E. coli. It would be beneficial to ascertain whether the mechanisms of Extracellular ATP release differ between Gram-positive and Gram-negative bacteria. This is particularly relevant given that the Gram-positive bacterium E. faecalis, also isolated from the abdominal fluid, is recognized for its propensity to release substantial amounts of Extracellular ATP.

      We fully agree with this comment. In this paper, we used E. coli as our model organism to determine the principles of sepsis-associated bacterial ATP release and therefore focused on Gram-negative bacteria. In addition to the direct, growth-dependent release, we found a relevant impact of OMV-delivered bacterial ATP. For this latter purpose, a Gram-negative strain, in which OMV generation has been well described (Schwechheimer & Kuehn, 2015), was chosen. Recently, Gram-positive bacteria have been shown to secrete ATP and OMV as well (Briaud & Carroll, 2020; Hironaka et al., 2013; Iwase et al., 2010). Given the fundamental differences in the structure of the cell wall of Gram-positive bacteria and the mechanisms of OMV generation and release, future studies are required to assess the relevance of directly released and OMV-delivered ATP in Gram-positive bacteria.

      (4) The authors observed changes in the levels of LPM, SPM, and neutrophils in vivo. However, it remains uncertain whether the proliferation or migration of these cells is modulated or inhibited by ATP receptors like P2Y receptors. This aspect requires further investigation to establish a convincing connection.

      We fully agree with this comment. The decrease in LPM and the consequent predominance of SPM have been well described after inflammatory stimuli in the context of the macrophage disappearance reaction (Ghosn et al., 2010). Also, it has been shown that purinergic signaling modulates infiltration of neutrophils and can lead to cell death as a consequence of P2Y and P2X receptor activation (Junger, 2011; Proietti et al., 2019). In our study, we propose that intracellular purinergic receptors contribute to neutrophil function during sepsis. Having introduced the general principles and fundamentals of bacterial ATP in this study, we fully agree that additional experiments need to address downstream purinergic receptor activation. That, however, would go beyond the scope of our study.

      (5) Additionally, is it possible that the observed in vivo changes could be triggered by bacterial components other than Extracellular ATP? In this research field, a comprehensive collection of inhibitors is available, so it is desirable to utilize them to demonstrate clearer results.

      This question is of utmost importance and defined the choice of our model and experimental approach. When we started the project, we used two different E. coli mutants that release low (ompC) and high (eaeH) amounts of ATP. However, the limitation of this approach is that these are different bacteria, which may also differ in the components they secrete or the surface proteins they express. We, therefore, decided against that approach. With the approach we finally used (same bacterium, just with and without ATP), we aimed to minimize the influence of non-ATP bacterial components.

      (6) Have the authors considered the role of host-derived Extracellular ATP in the context of inflammation?

      Yes, the role of host-derived extracellular ATP in inflammation and sepsis has been studied extensively, with contradictory results (Csóka et al., 2015; Ledderose et al., 2016). These conflicting data were the rationale for testing the relevance of bacterial ATP. We suggest that bacterial ATP is essential in the early phase of sepsis, when bacteria invade the sterile compartment and before an efficient host response, including the eukaryotic release of ATP, is established.

      (7) The authors mention that Extracellular ATP is rapidly hydrolyzed by ectonucleotidases in vivo. Are the changes of immune cells within the peritoneal cavity caused by Extracellular ATP released from bacterial death or by OMVs?

      This is a relevant question that was also asked by reviewer #1, and we answered it in detail above (weaknesses comments #1 and #2). From our ATP measurements (Figure 4-figure supplement 1A and Figure 5C), we conclude that locally, the role of directly released (extracellular) bacterial ATP predominates over that of OMV-derived bacterial ATP. Furthermore, directly released and OMV-derived bacterial ATP (contained within OMV, engulfed and transported to the endolysosomal compartment) act through different mechanisms, and extracellular ATP in particular has been described to lead to apoptosis via P2X7 signaling.

      (8) In the manuscript, the sample size (n) for the data consistently remains at 2. I would suggest expanding the sample size to enhance the robustness and rigor of the results.

      Two biological replicates (independent cultures) were used only for the bacterial cultures in Figure 1, Figure 2, and Figure 3. These achieved similar results with very small standard deviations, indicating the robustness of the measurements. In the in vitro experiments in Figure 5, we used a sample size of 6 (three biological replicates measured in technical duplicates), since we saw larger deviations in our measurements. For the in vivo experiments, we always used 5 or more animals in at least two independent experiments.

      Reviewer #2 (Recommendations For The Authors):

      (9) Line 37: 11 million sepsis-related deaths were reported "in" 2017.

      The passage has been corrected as suggested.

      (10) The similar colors used in Figure 1C and G are too chaotic, making them difficult to distinguish.

      We agree, the colors have been adapted.

      Author response image 2.

      (11) All "in vivo" and "in vitro" should be italicized.

      We italicized all of them.

      (12) The title of Figure 4 is confusing: "Impairs sepsis outcome in vivo?" Could you make it more specific?

      We agree, the title has been rephrased:

      “Bacterial ATP reduces neutrophil counts and reduces survival in a mouse model of abdominal sepsis.”

      (13) Line 314-316: The sentence "Potentially, despite the lack of a transporter, ATP may similarly to eukaryotic cells leak (Yegutkin et al., 2006) across the inner membrane into the periplasmic space that lacks the enzymes for ATP generation." sounds odd.

      This passage was reformulated in the manuscript.

      “Despite the lack of a transporter, ATP may leak across the inner membrane into the periplasmic space. Such leakage may be similar to baseline leakage in eukaryotic cells (Yegutkin et al., 2006).”

      (14) The numerical notation in the paper is odd: sometimes it uses a prime symbol as a superscript (such as line 504), and sometimes it does not (such as line 421). Should it be standardized to "3,200" and "150,000"?

      Thank you for this remark. The numbers have been standardized throughout the manuscript.

      (15) Line "0.4 mm EP cuvettes" should be "0.4 cm EP cuvettes"

      The specified passage has been corrected as suggested.

      References

      Briaud, P., & Carroll, R. K. (2020). Extracellular Vesicle Biogenesis and Functions in Gram-Positive Bacteria. Infection and Immunity, 88(12), e00433-20. https://doi.org/10.1128/iai.00433-20

      Burnstock, G. (2016). P2X ion channel receptors and inflammation. Purinergic Signalling, 12(1), 59–67. https://doi.org/10.1007/s11302-015-9493-0

      Cheng, A. G., Ho, P.-Y., Aranda-Díaz, A., Jain, S., Yu, F. B., Meng, X., Wang, M., Iakiviak, M., Nagashima, K., Zhao, A., Murugkar, P., Patil, A., Atabakhsh, K., Weakley, A., Yan, J., Brumbaugh, A. R., Higginbottom, S., Dimas, A., Shiver, A. L., … Fischbach, M. A. (2022). Design, construction, and in vivo augmentation of a complex gut microbiome. Cell, 185(19), 3617-3636.e19. https://doi.org/10.1016/j.cell.2022.08.003

      Csóka, B., Németh, Z. H., Törő, G., Idzko, M., Zech, A., Koscsó, B., Spolarics, Z., Antonioli, L., Cseri, K., Erdélyi, K., Pacher, P., & Haskó, G. (2015). Extracellular ATP protects against sepsis through macrophage P2X7 purinergic receptors by enhancing intracellular bacterial killing. The FASEB Journal, 29(9), 3626–3637. https://doi.org/10.1096/fj.15-272450

      Diekema, D. J., Hsueh, P.-R., Mendes, R. E., Pfaller, M. A., Rolston, K. V., Sader, H. S., & Jones, R. N. (2019). The Microbiology of Bloodstream Infection: 20-Year Trends from the SENTRY Antimicrobial Surveillance Program. Antimicrobial Agents and Chemotherapy, 63(7), e00355-19. https://doi.org/10.1128/AAC.00355-19

      Dosch, M., Gerber, J., Jebbawi, F., & Beldi, G. (2018). Mechanisms of ATP Release by Inflammatory Cells. International Journal of Molecular Sciences, 19(4), 1222. https://doi.org/10.3390/ijms19041222

      Eltzschig, H. K., Sitkovsky, M. V., & Robson, S. C. (2012). Purinergic Signaling during Inflammation. New England Journal of Medicine, 367(24), 2322–2333. https://doi.org/10.1056/NEJMra1205750

      Ghosn, E. E. B., Cassado, A. A., Govoni, G. R., Fukuhara, T., Yang, Y., Monack, D. M., Bortoluci, K. R., Almeida, S. R., Herzenberg, L. A., & Herzenberg, L. A. (2010). Two physically, functionally, and developmentally distinct peritoneal macrophage subsets. Proceedings of the National Academy of Sciences, 107(6), 2568–2573. https://doi.org/10.1073/pnas.0915000107

      Hironaka, I., Iwase, T., Sugimoto, S., Okuda, K., Tajima, A., Yanaga, K., & Mizunoe, Y. (2013). Glucose Triggers ATP Secretion from Bacteria in a Growth-Phase-Dependent Manner. Applied and Environmental Microbiology, 79(7), 2328–2335. https://doi.org/10.1128/AEM.03871-12

      Idzko, M., Ferrari, D., & Eltzschig, H. K. (2014). Nucleotide signalling during inflammation. Nature, 509(7500), 310–317. https://doi.org/10.1038/nature13085

      Iwase, T., Shinji, H., Tajima, A., Sato, F., Tamura, T., Iwamoto, T., Yoneda, M., & Mizunoe, Y. (2010). Isolation and Identification of ATP-Secreting Bacteria from Mice and Humans. Journal of Clinical Microbiology, 48(5), 1949–1951. https://doi.org/10.1128/JCM.01941-09

      Junger, W. G. (2011). Immune cell regulation by autocrine purinergic signalling. Nature Reviews Immunology, 11(3), 201–212. https://doi.org/10.1038/nri2938

      Ledderose, C., Bao, Y., Kondo, Y., Fakhari, M., Slubowski, C., Zhang, J., & Junger, W. G. (2016). Purinergic Signaling and the Immune Response in Sepsis: A Review. Clinical Therapeutics, 38(5), 1054–1065. https://doi.org/10.1016/j.clinthera.2016.04.002

      Mureșan, M. G., Balmoș, I. A., Badea, I., & Santini, A. (2018). Abdominal Sepsis: An Update. The Journal of Critical Care Medicine, 4(4), 120–125. https://doi.org/10.2478/jccm-2018-0023

      Proietti, M., Perruzza, L., Scribano, D., Pellegrini, G., D’Antuono, R., Strati, F., Raffaelli, M., Gonzalez, S. F., Thelen, M., Hardt, W.-D., Slack, E., Nicoletti, M., & Grassi, F. (2019). ATP released by intestinal bacteria limits the generation of protective IgA against enteropathogens. Nature Communications, 10(1), Article 1. https://doi.org/10.1038/s41467-018-08156-z

      Schwechheimer, C., & Kuehn, M. J. (2015). Outer-membrane vesicles from Gram-negative bacteria: Biogenesis and functions. Nature Reviews Microbiology, 13(10), 605–619. https://doi.org/10.1038/nrmicro3525

    2. eLife assessment

      This fundamental study advances our understanding of the role of bacterial-derived extracellular ATP in the pathogenesis of sepsis. The evidence supporting the conclusions is compelling, although not all concerns from a previous round of reviews were adequately addressed. The work will be of broad interest to researchers on microbiology and infectious diseases.

    3. Reviewer #2 (Public Review):

      Summary:

      In their manuscript, Daniel Spari et al. explored the dual role of ATP in exacerbating sepsis, revealing that ATP from both host and bacteria significantly impacts immune responses and disease progression.

      Strengths:

      The study meticulously examines the complex relationship between ATP release and bacterial growth, membrane integrity, and how bacterial ATP potentially dampens inflammatory responses, thereby impairing survival in sepsis models. Additionally, this compelling paper implies a concept that bacterial OMVs act as vehicles for the systemic distribution of ATP, influencing neutrophil activity and exacerbating sepsis severity.

      Weaknesses:

      (1) The researchers extracted and cultivated abdominal fluid on LB agar plates, then randomly picked 25 colonies for analysis. However, they didn't conduct 16S sequencing on the fluid itself. It's worth noting that the bacterial species present may vary depending on the individual patients. It would be beneficial if the authors could specify whether they've verified the existence of unculturable species capable of secreting high levels of Extracellular ATP.

      (2) Do mice lacking commensal bacteria show a lack of Extracellular ATP following cecal ligation puncture?

      (3) The authors isolated various bacteria from abdominal fluid, encompassing both Gram-negative and Gram-positive types. Nevertheless, their emphasis appeared to be primarily on the Gram-negative E. coli. It would be beneficial to ascertain whether the mechanisms of Extracellular ATP release differ between Gram-positive and Gram-negative bacteria. This is particularly relevant given that the Gram-positive bacterium E. faecalis, also isolated from the abdominal fluid, is recognized for its propensity to release substantial amounts of Extracellular ATP.

      (4) The authors observed changes in the levels of LPM, SPM, and neutrophils in vivo. However, it remains uncertain whether the proliferation or migration of these cells is modulated or inhibited by ATP receptors like P2Y receptors. This aspect requires further investigation to establish a convincing connection.

      (5) Additionally, is it possible that the observed in vivo changes could be triggered by bacterial components other than Extracellular ATP? In this research field, a comprehensive collection of inhibitors is available, so it is desirable to utilize them to demonstrate clearer results.

      (6) Have the authors considered the role of host-derived Extracellular ATP in the context of inflammation?

      (7) The authors mention that Extracellular ATP is rapidly hydrolyzed by ectonucleotidases in vivo. Are the changes of immune cells within the peritoneal cavity caused by Extracellular ATP released from bacterial death or by OMVs?

      (8) In the manuscript, the sample size (n) for the data consistently remains at 2. I would suggest expanding the sample size to enhance the robustness and rigor of the results.

    4. Reviewer #1 (Public Review):

      Summary:

      Extracellular ATP represents a danger-associated molecular pattern associated with tissue damage and can also act in an autocrine fashion in macrophages to promote proinflammatory responses, as observed in a previous paper by the authors in abdominal sepsis. The present study addresses an important aspect that may condition the outcome of sepsis, namely the release of ATP by bacteria. The authors show that sepsis-associated bacteria do in fact release ATP in a growth-dependent and strain-specific manner. However, whether this bacteria-derived ATP plays a role in the pathogenesis of abdominal sepsis has not been determined. To address this question, a number of mutant strains of E. coli have been used, first to correlate bacterial ATP release with growth and then with outer membrane integrity and bacterial death. By using E. coli transformants expressing the ATP-degrading enzyme apyrase in the periplasmic space, the paper nicely shows that abdominal sepsis by these transformants results in significantly improved survival. This effect was associated with a reduction of small peritoneal macrophages and CX3CR1+ monocytes, and an increase in neutrophils. To extrapolate the function of bacterial ATP from the systemic response to microorganisms, the authors exploited bacterial OMVs either loaded or not with ATP to investigate the systemic effects devoid of living microorganisms. This approach showed that ATP-loaded OMVs induced degranulation of neutrophils after lysosomal uptake, suggesting this mechanism could contribute to sepsis severity.

      Strengths:

      The most compelling part of the study is the analysis of E. coli mutants to address different aspects of bacterial release of ATP that could be pathogenically relevant during systemic dissemination of bacteria in the host.

      Weaknesses:

      As pointed out in the limitations of the study, whether ATP-loaded OMVs could provide mechanistic proof of the pathogenetic role of bacteria-derived ATP independently of live microorganisms in sepsis is interesting but not definitively convincing. It could be useful to see whether degranulation of neutrophils is also differentially induced by apyrase-expressing vs control E. coli transformants. Also, the increase of neutrophils in bacterial ATP-depleted abdominal sepsis, which has a better outcome than "ATP-proficient" sepsis, seems difficult to correlate with the hypothesized tissue damage induced by ATP delivered via non-infectious OMVs. Is the neutrophil count affected by ATP delivered via OMVs? A comparison of cytokine profiles in the abdominal fluids of E. coli and OMV treated animals could probably be helpful in defining the different responses induced by OMV-delivered vs bacterial-released ATP.

      The analyses performed on OMV treated versus E. coli infected mice are not immediately related and difficult to combine when trying to draw a pathogenetic hypothesis for bacterial ATP in sepsis.

      It's not clear why lung neutrophils were used for RNAseq.

    1. eLife assessment

      This useful modeling study explores how the biophysical properties of interneuron subtypes in the basolateral amygdala enable them to produce nested oscillations whose interactions facilitate functions such as spike-timing-dependent plasticity. The strength of evidence is currently viewed as incomplete because of insufficient grounding in prior experimental results and insufficient consideration of alternative explanations. This work will be of interest to investigators studying circuit mechanisms of fear conditioning as well as rhythms in the basolateral amygdala. (The authors explain why they disagree with this assessment in their author response.)

    1. Author response:

      Reviewer #1 (Public Review):

      This study excellently complements the previous one by unveiling the properties of NPRL2 in augmenting the effect of immune checkpoint inhibitors such as pembrolizumab in KRAS mutant lung cancer models.

      The following points should be clarified:

      (1) In KRAS mutant cell lines with LKB1 co-mutations or deletions, such as A549 cells, does treatment with NPRL2 not increase the efficacy of immunotherapy? Is this correct? Similarly, does the delivery of NPRL2 only potentiate the effect of immunotherapy in KRAS mutant cell lines without associated LKB1 mutations?

      NPRL2, when used as a single-agent immunotherapy, induces robust antitumor activity in immunotherapy-resistant (aPD1R) KRAS mutant models, such as A549 tumors (KRASmt/LKB1mt/aPD1R) and LLC2 (KRASmt/aPD1R), where immunotherapy is ineffective regardless of LKB1 co-mutation or deletion status. The antitumor effect of NPRL2 combined with aPD1 immunotherapy was not significantly different from NPRL2 alone in immunotherapy-resistant models but was significantly greater than immunotherapy alone. However, a synergistic antitumor effect was observed with NPRL2 and aPD1 immunotherapy in KRAS wild-type and immunotherapy-moderately-responsive models, such as H1299 (KRASwt/aPD1S).

      (2) Do the authors analyze by western blot if NPRL2 influences or restores STING and LKB1 in the A549 cell line that lacks LKB1 and STING?

      NPRL2 induces antitumor immunity in KRAS-mutant, aPD1-resistant models regardless of LKB1 co-mutations or deletions. However, it would be interesting to examine the effect of NPRL2 on the STING pathway in the LKB1-deleted A549 cell line.

      (3) Mechanistically, is there any explanation as to why NPRL2 delivery increases the efficacy of immunotherapy? Is there any effect on FUS or MYC?

      NPRL2 is a multifunctional tumor suppressor gene that is downregulated or absent in many cancers. NPRL2 has been shown to induce apoptosis, inhibit cell proliferation, and cause cell cycle arrest in various cancer types. Compelling evidence highlights the critical role of NPRL2 in causing DNA damage and double-strand breaks, which can trigger dendritic cell (DC) activation, antigen presentation, and priming of tumor-specific CD8+ T cells in the tumor microenvironment (TME). Our data indicate that NPRL2 treatment is associated with the induction of DC activation and maturation.

      At the cellular level, our data suggest that NPRL2-mediated antitumor immunity depends on the presence of CD4+ T cells, CD8+ T cells, and macrophages. Interestingly, the expression of FUS1, another tumor suppressor gene that is absent or severely downregulated in most non-small cell lung cancers (NSCLC), was unaffected by NPRL2 treatment. While MYC expression was not assessed in this study, it remains an area of interest for future research.

      (4) Is there any way to carry out a clinical study of systematically delivering NPRL2 in KRAS lung cancer patients?

      In this preclinical study, a clinical-grade DOTAP-NPRL2 formulation was prepared, utilizing NPRL2 encapsulated within nanovesicles for delivery. Based on the promising preclinical data, a phase I clinical trial will be initiated to evaluate the safety and efficacy of this formulation.

      Reviewer #2 (Public Review):

      Summary:

      The study "NPRL2 gene therapy induces effective antitumor immunity in KRAS/STK11 mutant anti-PD1 resistant metastatic non-small cell lung cancer (NSCLC) in a humanized mouse model" by Meraz et al. investigated the antitumor immune responses to NPRL2 gene therapy in aPD1R/KRAS/STK11mt NSCLC in a humanized mouse model. The authors found that NPRL2 gene therapy induces antitumor activity in KRAS/STK11mt/aPD1R tumors through DC-mediated antigen presentation and cytotoxic immune cell activation.

      Strengths:

      The novelty of the study.

      Weaknesses:

      (1) The inconsistent effect of NPRL2 combined with pembrolizumab. Figure 2I-K showed a similar tumor intensity in the NPRL2 group and the combination group. However, NPRL2 combined with pembrolizumab was synergistic in the KRASwt/aPD1S H1299 tumors in Figure 4.

      As a single agent, NPRL2 induces robust antitumor activity both in immunotherapy-resistant (aPD1R) KRAS mutant models, such as A549 (KRASmt/LKB1mt/aPD1R) and LLC2 (KRASmt/aPD1R) tumors, where immunotherapy was ineffective, and in an immunotherapy-sensitive model, H1299 (KRASwt/aPD1S), where immunotherapy had only a limited effect. A synergistic antitumor effect of the NPRL2 and pembrolizumab combination was found only in moderately immunotherapy-responsive models, not in immunotherapy-resistant models, in which PD-1/PD-L1 signaling is impaired, as shown in Figure 1A.

      (2) The authors stated that NPRL2 combined with pembrolizumab was not synergistic in the KRAS/STK11mt/aPD1R tumors but was synergistic in the KRASwt/aPD1S H1299 tumors. How was the synergistic effect defined in the study? More details need to be provided here.

      Our biostatistician used generalized linear regression models to study tumor growth over time. Two-way ANOVA with the interaction of treatment group and time point was performed to compare the change in tumor intensity from baseline between each pair of treatment groups at each time point. The nonparametric Mann-Whitney U test was applied to test for significant differences between treatment groups. Differences of P < 0.05, P < 0.01, and P < 0.001 were considered statistically significant. When the antitumor effect of the NPRL2 and pembrolizumab combination was statistically significant compared with both single-agent effects, synergy was confirmed using the method of Huang et al.

      Huang L, Wang J, Fang B, Meric-Bernstam F, Roth JA, Ha MJ. CombPDX: a unified statistical framework for evaluating drug synergism in patient-derived xenografts. Sci Rep 12(1):12984, 7/2022. e-Pub 7/2022. PMCID: PMC9338066.
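
      For readers unfamiliar with the nonparametric comparison mentioned in this response, the Mann-Whitney U statistic can be sketched as follows. This is only an illustration under stated assumptions: the function name and the tumor-intensity values are hypothetical and are not data from the study, and the sketch computes the U statistics only, not P values and not the CombPDX synergy analysis.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs (a, b) in which a > b; tied pairs contribute 0.5.
    U_a + U_b always equals len(sample_a) * len(sample_b).
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical tumor-intensity changes from baseline (arbitrary units).
control = [5.1, 6.3, 7.0, 6.8]
treated = [2.0, 3.1, 2.7, 4.4]
u_control, u_treated = mann_whitney_u(control, treated)
```

      A U value near zero for one group means that almost every value in that group lies below every value in the other group; the significance of that separation would still have to be taken from the U distribution or a statistics package.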

      (3) Nearly all of the work was performed pre-clinically. Validation in the clinical setting would provide stronger evidence for the conclusion.

      In this preclinical study, a clinical-grade DOTAP-NPRL2 formulation was prepared, utilizing NPRL2 encapsulated within nanovesicles for delivery. Based on the promising preclinical data, a phase I clinical trial will be initiated to evaluate the safety and efficacy of this formulation.

      (4) Figure 5 and Figure 6 have the same legend. These 2 figures could be merged as a new one.

      Agreed.

      (5) Figure 5B & C: n=9 in Figure 5B. However, the number of data points in Figure 5C was less than 9.

      At least n=7-9 mice/group are shown in Figure 5C. We will revise accordingly.

      Reviewer #3 (Public Review):

      Summary:

      NPRL2/TUSC4 is a tumor suppressor gene whose expression is reduced in many cancers, including NSCLC. This study presents a novel finding on NPRL2 gene therapy, which induces antitumor activity in aPD1-resistant tumors. Since KRAS/STK11 mutant tumors were reported to benefit less from ICIs, this study has potential clinical application value.

      Strengths:

      This work uncovers the advantage of NPRL2 gene therapy by using humanized models and multiple cell lines. Moreover, via immune cell depletion studies, the mechanistic analysis of NPRL2 gene therapy has focused on dendritic cells and CD8+ T cells.

      Weaknesses:

      A major concern is the lack of systematic and logical rigor. This work did not present a link between apoptosis and antigen presentation induced by NPRL2 restoration. There is no evidence that the PI3K/AKT/mTOR signaling pathway is related to antigen presentation, which is the major driver of the NPRL2-induced antitumor response. Therefore, the two parts may not support each other logically.

      Thank you for your review and comments. We agree that future studies are necessary to establish a direct link between apoptosis and antigen presentation induced by NPRL2 restoration, as well as between NPRL2-mediated downregulation of PI3K/AKT/mTOR signaling and antigen presentation. Nevertheless, NPRL2 restoration directly induced apoptosis in several cell lines, as shown in Figure 1C and Figure 8Q, and NPRL2 treatment or restoration significantly increased the number of antigen-presenting DCs in the tumor microenvironment. Similarly, NPRL2 restoration downregulated the PI3K/AKT/mTOR pathway, which was associated with increased antitumor immunity.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Gating of Kv10 channels is unique because it involves coupling between non-domain swapped voltage sensing domains, a domain-swapped cytoplasmic ring assembly formed by the N- and C-termini, and the pore domain. Recent structural data suggests that activation of the voltage sensing domain relieves a steric hindrance to pore opening, but the contribution of the cytoplasmic domain to gating is still not well understood. This aspect is of particular importance because proteins like calmodulin interact with the cytoplasmic domain to regulate channel activity. The effects of calmodulin (CaM) in WT and mutant channels with disrupted cytoplasmic gating ring assemblies are contradictory, resulting in inhibition or activation, respectively. The underlying mechanism for these discrepancies is not understood. In the present manuscript, Reham Abdelaziz and collaborators use electrophysiology, biochemistry and mathematical modeling to describe how mutations and deletions that disrupt inter-subunit interactions at the cytoplasmic gating ring assembly affect Kv10.1 channel gating and modulation by CaM. In the revised manuscript, additional information is provided to allow readers to identify within the Kv10.1 channel structure the location of E600R, one of the key channel mutants analyzed in this study. However, the mechanistic role of the cytoplasmic domains that this study focuses on, as well as the location of the ΔPASCap deletion and other perturbations investigated in the study remain difficult to visualize without additional graphical information. This can make it challenging for readers to connect the findings presented in the study with a structural mechanism of channel function.

      The authors focused mainly on two structural perturbations that disrupt interactions within the cytoplasmic domain, the E600R mutant and the ΔPASCap deletion. By expressing mutants in oocytes and recording currents using Two Electrode Voltage-Clamp (TEVC), it is found that both ΔPASCap and E600R mutants have biphasic conductance-voltage (G-V) relations and exhibit activation and deactivation kinetics with multiple voltage-dependent components. Importantly, the mutant-specific component in the G-V relations is observed at negative voltages where WT channels remain closed. The authors argue that the biphasic behavior in the G-V relations is unlikely to result from two different populations of channels in the oocytes, because they found that the relative amplitude between the two components in the G-V relations was highly reproducible across individual oocytes that otherwise tend to show high variability in expression levels. Instead, the G-V relations for all mutant channels could be well described by an equation that considers two open states O1 and O2, and a transition between them; O1 appeared to be unaffected by any of the structural manipulations tested (i.e. E600R, ΔPASCap, and other deletions) whereas the parameters for O2 and the transition between the two open states were different between constructs. The O1 state is not observed in WT channels and is hypothesized to be associated with voltage sensor activation. O2 represents the open state that is normally observed in WT channels and is speculated to be associated with conformational changes within the cytoplasmic gating ring that follow voltage sensor activation, which could explain why the mutations and deletions disrupting cytoplasmic interactions affect primarily O2.
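
      To make the idea of a two-component G-V relation concrete, the following is a minimal sketch of how a biphasic curve can arise from two open states and a voltage-dependent transition between them. It is an assumption-laden illustration, not the equation used in the manuscript: the functional form, the function names, and all parameter values (amplitudes, half-activation voltages, slopes) are hypothetical.

```python
import math


def boltzmann(v, v_half, slope):
    """Two-state Boltzmann: fraction activated at voltage v (all in mV)."""
    return 1.0 / (1.0 + math.exp(-(v - v_half) / slope))


def biphasic_gv(v, amp_1=0.5, amp_2=1.0,
                v_half_1=-80.0, slope_1=10.0,
                v_half_t=20.0, slope_t=15.0):
    """Relative conductance for a hypothetical two-open-state scheme.

    A first Boltzmann term describes entry into the conducting pool
    (first component, O1). A second Boltzmann term describes the
    voltage-dependent redistribution from O1 to O2. amp_1 and amp_2 are
    the relative amplitudes of the two components. All defaults are
    arbitrary illustration values.
    """
    activated = boltzmann(v, v_half_1, slope_1)
    in_o2 = boltzmann(v, v_half_t, slope_t)
    return activated * (amp_1 * (1.0 - in_o2) + amp_2 * in_o2)


# Sample the curve every 10 mV from -140 mV to +120 mV.
voltages = list(range(-140, 130, 10))
curve = [biphasic_gv(v) for v in voltages]
```

      With these arbitrary defaults the curve shows a first rise at negative voltages, a shoulder near the amplitude of the first component, and a second rise at depolarized voltages. Changing only the parameters of the second term leaves the first component untouched, which mirrors the kind of construct-to-construct comparison described above.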

      Severing the covalent link between the voltage sensor and pore reduced O1 occupancy in one of the deletion constructs. Although this observation is consistent with the hypothesis that voltage-sensor activation drives entry into O1, this result is not conclusive. Structural as well as functional data has established that the coupling of the voltage sensor and pore does not entirely rely on the S4-S5 covalent linker between the sensor and the pore, and thus the severed construct could still retain coupling through other mechanisms, which is consistent with the prominent voltage dependence that is observed. If both states O1 and O2 require voltage sensor activation, it is unclear why the severed construct would affect state O1 primarily, as suggested in the manuscript, as opposed to decreasing occupancy of both open states. In line with this argument, the presence of Mg2+ in the extracellular solution affected both O1 and O2. This finding suggests that entry into both O1 and O2 requires voltage-sensor activation because Mg2+ ions are known to stabilize the voltage sensor in its most deactivated conformations. 

      We agree with the reviewer that access to both states requires a conformational change in the voltage sensor. This was stated in our revised article: “In contrast, to enter O2, all subunits must complete both voltage sensor transitions and the collective gating ring transition.” We interpret the two gating steps as sequential; the effective rotation of the intracellular ring would happen only once the sensor is in its fully activated position.

      We also agree that the S4-S5 segment cannot be the only interaction mechanism, as we demonstrated in our earlier work (Lörinczi et al., 2015; Tomczak et al., 2017).  

      Activation towards and closure from O1 is slow, whereas channels close rapidly from O2. A rapid alternating pulse protocol was used to take advantage of the difference in activation and deactivation kinetics between the two open components in the mutants and thus drive an increasing number of channels towards state O1. Currents activated by the alternating protocol reached larger amplitudes than those elicited by a long depolarization to the same voltage. This finding is interpreted as an indication that O1 has a larger macroscopic conductance than O2. In the revised manuscript, the authors performed single-channel recordings to determine why O1 and O2 have different macroscopic conductance. The results show that at voltages where the state O1 predominates, channels exhibited longer open times and overall higher open probability, whereas at more depolarized voltages where occupancy of O2 increases, channels exhibited more flickery gating behavior and decreased open probability. These results are informative but not conclusive because additional details about how experiments were conducted, and group data analysis are missing. Importantly, results showing inhibition of single ΔPASCap channels by a Kv10-specific inhibitor are mentioned but not shown or quantitated - these data are essential to establish that the new O1 conductance indeed represents Kv10 channel activity.

      We observed the activity of a channel compatible with Kv10.1 ΔPAS-Cap (long openings at low-moderate potentials, very short flickery activity at strong depolarizations) in 12 patches from oocytes obtained from different frog operations over a period of two and a half months once the experimental conditions could be established. As stated in the text, we did not proceed to generate amplitude histograms because we could not resolve clear single-channel events at strong depolarizations. Astemizole abolished the activity and (remarkably) strongly reduced the noise in traces at strong depolarizations, which we interpret as partially caused by flicker openings.

      Author response image 1.

      We include two example recordings of Astemizole application (100 µM) on two different patches. Both recordings were performed at -60 mV (to decrease the likelihood that the channel visits O2) with 100 mM internal and 60 mM external K+. In both cases, the traces in Astemizole are presented in red.

      It is shown that conditioning pulses to very negative voltages result in mutant channel currents that are larger and activate more slowly than those elicited at the same voltage but starting from less negative conditioning pulses. In voltage-activated curves, O1 occupancy is shown to be favored by increasingly negative conditioning voltages. This is interpreted as indicating that O1 is primarily accessed from deeply closed states in which voltage sensors are in their most deactivated position. Consistently, a mutation that destabilizes these deactivated states is shown to largely suppress the first component in voltage-activation curves for both ΔPASCap and E600R channels.

      The authors then address the role of the hidden O1 state in channel regulation by calmodulin. Stimulating calcium entry into oocytes with ionomycin and thapsigargin, assumed to enhance CaM-dependent modulation, resulted in preferential potentiation of the first component in ΔPASCap and E600R channels. This potentiation was attenuated by including an additional mutation that disfavors deeply closed states. Together, these results are interpreted as an indication that calcium-CaM preferentially stabilizes deeply closed states from which O1 can be readily accessed in mutant channels, thus favoring current activation. In WT channels lacking a conducting O1 state, CaM stabilizes deeply closed states and is therefore inhibitory. It is found that the potentiation of ΔPASCap and E600R by CaM is more strongly attenuated by mutations in the channel that are assumed to disrupt interaction with the C-terminal lobe of CaM than mutations assumed to affect interaction with the N-terminal lobe. These results are intriguing but difficult to interpret in mechanistic terms. The strong effect that calcium-CaM had on the occupancy of the O1 state in the mutants raises the possibility that O1 can be only observed in channels that are constitutively associated with CaM. To address this, a biochemical pull-down assay was carried out to establish that only a small fraction of channels are associated with CaM under baseline conditions. These CaM experiments are potentially very interesting and could have wide physiological relevance. However, the approach utilized to activate CaM is indirect and could result in additional nonspecific effects on the oocytes that could affect the results.

      Finally, a mathematical model is proposed consisting of two layers involving two activation steps for the voltage sensor, and one conformational change in the cytoplasmic gating ring - completion of both sets of conformational changes is required to access state O2, but accessing state O1 only requires completion of the first voltage-sensor activation step in the four subunits. The model qualitatively reproduces most major findings on the mutants. Although the model used is highly symmetric and appears simple, the mathematical form used for the rate constants in the model adds a layer of complexity to the model that makes mechanistic interpretations difficult. In addition, many transitions that from a mechanistic standpoint should not depend on voltage were assigned a voltage dependence in the model. These limitations diminish the overall usefulness of the model which is prominently presented in the manuscript. The most important mechanistic assumptions in the model are not addressed experimentally, such as the proposition that entry into O1 depends on the opening of the transmembrane pore gate, whereas entry into O2 involves gating ring transitions - it is unclear why O2 would require further gating ring transitions to conduct ions given that the gating ring can already support permeation by O1 without any additional conformational changes.

      In essence, we agree with the reviewer; we already have addressed these points in our revised article:

      Regarding the voltage dependence, we write: “the κ/λ transition could reasonably be expected to be voltage independent because we related it to ring reconfiguration, a process that should occur as a consequence of a prior VSD transition. We have made some attempts to treat this transition as voltage independent but state-specific with upper-layer bias for states on the right and lower-layer bias for states on the left. This is in principle possible, as can already be gleaned from the similar voltage ranges of the left-right transition (α/β) and the κL/λ transition. However, this approach leads to a much larger number of free, less well constrained kinetic parameters and drastically complicated the parameter search.” As you can see, we also formulated a strategy to free the model of the potentially spurious voltage dependence and (in bold here) explained why we did not follow this route in this study.

      Regarding the need for gating ring transitions after O1, we wrote: “Thus, the underlying gating events can be separated into two steps: The first gating step involves only the voltage sensor without engaging the ring and leads to a pre-open state, which is non-conducting in the WT but conducting in our mutants. The second gating event operates at higher depolarizations, involves a change in the ring, and leads to an open state both in WT and in the mutants.”

      We interpret your statements to mean that you expect the conducting state to remain available once O1 is reached. However, the experimental evidence speaks against the pore remaining available regardless of further gating steps beyond O1. The description of model construction is informative here: “... we could exclude many possible [sites at which O1 connects to closed states] because the attachment site must be sufficiently far away from the conventional open state [O2]. Otherwise, the transition from "O1 preferred" to "O2 preferred" via a few closed intermediate states is very gradual and never produces the biphasic GV curves [that we observed].”

      In other words, voltage-dependent gating steps beyond the state that offers access to O1 appear to close the pore after it was open. That might occur because only then (for states in which at least one voltage sensor exceeded the intermediate position) the ring is fixed in a particular state until all sensors have completed activation. In the WT, closing the pore in deactivated states might rely on an interaction that is absent in the mutant because, at least in HERG: “the interaction between the PAS domain and the C-terminus is more stable in closed than in open KV11.1 (HERG) channels, and a single chain antibody binding to the interface between PAS domain and CNBHD can access its epitope in open but not in closed channels, strongly supporting a change in conformation of the ring during gating”.

      Reviewer #3 (Public Review):

      In the present manuscript, Abdelaziz and colleagues interrogate the gating mechanisms of Kv10.1, an important voltage-gated K+ channel in cell cycle and cancer physiology. At the molecular level, Kv10.1 is regulated by voltage and Ca-CaM. Structures solved using cryo-EM for Kv10.1 as well as other members of the KCNH family (Kv11 and Kv12) show channels that do not contain a structured S4-S5 linker, therefore imposing a non-domain-swapped architecture in the transmembrane region. However, the cytoplasmic N- and C-terminal domains interact in a domain-swapped manner, forming a gating ring. The N-terminal domain (PAS domain) of one subunit is located close to the intracellular side of the voltage sensor domain and interacts with the C-terminal domain (CNBHD domain) of the neighboring subunit. Mutations in the intracellular domains have a profound effect on channel gating. The complex network of interactions between the voltage sensor and the intracellular domains makes the PAS domain a particularly interesting domain of the channel to study as responsible for the coupling between the voltage sensor domains and the intracellular gating ring.

      The coupling between the voltage-sensor domain and the gating ring is not fully understood, and the authors aim to shed light on the details of this mechanism. To do so, they use well-established techniques such as site-directed mutagenesis, electrophysiology, biochemistry and mathematical modeling. In the present work, the authors propose a two-open-state model that arises from functional experiments after introducing a deletion in the PAS domain (ΔPASCap) or a point mutation (E600R) in the CNBHD domain. The authors measure a biphasic G-V curve with these mutations and assign the two phases to two different open states, one of them not visible in the WT and only unveiled after introducing the mutations.

      The hypothesis proposed by the authors could change the current understanding of Kv10.1 gating and is quite extraordinary; therefore, it requires extraordinary evidence to support it.

      STRENGTHS: The authors use adequate techniques such as electrophysiology and site-directed mutagenesis to address the gating changes introduced by the molecular manipulations. They also use appropriate mathematical modeling to build a Markov model and identify the mechanism behind the gating changes.

      WEAKNESSES: The results presented by the authors do not fully support their conclusions since they could have alternative explanations. The authors base their primary hypothesis on the biphasic behavior of a calculated G-V curve that does not match the tail behavior. The experimental conditions used in the present manuscript introduce uncertainties, weakening their conclusions and complicating the interpretation of the results. Therefore, their experimental conditions need to be revisited.

      We respectfully disagree. We think that your suggestions for alternative explanations are addressed in the current version of the article. We will rebut them once more below, but we feel the need to point out that our arguments are already laid out in the revised article.

      I have some concerns related to the following points:

      (1) Biphasic gating behavior

      The authors use the TEVC technique in oocytes extracted surgically from Xenopus laevis frogs. The method is well established and is adequate to address ion channel behavior. The experiments are performed in chloride-based solutions, which present a handicap when measuring outward rectifying currents at very depolarizing potentials due to the presence of calcium-activated chloride channels expressed endogenously in the oocytes; these channels will open and rectify chloride intracellularly, adding to the outward rectifying traces during the test pulse. The authors calculate their G-V curves from the test pulse steady-state current instead of using the tail currents. The conductance measurements are normally taken from the 'tail current' because tails are measured at a fixed voltage, hence maintaining the driving force constant.

      We respectfully disagree. In contrast to other channels, like HERG, a common practice for Kv10 is not to use tail currents. It is long known that in this channel, tail currents and test-pulse steady-state currents can appear to be at odds because the channels deactivate extremely rapidly, at the border of temporal resolution of the measurements and with intricate waveforms. This complicates the estimation of the instantaneous tail current. Therefore, the outward current is commonly used to estimate conductance (Terlau et al., 1996; Schönherr et al., 1999; Schönherr et al., 2002; Whicher and MacKinnon, 2019), while the latter authors also use the extreme of the tail for some mutants.

      Due to their activation at very negative voltage, the reversal potential in our mutants can be measured directly; we are, therefore, more confident with this approach. Nevertheless, we have determined the initial tail current in some experiments. The behavior of these is very similar to the average that we present in Figure 1. The biphasic behavior is unequivocally present.

      Author response image 2.
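
      The two approaches under discussion differ in how the driving force is handled. When conductance is estimated from the steady-state current during the test pulse, each current must be divided by its own driving force, which requires the reversal potential; tail currents are instead all read at one fixed voltage. The sketch below illustrates the test-pulse calculation only. The function names, the reversal potential, and the current values are hypothetical and are not taken from the manuscript.

```python
def chord_conductance(current_ua, voltage_mv, reversal_mv):
    """Chord conductance G = I / (V - E_rev).

    With current in microamperes and voltages in millivolts the result
    is in millisiemens. The value is undefined at the reversal potential.
    """
    driving_force = voltage_mv - reversal_mv
    if driving_force == 0:
        raise ValueError("driving force is zero at the reversal potential")
    return current_ua / driving_force


def normalized_gv(currents_ua, voltages_mv, reversal_mv):
    """Conductance at each test voltage, normalized to the largest value."""
    conductances = [
        chord_conductance(i, v, reversal_mv)
        for i, v in zip(currents_ua, voltages_mv)
    ]
    g_max = max(conductances)
    return [g / g_max for g in conductances]


# Hypothetical steady-state currents (µA) at three test voltages (mV),
# with an assumed reversal potential of -90 mV.
example = normalized_gv([1.0, 4.0, 6.0], [-40.0, 10.0, 60.0], -90.0)
```

      Because every point is normalized to the largest conductance in the series, any current component that keeps growing at strong depolarizations will also rescale the rest of the curve, which is the normalization effect raised in the review.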

      Calculating the conductance from the traces should not be a problem, however, in the present manuscript, the traces and the tail currents do not agree. 

      The referee’s observation is perfectly in line with the long-standing experience of several labs working with KV10: tail current amplitudes in KV10 appear to be out of proportion for the WT open state (O2). Importantly, this is due to the rapid closure, which is not present in O1. As a consequence, the initial amplitudes of tail currents from O1 are easier to estimate correctly, and they are much more obvious in the graphs. Taken together, these differences between O1 and O2 explain the misconception the reviewer describes next.

      The tail traces shown in Fig. 1E do not show an increasing current amplitude in the voltage range from +50 mV to +120 mV; they seem to have reached a 'saturation state', suggesting that the traces from the test pulse contain an inward chloride current contamination.

      As stated in the text and indicated in Author response image 3, the tail currents in Figure 1E increase in amplitude between +50 and +120 mV, as can be seen in the examples below from different experiments (+50 is presented in black, +120 in red). As stated above, the increase is not as evident as in traces from other mutants because the predominance of O2 also implies a much faster deactivation.

      Author response image 3. 

      We are aware that Ca2+-activated Cl- currents can represent a problem when interpreting electrophysiological data in oocytes. In fact, we show in Supplement 1 to Figure 8 that this can be the case during the Ca2+-CaM experiments, where the increase in Ca2+ would certainly augment Cl- contribution to the outward current. This is why we performed these experiments in Cl--free solutions. As we show in Figure 8, the biphasic behavior was also present in those experiments. 

      Importantly, Cl--free bath solutions would not correct contamination during the tail, since this would correspond to Cl- exiting the oocyte. Yet, if the outward currents were contaminated by Cl-, one would expect the contamination to increase with larger depolarizations, as the typical Ca2+-activated Cl- current in oocytes does. As the reviewer states, this does not seem to be the case.

      In addition, this second component identified by the authors as a second open state appears after +50 mV and seems to never saturate. The normalization to the maximum current level during the test pulse exaggerates this second component in the calculated G-V curve.

      We agree that this second component continues to increase; the reviewer brought this up in the first review, and we have already addressed this in our reply and in the discussion of the revised version: “This flicker block might also offer an explanation for a feature of the mutant channels, that is not explained in the current model version: the continued increase in current amplitude, hundreds of milliseconds into a strong depolarization (Supp. 4 to Fig. 9). If the relative stability of O2 and C2 continued to change throughout depolarization, such a current creep-up could be reproduced. However, this would require either the introduction of further layers of On ↔Cn states, or a non-Markovian modification of the model’s time evolution.” With non-Markovian, we mean a Langevin-type diffusive process. 

      It is worth noting that the ΔPASCap mutant experiments in Fig. 5 in Mes-based solutions do not show that second component in the G-V.

      For the readers of this conversation, we would like to clarify that the reviewer likely refers to experiments shown in Fig. 5 of the initial submission but shown in Fig. 6 of the revised version (“Hyperpolarization promotes access to a large conductance, slowly activating open state.” Fig. 5 deals with single channels). We agree that these data look different, but this is because the voltage protocols are completely different (compare Fig. 6A [fixed test pulse, varied pre-pulse] and Fig. 2A [varied test pulse, fixed pre-pulse]). Therefore, no biphasic behavior is expected.

      Because these results are the foundation for their two-open-state hypothesis, I strongly suggest that the authors repeat all their chloride-based experiments in Mes-based solutions to eliminate the undesired chloride contribution to the mutants' currents and clarify the contribution of the mutations to Kv10.1 gating.

      In summary, we respectfully disagree with all concerns raised in point (1). Our detailed arguments rebutting them are given above, but there is a more high-level concern about this entire exchange: the referee casts doubt on observations that are not new. Several labs have reported for a group of mutant KCNH channels: non-monotonic voltage dependence of activation (see, e.g., Fig. 6D in Zhao et al., 2017), multi-phasic tail currents (see, e.g., Fig. 4A in Whicher and MacKinnon, 2019, in CHO cells where Cl- contamination is not a concern), and activation by high [Ca2+]i (Lörinczi et al., 2016). Our study replicates those observations and hypothesizes that the existence of an additional conducting state can alone explain all previously unexplained observations. We highlight the potency of this hypothesis with a Markov model that qualitatively reproduces all phenomena. We not only factually disagree with the individual points raised, but we also think that they don't touch on the core of our contribution.

      (2) Two step gating mechanism.

      The authors interpret the results obtained with the ΔPASCap and the E600R as a two-step gating mechanism containing two open states (O1 and O2) and assign them to the voltage sensor movement and gating ring rotation, respectively. It is not clear, however, how the authors assign the two open states.

      The results show that the first component is conserved amongst mutations, whereas the second one is not. The authors attribute the second component, and hence the second open state, to the movement of the gating ring. This scenario seems unlikely since there is a clear voltage dependence of the second component that would suggest the involvement of a voltage-sensing current.

      We do not suggest that the gating ring motion is not voltage dependent. We would like to point out that voltage dependence can be conveyed by voltage sensor coupling to the ring; this is the widely accepted theory of how the ring can be involved. Should the reviewer mean it in a narrow sense, namely that the model should be constructed such that all voltage-dependent steps occur before and independently of ring reconfiguration, and that only then an additional step follows that solely reflects the (voltage-independent) reconfiguration, we would like to point the reviewer to the article, where we write: “the κ/λ transition could reasonably be expected to be voltage independent because we related it to ring reconfiguration, a process that should occur as a consequence of a prior VSD transition. We have made some attempts to treat this transition as voltage independent but state-specific with upper-layer bias for states on the right and lower-layer bias for states on the left. This is in principle possible, as can already be gleaned from the similar voltage ranges of the left-right transition (α/β) and the κL/λ transition. However, this approach leads to a much larger number of free, less well constrained kinetic parameters and drastically complicated the parameter search.” As you can see, we also formulated a strategy to free the model from the potentially spurious voltage dependence and (in bold here) explained why we did not follow this route in this study.

      The split channel experiment is interesting but needs more explanation. I assume the authors expressed the 2 parts of the split channel (1-341 and 342-end); however, Tomczak et al. showed in 2017 that the split presents a constitutively activated function with inward currents that are not visible here. This point needs clarification.

      As stated in the panel heading, the figure legend, and the main text, we did not use 1-341 and 342-end as done in Tomczak et al. Instead, “we compared the behavior of ∆2-10 and ∆2-10.L341Split”. Evidently, the additional deletion (2-10) causes a shift in activation that explains the difference you point out. However, as we do not compare L341Split and ∆2-10.L341Split but ∆2-10 and ∆2-10.L341Split, our conclusion remains that “As predicted, compared to ∆2-10, ∆2-10.L341Split showed a significant reduction in the first component of the biphasic GV (Fig. 2C, D).” Remarkably, the behavior of the ∆3-9 L341Split described in Whicher and MacKinnon, 2019 (Figure 5) matches that of our ∆2-10 L341Split, which we think reinforces our case.

      Moreover, the authors assume that the mutations introduced uncover a new open state; however, the traces presented for the mutations suggest that other explanations are possible. Other gating mechanisms, like inactivation from the closed state, can be introduced by the mutations. The traces presented for ΔPASCap, but especially E600R, present clear 'hooked tails', a direct indicator of a population of inactive channels during the test pulse that recover from inactivation upon repolarization (Tristani-Firouzi M, Sanguinetti MC. J Physiol. 1998).

      There is a possibility that we are debating nomenclature here. In response to the suggestion that all our observations could be explained by inactivation, we attempted a disambiguation of terms in the reply and the article. As the argument is brought up again without reference to our clarification attempts, we will try to be more explicit here:

      If, starting from deeply deactivated states, an open state is reached first, and then, following further activation steps, closed states are reached, this might be termed “inactivation”. In such a reading, our model features many inactivated states. The shortest version of such a model is C-O-I. It is for instance used by Raman and Bean (2001; DOI: 10.1016/S0006-3495(01)76052-3) to explain NaV gating in Purkinje neurons. If “inactivation” is meant in the sense that a gating transition exists, which is orthogonal to an activation/deactivation axis, and that after this orthogonal transition, an open state cannot be reached anymore, then all of the upper floor in our model is inactivated with respect to the open state O1. Finally, the state C2 is an inactivated state to O2. In this view, “inactivation” explains the observed phenomena. 

      However, we must disagree if the referee means that a parsimonious explanation exists in which a single conducting state is the only source for all observed currents.   

      There is a high-level reason: we found a single assumption that explains three different phenomena, while the inactivation hypothesis with one conducting state cannot explain one of them (the increase of the first component under raised CaM). But there is also a low-level reason: the tails in Tristani-Firouzi and Sanguinetti 1998 are fundamentally different from what we report herein in that they lack a third component. Thus, those tails are consistent with recovery from inactivation through a single open state, while a three-component tail is not. In the framework of a Markov model, the time constants of transitions from and to a given state (say O2), cannot change unless the voltage changes. During the tail current, the voltage does not change, yet we observe: 

      i) a rapid decrease with a time constant of at most a few milliseconds (Fig 9 S2, 1-> 2),  ii) a slow increase in current, peaking after approximately 25 milliseconds and iii) a relaxation to zero current with a time constant of >50 ms. 

      According to the reviewer’s suggestion, these processes on three timescales should all be explained by depopulating and repopulating the same open state while all rates are constant. There might well be a complicated multi-level state diagram with a single open state in different variants (such as open and open-inactivated) that could produce triphasic tails with these properties if the system had not reached a steady-state distribution at the end of the test pulse. It cannot, however, achieve it from an equilibrated system, and certainly, it cannot at the same time produce “biphasic activation” and “activation by CaM”. 
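
      The timescale argument can be illustrated with a minimal sketch. It assumes the simplest C-O-I scheme mentioned above, with hypothetical rate constants (in 1/ms) held fixed at the tail potential; none of the numbers are fitted to data from the article. At a fixed voltage, the relaxation of such a scheme is a sum of exponentials whose time constants are set by the eigenvalues of the rate matrix alone, while the starting occupancy changes only the amplitudes. A three-state scheme therefore offers at most two relaxation time constants, so a tail with three distinct kinetic components needs a larger state diagram.

```python
import numpy as np

def rate_matrix(k_co, k_oc, k_oi, k_io):
    """Generator for C <-> O <-> I; Q[i, j] is the rate from state i to state j (1/ms)."""
    return np.array([
        [-k_co,            k_co,   0.0],
        [ k_oc, -(k_oc + k_oi),   k_oi],
        [  0.0,            k_io, -k_io],
    ])

def time_constants_ms(Q):
    """Relaxation time constants: -1/lambda for every nonzero eigenvalue of Q."""
    lam = np.real(np.linalg.eigvals(Q))
    return np.sort(-1.0 / lam[np.abs(lam) > 1e-9])

def occupancy(Q, p0, t_ms):
    """State occupancies at time t_ms, starting from p0, with all rates held constant."""
    lam, vecs = np.linalg.eig(Q.T)           # dp/dt = Q^T p
    coeff = np.linalg.solve(vecs, p0)        # expand p0 in the eigenvector basis
    return np.real(vecs @ (coeff * np.exp(lam * t_ms)))

# Hypothetical rates at the tail potential (illustration only).
Q = rate_matrix(k_co=0.01, k_oc=1.0, k_oi=0.02, k_io=0.05)
taus = time_constants_ms(Q)                  # two time constants, whatever p0 is
p0 = np.array([0.0, 0.1, 0.9])               # assumed occupancy (C, O, I) at the end of the test pulse
p_open_tail = [occupancy(Q, p0, t)[1] for t in (0.0, 1.0, 25.0, 100.0)]
```

      Changing `p0` rescales and reshapes `p_open_tail` but leaves `taus` untouched, which is the sense in which the time constants cannot change while the voltage is constant.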

      The results presented by the authors can be alternatively explained with a change in the equilibrium between the close to inactivated/recovery from inactivation to the open state. 

      Again, we disagree. The model construction explains in detail that the transition from the first to the second phase is not gradual. Shifting equilibria cannot reproduce this. We have extensively tested that idea and can exclude this possibility.

      Finally, the authors state that they do not detect "cumulative inactivation after repeated depolarization", but that considers inactivation only from the open state and ignores the possibility of closed-state inactivation or that, as in hERG, the channel inactivates faster than it activates (Smith PL, Yellen G. J Gen Physiol. 2002). 

      We respectfully disagree. We explicitly model an open state that inactivates faster (O2->C2) than it activates. Once more, this is stated in the revised article, which we point to for details. Again, this alternative mechanism does not have the potential to explain all three effects. As discussed above about the chloride contamination concerns, this inactivation hypothesis was mentioned in the first review round and, therefore, addressed in our reply and the revised article. We also explained that “inactivation” has no specific meaning in Markov models. In the absence of O1, all transitions towards the lower layer are effectively “inactivation from closed states”, because they make access to the only remaining open state less likely”. But this is semantics. What is relevant is that no network of states around a single open state can reproduce the three effects in a more parsimonious way than the assumption of the second open state does.

      (3) Single channel conductance.

      The single-channel experiments are a great way to assess the different conductance of single-channel openings; unfortunately, the authors cannot accurately measure different conductances for the two proposed open states. The Markov model built by the authors disagrees with their interpretation of the experimental results, assigning the exact same conductance to the two modeled open states. To interpret the mutant data, it is necessary to add WT data for comparison, also in the presence of specific blockers. 

      We respectfully disagree. As previously shown, the conductance of the flickering wild-type open state is very difficult to resolve. Our recordings do not show that the two states have different single-channel conductances, and therefore the model assumes identical single-channel conductance. 

      The important point is that the single-channel recordings clearly show two different gating modes associated with the voltage ranges in which we predict the two open states. One has a smaller macroscopic current due to rapid flickering (aka “inactivation”). These recordings are another proof of the existence of two open states because the two gating modes occur.  Wild-type data can be found in Bauer and Schwarz, (2001, doi:10.1007/s00232-001-0031-3) or Pardo et al., (1998, doi:10.1083/jcb.143.3.767) for comparison.

      We appreciate the effort editors and reviewers invested in assessing the revised manuscript. Yet, we think that the demanded revision of experimental conditions and quantification methods contradicts the commonly accepted practice for KV10 channels. Some of the reviewer comments are skeptical about the biphasic behavior, which is an established and replicated finding for many mutants and by many researchers. The alternative explanations for these disbelieved findings are either “semantics” or cannot quantitatively explain the measurements. Therefore, only the demand for more explanations and unprecedented resolution in single-channel recordings remains. We share these sentiments.

      ———— The following is the authors’ response to the original reviews.

      (1) The authors must show that the second open state is not just an artifact of endogenous activity but represents the activity of the same EAG channels. I suggest that the authors repeat these experiments in Mes-based solutions. 

      (2) Along the same lines, it is necessary to show that these currents can be blocked using known EAG channel blockers such as astemizole. Ultimately, it will be important to demonstrate using single-channel analysis that these do represent two distinct open states separated by a closed state. 

      We have addressed these concerns using several approaches. The most substantial change is the addition of single-channel recordings on ΔPASCap. In those experiments, we could provide evidence of the two types of events in the same patch, and the presence of an outward current at -60 mV, 50 mV below the equilibrium potential for chloride. The channels were never detected in uninjected oocytes, and Astemizole silenced the activity in patches containing multiple channels. These observations, together with the maintenance of the biphasic behavior that we interpret as evidence of the presence of O1 in methanesulfonate-based solutions, strongly suggest that both O1 and O2 obey the expression of KV10.1 mutants.

      (3) Currents should be measured by increasing the pulse lengths as needed in order to obtain the true steady-state G-V curves. 

      We agree that the endpoint of activation is ill-defined in the cases where a steady-state is not reached. This does indeed hamper quantitative statements about the relative amplitude of the two components. However, while the overall shape does change, its position (voltage dependence) would not be affected by this shortcoming. The data, therefore, supports the claim of the “existence of mutant-specific O1 and its equal voltage dependence across mutants.”

      (4) A more clear and thorough description should be provided for how the observations with the mutant channels apply to the behavior of WT channels. How exactly does state O1 relate to WT behavior, and how exactly do the parameters of the mathematical model differ between WT and mutants? How can this be interpreted at a structural level? What could be the structural mechanism through which ΔPASCap and E600R enable conduction through O1? It seems contradictory that O1 would be associated exclusively with voltage-sensor activation and not gating ring transitions, and yet the mutations that enable cation access through O1 localize at the gating ring - this needs to be better clarified. 

      We have undertaken a thorough rewriting of all sections to clarify the structural correlates that may explain the behavior of the mutants. In brief, we propose that when all four voltage sensors move towards the extracellular side, the intracellular ring maintains the permeation path closed until it rotates. If the ring is altered, this “lock” is incompetent, and permeation can be detected (page 34). By fixing the position of the ring, calmodulin would preclude permeation in the WT and promote the population of O1 in the mutants.

      (5) Rather than the t80% risetime, exponential fits should be performed to assess the kinetics of activation. 

      We agree that the assessment of kinetics by a t80% is not ideal. We originally refrained from exponential fits because they introduce other issues when used for processes that are not truly exponential (as is the case here). We had planned to perform exponential fits in this revised version, but because the activation process is not exponential, the time constants we could provide would not be accurate, and the result would remain qualitative as it is now. In the experiments where we did perform the fits (Fig. 3), the values obtained support the statement made. 

      (6) It is argued based on the G-V relations in Figure 2A that none of the mutations or deletions introduced have a major effect on state O1 properties, but rather affect state O2. However, the occupancy of state O2 is undetermined because activation curves do not reach saturation. It would be interesting to explore the fitting parameters on Fig.2B further to test whether the data on Fig 2A can indeed only be described by fits in which the parameters for O1 remain unchanged between constructs. 

      We agree that the absolute occupancy of O2 cannot be properly determined if a steady state is not reached. This is, however, a feature of the channel. During very long depolarizations in WT, the current visually appears to reach a plateau, but a closer look reveals that the current keeps increasing after very long depolarizations (up to 10 seconds; see, e.g., Fig. 1B in Garg et al., 2013, Mol Pharmacol 83, 805-813. DOI: 10.1124/mol.112.084384). Interestingly, although the model presented here does not account for this behavior, we propose changes in the model that could. “If the relative stability of O2 and C2 continued to change throughout the depolarization such a current creep-up could be reproduced. However, this would require either the introduction of further layers of On↔Cn states or a non-Markovian modification of the model’s evolution.” Page 34.

      (7) The authors interpret the results obtained with the mutants DPASCap and E600R (tested before by Lorinczi et al. 2016 to disrupt the interactions between the PASCap and cNBHD domains) as a two-step gating mechanism with two open states. All the results obtained with the E600R mutant and DPASCap could also be explained by inactivation/recovery from inactivation behavior and a change in the equilibrium between the closed/inactivated states and open states. Moreover, the small tails between +90 to +120 mV suggest channels accumulate in an inactive state (Fig 1E). It is not convincing that the two-open-state model is the mechanism underlying the mutant's behavior.  

      We respectfully disagree with the notion that a single open state can provide a plausible explanation for "All the results obtained with the E600R mutant and DPASCap". We think that our new single channel results settle the question, but even without this direct evidence, a quantitative assessment of the triphasic tail currents all but excludes the possibility of a single open state. We agree that it is, in principle, possible to obtain some form of a multiphasic tail with a single open state using the scheme suggested in this comment: at the end of the test pulse, a large fraction of the channels must be accumulated in inactive states, and a few are in the open state. The hyperpolarization to -100mV then induces a rapid depopulation of the open state, followed by slower replenishments from the inactive state. Exactly this process occurs in our model, when C2 empties through O2 (Supp. 5 to Fig 9, E600R model variant). However, this alone is highly unlikely to quantitatively explain the measured tail currents, because of the drastically different time scales of the initial current decay (submillisecond to at most a few milliseconds lifetime) and the much slower transient increase in current (several tens of milliseconds) and the final decay with time constants of >100 ms (see for instance data in Fig. 1 E for E600R +50 to +120mV test pulse). To sustain the substantial magnitude of slowly decaying current by slow replenishment of an open state with a lifetime of 1 ms requires vast amounts of inactivated channels. A rough estimation based on the current integral of the initial decay and the current integral of the slowly decaying current suggests that at the end of the test pulse, the ratio inactivated/open channels would have to be 500 to 1500 for this mechanism to quantitatively explain the observed tail currents. 
      To put this in perspective: This would suggest that without inactivation all the expressed channels in an oocyte would provide 6 mA current during the +100 mV test pulse. While theoretically possible, we consider this a less likely explanation than a second open state.
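
      The order-of-magnitude reasoning above can be sketched as follows. The sketch assumes that the fast and slow tail components are single exponentials and that every channel contributes, on average, one open dwell to the tail before closing, so that the ratio of the charge carried by the two components approximates the ratio of inactivated to open channels at the end of the test pulse. The amplitudes and time constants are hypothetical placeholders, not values measured in the article.

```python
def exponential_charge(amplitude, tau_ms):
    """Integral of amplitude * exp(-t / tau_ms) over t from 0 to infinity."""
    return amplitude * tau_ms

def inactivated_to_open_ratio(a_fast, tau_fast_ms, a_slow, tau_slow_ms):
    """Charge in the slow component divided by charge in the fast component.

    Under the one-open-dwell-per-channel assumption, this approximates
    N_inactivated / N_open at the end of the test pulse.
    """
    return exponential_charge(a_slow, tau_slow_ms) / exponential_charge(a_fast, tau_fast_ms)

# Placeholder numbers in arbitrary current units (illustration only).
example_ratio = inactivated_to_open_ratio(
    a_fast=1.0, tau_fast_ms=1.0, a_slow=5.0, tau_slow_ms=150.0
)
```

      The ratio grows linearly with the separation of the two time constants, which is why a millisecond-scale open lifetime combined with a decay lasting more than 100 ms demands such a large reservoir of inactivated channels.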

      (8) Different models should be evaluated to establish whether the results in Figure 4 can also be explained by a model in which states O1 and O2 have the same conductance. It would be desirable if the conductance of both states were experimentally determined - noise analysis could be applied to estimate the conductance of both states. 

      In the modified model, O1 and O2 have the same single-channel conductance. The small conductance combined with the fast flickering did not allow an accurate determination, but we can state that there is no evidence that the single-channel conductance of the states is different.

      (9) Although not included, it looks like the model predicts some "conventional inactivation" This can be appreciated in Fig 8, and in the traces at -60mV. Interestingly, the traces obtained in the absence of Cl- also undergo slow inactivation, or 'conventional inactivation' as referred to by the authors. Please revise the following statement "Conventional inactivation was never detected in any mutants after repeated or prolonged depolarization. In the absence of inactivation, the pre-pulse dependent current increase at +40 mV could be related to changes in the relative occupancy of the open states". 

      We have carefully edited the manuscript to address this concern. The use of the term inactivation admittedly represents a challenge. We agree that the state that results from the flickering block (C2) could be defined as “inactivated” because it is preceded by an open state. Yet, in that case, the intermediate states that the channel travels between O1 and O2 would also be sensu stricto “inactivated”, but only in the mutants. We have made this clear on page 17.

      Recommendations for improving the writing and presentation.

      (1) Methods section: Please state the reversal potential calculated for the solution used. It looks like the authors used an Instantaneous I-V curve method to calculate the reversal potential; if that's correct, please show the I-V and the traces together with the protocol used. 

      We have provided the calculated reversal potentials for excised patches. We cannot predict the reversal potential in whole oocytes because we have no control over the intracellular solution. The reversal potential was determined in the mutants through the current at the end of the stimulus because the mutants produced measurable inward currents. The differences in reversal potential were not significant among mutants.

      Pulse protocols have been added to the figures.

      (2) Figure 1 suggestion: Combine the two panels in panel D and move the F panel up so the figure gets aligned in the lower end.

      Thank you, this has been done.

      (3) Please clarify the rationale for using the E600R-specific mutant. I assume it is based on the Lorinczi et al. 2016 effect and how this is similar to the DPASCap phenotype, or is it due to the impact of this mutation in the interactions between the N-term and the cNBHD? 

      We have explained the rationale for the use of E600R explicitly on page 6.

      (4) Fig S1A is not present in the current version of the manuscript. Include a cartoon as well as a structural figure clearly depicting the perturbations introduced by E600R, ΔPASCap, and the other deletions that are tested. Additional structural information supporting the discussion would also be helpful to establish clearer mechanistic links between the experimental observations described here and the observed conformational changes between states in Kv10 channel structures. 

      We have corrected this omission, thank you for pointing it out.

      (5) It would be informative to see the traces corresponding to the I-V shown in Fig 7 A and B at the same indicated time points (0, 60, 150, and 300s). Did the authors monitor the Ca2+ signal rise after the I&T treatment to see if it coincides with the peak in the 60s? 

      In Figure 7 (now Figure 8) we used voltage ramps instead of discrete I-V protocols because of the long time required for recording the latter. This is stated on page 19. Ca2+ was monitored through Cl- current after ionomycin/thapsigargin. The duration of the Ca2+ increase was reproducible among oocytes and in good agreement with the changes observed in the biphasic behavior of the mutants (Supplement 1 to Figure 8).

      (6) Fig 4. Please state in the legend what the different color traces correspond to in E600R and DPASCap. Is there a reason to change the interpulse on DPASCap to -20mV and not allow this mutant to close? Please state. How do the authors decide the 10 ms interval for the experiments in Fig 2? 

      Thank you for pointing this out, we have added the description. We have explained why we use a different protocol for ΔPASCap and the reason for using 10 ms interval (we believe the referee means Figure 4) on page 12.  

      (7) Fig. 5. The pre-pulse is supposed to be 5 s, but the time scale doesn't correspond to a pre-pulse of 5 s before the test pulse to +40mV. Has the pre-pulse been trimmed for representation purposes? If so, please state. 

      The pre-pulse was 5s, but as the reviewer correctly supposed, the trace is trimmed to keep the +40 mV stimulus visible. This has now been clearly stated in the legend.

      (8) The mutant L322H is located within the S4 helix according to the Kv10.1 structure (PDB 5K7L), not in the 'S3-S4 linker'; please correct. 

      This has been done, thank you.

      The introduction of this mutant should also shift the voltage dependence toward more hyperpolarizing potentials (around 30mV, according to Schoenherr et al. 1999). It looks like that shift is present within the first component of the G-V. Still, since the max amplitude from the second component could be contaminated by endogenous Cl- currents, this effect is minimized. Repeating these experiments in the no Cl- solutions will help clarify this point and see the effect of the DPASCap and E600R in the background of a mutation that accelerates the transitions between the closed states (see Major comment 1). Did the authors record L322H alone for control purposes? 

      We have decided not to measure L322H alone or repeat the measurements in Cl--free solutions because we do not see a way to use the quantitative assessment of the voltage dependence of L322H and the L322H-variants of the eag domain mutants. Like in our answer to main point 3, we base our arguments not on the precise voltage dependence of the second component but on the shape of the G-V curves instead, specifically the consistent appearance of the first component and the local conductance minimum between the first and second components. After the introduction of L322H the first component is essentially absent.

      We think that the measurements of the L322H mutants cannot be interpreted as a hyperpolarizing shift in the first component. The peak of the first conductance component occurs around -20 mV in ΔPASCap and E600R (Fig. 7 C, D). After a -30mV shift, in L322H+DPASCap and L322H+E600R, this first peak would still be detected within the voltage range in our experiments, but it is not. A contamination of the second component would have little impact on this observation, which is why we refrain from the suggested measurements.  

      (9) The authors differentiate between an O1 vs. O2 state with different conductances, and maybe I missed it, but there's no quantitative distinction between the components; how are they different?

      Please see the response to the main comments 1 and 2. This has been addressed in single-channel recordings.

      (10) Please state the voltage protocols, holding voltages, and the solutions (K+ concentration and Cl-presence/absence) used for the experiments presented in the legends on the figures. Hence, it's easier to interpret the experiments presented. 

      Thank you, this has been done.

      (11) The authors state on page 7 that "with further depolarizations, the conductance initially declined to rise again in response to strong depolarizations. This finding matches the changes in amplitude of the tail currents, which, therefore, probably reflect a true change in conductance" However, the tails in the strong voltage range (+50 to +120 mV) for the E600R mutant argue against this result. Please review.

      The increase in the amplitude of the tail current is also present in E600R, but the relative increase is smaller. We have decided against rescaling these traces because the Figure is already rather complex. We indicated this fact with a smaller arrow and clarified it in the text (page 8).

      (12) The authors mention that the threshold of activation for the WT is around -20mV; however, the foot of the G-V is more around -30 or -40mV. Please revise. 

      Thank you. We have done this. 

      (13) The authors state on page 9 that the 'second component occurs at progressively more depolarized potentials for increasingly larger N-terminal deletions" However E600R mutant that conserves the N-terminal intact has a shift as pronounced as the DPASCap and larger than the D2-10. How do the authors interpret this result? 

      We have corrected this statement on page 10: “…the second component occurs at progressively more depolarized potentials for increasingly larger N-terminal deletions and when the structure of the ring is altered through disruption of the interaction between N- and C-termini (E600R)”.

      (14) The equation defined to fit the G-Vs, can also be used to describe the WT currents. If the O1 is conserved and present in the WT, this equation should also fit the WT data properly. The 1-W component shown could also be interpreted as an inactivating component that, in the WT, shifts the voltage-dependence of activation towards depolarizing potentials and is not visible. Still, the mutants do show it as if the transition from closed-inactivated states is controlled by interactions in the gating ring, and disturbing them does affect the transitions to the open state. 

      Out of the two open states in the mutant, O2 is the one that shares properties with the WT (e.g. it is inaccessible during Ca2+-CaM binding) while O1 is the open state with the voltage dependence that is conserved across the mutants. We, therefore, believe that this question is based on a mix-up of the two open states. We appreciate the core of the question: does the pattern in the mutants’ G-V curves find a continuation in the WT channel? 

      Firstly, the component that is conserved among mutants does not lead to current in the WT because the corresponding open state (O1) is not observed in WT. However, the gating event represented by this component should also occur in WT and –given its apparent insensitivity to eag domain mutations–  this gating step should occur in WT with the same voltage dependence as in all the mutants. This means that this first component sets a hard boundary for the most hyperpolarized G-V curve we can expect in the WT, based on our mutant measurements. Secondly, the second component shows a regular progression across mutants: The more intact the eag domain is, the more hyperpolarized the Vhalf values of transition term (1-W) and O2 activation. In Δ2-10, the transition term already almost coincides with O1 activation (estimated Vhalf values of -33.57 and -33.47 mV). A further shift of (1-W) in the WT is implausible because, if O1 activation is coupled to the earliest VSD displacement, the transition should not occur before O1 activation. Still, the second component might shift to more hyperpolarized values in the WT, depending on the impact of amino acids 2 to 10 on the second VSD transition.

      In summary, in WT the G-V should not be more hyperpolarized than the first component of the mutants, and the (1-W)-component probably corresponds to the Δ2-10 (1-W)-component. In WT the second component should be no more depolarized than the second component of Δ2-10. The WT G-V (Fig.1B) meets all these predictions derived from the pattern in the mutant G-Vs: When we use Eq. 4 to fit the WT G-V with A1=0 (O1 is not present in WT) and the parameters of the transition term (1-W) fixed to the values attained in Δ2-10, we obtain a fit for the O2 component with Vhalf=+21mV. This value nicely falls into the succession of Vhalf values for Δeag, ΔPASCap, and Δ2-10 (+103mV,+80mV,+52mV) and, at the same time, it is not more hyperpolarized than the conserved first component (Vhalf -34mV). Our measurements therefore support that the O2 component in the mutants corresponds to the single open state in the WT. 
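
      The consistency check described above can be written down compactly. The sketch below uses only the Vhalf values quoted in this reply; the Boltzmann slope factor is a hypothetical placeholder, and the function is a generic Boltzmann term rather than the article's Eq. 4, which is not reproduced here.

```python
import math

# Vhalf (mV) of the O2 component, ordered from least to most intact eag domain
# (values as quoted in the reply above).
O2_VHALF_MV = {"Δeag": 103.0, "ΔPASCap": 80.0, "Δ2-10": 52.0, "WT": 21.0}
O1_VHALF_MV = -34.0  # conserved first component

def boltzmann(v_mv, vhalf_mv, slope_mv=15.0):
    """Generic Boltzmann activation term; slope_mv is an assumed placeholder."""
    return 1.0 / (1.0 + math.exp((vhalf_mv - v_mv) / slope_mv))

values = list(O2_VHALF_MV.values())

# Prediction 1: Vhalf becomes more hyperpolarized as the eag domain becomes more intact.
monotonic_progression = all(a > b for a, b in zip(values, values[1:]))

# Prediction 2: the WT component is not more hyperpolarized than the conserved first component.
wt_above_first_component = O2_VHALF_MV["WT"] > O1_VHALF_MV
```

      Both flags evaluate to true for the quoted values, which is the pattern the reply relies on; the check says nothing about slope factors or amplitudes, which would require the actual fits.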

      (15) Page 15, the authors state that 'The changes in amplitude and kinetics in response to rising intracellular Ca2+ support our hypothesis that Ca-CaM stabilized O1, possibly by driving the channels to deep closed states (Fig 5 and 6)' (pg 15). This statement seems contradictory; I can't quite follow the rationale since Ca2+ potentiates the current (Fig 7), and the addition of the L322H mutant in Fig 7 makes the shift of the first component to negative potentials visible.

      Please check the rationale for this section. 

      We have explained this more explicitly in the discussion (page 32). “Because access to O1 occurs from deep closed states, this could be explained by an increased occupancy of such deactivated states in response to CaM binding. This appears to be the case since CaM induces a biphasic behavior in the mutant channels that show reduced access to deep closed states; thus, L322H mutants behave like the parental variants in the presence of Ca2+-CaM. This implies a mechanistic explanation for the effect of Ca2+-CaM on WT since favoring entry into deep closed states would result in a decrease in current amplitude in the absence of (a permeable) O1”.

      Also, Figs 5 and 6 seem miscited here. 

      Thank you, we have corrected this.

      (16) For Figure 5, it would be helpful if each of the current traces corresponding to a particular voltage had a different color. That way, it will be easier to see how the initial holding voltage modulates current. 

      We have considered this suggestion, and we agree that it would make it easier to follow. Yet, since we have identified the mutants with different colors, it would be inconsistent if we used another color palette for this Figure. Supplement 3 to Figure 9 shows the differences in a clearer way.

      (17) Add zero-current levels to all current traces.

      We have done this.

      (18) The mathematical model should be described better. Particularly, the states from which O1 can be accessed should be described more clearly, as well as whether the model considers any direct connectivity between states O1 and O2. The origin of the voltage-dependence for transitions that do not involve voltage-sensor movements should be discussed. Also, the separation of kappa into kappa-l and kappa-r should be described. 

      We have extensively rewritten the description of the mathematical model to address these concerns.

      (19) Page 4, "reveals a pre-open state in which the transmembrane regions of the channel are compatible with ion permeation, but is still a nonconducting state". Also, page 27, "renders a hydrophobic constriction wider than 8 Å, enough to allow K+ flow, but still corresponds to a non-conducting state". These sentences are confusing - how can the regions be compatible with ion permeation, and still not be conducting? Is cation conductance precluded by a change in the filter, or elsewhere? How is it established that it represents a non-conducting state? 

      We have rephrased to clarify this apparent inconsistency. Page 4: “(…) in which the transmembrane regions of the channel are compatible with ion permeation (the permeation path is dilated, like in open states) but the intracellular gate is still in the same conformation as in closed states (Zhang et al., 2023).” Page 31: “The presence of an intact intracellular ring would preclude ionic flow in the WT, and its alteration would explain the permeability of this state in the mutants.”

    2. eLife assessment

      This valuable study examines the role of the interaction between cytoplasmic N- and C-terminal domains in voltage-dependent gating of Kv10.1 channels. The authors claim to have identified a hidden open state in Kv10.1 mutant channels, thus providing a window for observing early conformational transitions associated with channel gating. The evidence supporting the major conclusions is incomplete, however, and additional work is required to determine the molecular mechanism underlying the observations in this study. With the experimental conditions clarified and the mechanistic interpretations addressed, this work could be significant in understanding the gating mechanisms of the KCNH family and will appeal to biophysicists interested in ion channels and physiologists interested in cancer biology.

    3. Reviewer #1 (Public Review):

      Gating of Kv10 channels is unique because it involves coupling between non-domain swapped voltage sensing domains, a domain-swapped cytoplasmic ring assembly formed by the N- and C-termini, and the pore domain. Recent structural data suggests that activation of the voltage sensing domain relieves a steric hindrance to pore opening, but the contribution of the cytoplasmic domain to gating is still not well understood. This aspect is of particular importance because proteins like calmodulin interact with the cytoplasmic domain to regulate channel activity. The effects of calmodulin (CaM) in WT and mutant channels with disrupted cytoplasmic gating ring assemblies are contradictory, resulting in inhibition or activation, respectively. The underlying mechanism for these discrepancies is not understood. In the present manuscript, Reham Abdelaziz and collaborators use electrophysiology, biochemistry and mathematical modeling to describe how mutations and deletions that disrupt inter-subunit interactions at the cytoplasmic gating ring assembly affect Kv10.1 channel gating and modulation by CaM. In the revised manuscript, additional information is provided to allow readers to identify within the Kv10.1 channel structure the location of E600R, one of the key channel mutants analyzed in this study. However, the mechanistic role of the cytoplasmic domains that this study focuses on, as well as the location of the ΔPASCap deletion and other perturbations investigated in the study remain difficult to visualize without additional graphical information. This can make it challenging for readers to connect the findings presented in the study with a structural mechanism of channel function.

      The authors focused mainly on two structural perturbations that disrupt interactions within the cytoplasmic domain, the E600R mutant and the ΔPASCap deletion. By expressing mutants in oocytes and recording currents using two-electrode voltage clamp (TEVC), the authors find that both ΔPASCap and E600R mutants have biphasic conductance-voltage (G-V) relations and exhibit activation and deactivation kinetics with multiple voltage-dependent components. Importantly, the mutant-specific component in the G-V relations is observed at negative voltages where WT channels remain closed. The authors argue that the biphasic behavior in the G-V relations is unlikely to result from two different populations of channels in the oocytes, because they found that the relative amplitude between the two components in the G-V relations was highly reproducible across individual oocytes that otherwise tend to show high variability in expression levels. Instead, the G-V relations for all mutant channels could be well described by an equation that considers two open states O1 and O2, and a transition between them; O1 appeared to be unaffected by any of the structural manipulations tested (i.e. E600R, ΔPASCap, and other deletions) whereas the parameters for O2 and the transition between the two open states were different between constructs. The O1 state is not observed in WT channels and is hypothesized to be associated with voltage sensor activation. O2 represents the open state that is normally observed in WT channels and is speculated to be associated with conformational changes within the cytoplasmic gating ring that follow voltage sensor activation, which could explain why the mutations and deletions disrupting cytoplasmic interactions affect primarily O2.

      Severing the covalent link between the voltage sensor and pore reduced O1 occupancy in one of the deletion constructs. Although this observation is consistent with the hypothesis that voltage-sensor activation drives entry into O1, this result is not conclusive. Structural as well as functional data has established that the coupling of the voltage sensor and pore does not entirely rely on the S4-S5 covalent linker between the sensor and the pore, and thus the severed construct could still retain coupling through other mechanisms, which is consistent with the prominent voltage dependence that is observed. If both states O1 and O2 require voltage sensor activation, it is unclear why the severed construct would affect state O1 primarily, as suggested in the manuscript, as opposed to decreasing occupancy of both open states. In line with this argument, the presence of Mg2+ in the extracellular solution affected both O1 and O2. This finding suggests that entry into both O1 and O2 requires voltage-sensor activation because Mg2+ ions are known to stabilize the voltage sensor in its most deactivated conformations.

      Activation towards and closure from O1 is slow, whereas channels close rapidly from O2. A rapid alternating pulse protocol was used to take advantage of the difference in activation and deactivation kinetics between the two open components in the mutants and thus drive an increasing number of channels towards state O1. Currents activated by the alternating protocol reached larger amplitudes than those elicited by a long depolarization to the same voltage. This finding is interpreted as an indication that O1 has a larger macroscopic conductance than O2. In the revised manuscript, the authors performed single-channel recordings to determine why O1 and O2 have different macroscopic conductance. The results show that at voltages where the state O1 predominates, channels exhibited longer open times and overall higher open probability, whereas at more depolarized voltages where occupancy of O2 increases, channels exhibited more flickery gating behavior and decreased open probability. These results are informative but not conclusive because additional details about how experiments were conducted, and group data analysis are missing. Importantly, results showing inhibition of single ΔPASCap channels by a Kv10-specific inhibitor are mentioned but not shown or quantitated - these data are essential to establish that the new O1 conductance indeed represents Kv10 channel activity.

      It is shown that conditioning pulses to very negative voltages result in mutant channel currents that are larger and activate more slowly than those elicited at the same voltage but starting from less negative conditioning pulses. In voltage-activated curves, O1 occupancy is shown to be favored by increasingly negative conditioning voltages. This is interpreted as indicating that O1 is primarily accessed from deeply closed states in which voltage sensors are in their most deactivated position. Consistently, a mutation that destabilizes these deactivated states is shown to largely suppress the first component in voltage-activation curves for both ΔPASCap and E600R channels.

      The authors then address the role of the hidden O1 state in channel regulation by calmodulin. Stimulating calcium entry into oocytes with ionomycin and thapsigargin, assumed to enhance CaM-dependent modulation, resulted in preferential potentiation of the first component in ΔPASCap and E600R channels. This potentiation was attenuated by including an additional mutation that disfavors deeply closed states. Together, these results are interpreted as an indication that calcium-CaM preferentially stabilizes deeply closed states from which O1 can be readily accessed in mutant channels, thus favoring current activation. In WT channels lacking a conducting O1 state, CaM stabilizes deeply closed states and is therefore inhibitory. It is found that the potentiation of ΔPASCap and E600R by CaM is more strongly attenuated by mutations in the channel that are assumed to disrupt interaction with the C-terminal lobe of CaM than mutations assumed to affect interaction with the N-terminal lobe. These results are intriguing but difficult to interpret in mechanistic terms. The strong effect that calcium-CaM had on the occupancy of the O1 state in the mutants raises the possibility that O1 can be only observed in channels that are constitutively associated with CaM. To address this, a biochemical pull-down assay was carried out to establish that only a small fraction of channels are associated with CaM under baseline conditions. These CaM experiments are potentially very interesting and could have wide physiological relevance. However, the approach utilized to activate CaM is indirect and could result in additional non-specific effects on the oocytes that could affect the results.

      Finally, a mathematical model is proposed consisting of two layers involving two activation steps for the voltage sensor, and one conformational change in the cytoplasmic gating ring - completion of both sets of conformational changes is required to access state O2, but accessing state O1 only requires completion of the first voltage-sensor activation step in the four subunits. The model qualitatively reproduces most major findings on the mutants. Although the model used is highly symmetric and appears simple, the mathematical form used for the rate constants in the model adds a layer of complexity to the model that makes mechanistic interpretations difficult. In addition, many transitions that from a mechanistic standpoint should not depend on voltage were assigned a voltage dependence in the model. These limitations diminish the overall usefulness of the model which is prominently presented in the manuscript. The most important mechanistic assumptions in the model are not addressed experimentally, such as the proposition that entry into O1 depends on the opening of the transmembrane pore gate, whereas entry into O2 involves gating ring transitions - it is unclear why O2 would require further gating ring transitions to conduct ions given that the gating ring can already support permeation by O1 without any additional conformational changes.

    4. Reviewer #3 (Public Review):

      In the present manuscript, Abdelaziz and colleagues interrogate the gating mechanisms of Kv10.1, an important voltage-gated K+ channel in cell cycle and cancer physiology. At the molecular level, Kv10.1 is regulated by voltage and Ca-CaM. Structures solved using cryo-EM for Kv10.1 as well as other members of the KCNH family (Kv11 and Kv12) show channels that do not contain a structured S4-S5 linker, which imposes a non-domain-swapped architecture in the transmembrane region. However, the cytoplasmic N- and C-terminal domains interact in a domain-swapped manner, forming a gating ring. The N-terminal domain (PAS domain) of one subunit is located close to the intracellular side of the voltage sensor domain and interacts with the C-terminal domain (CNBHD domain) of the neighboring subunit. Mutations in the intracellular domains have a profound effect on channel gating. The complex network of interactions between the voltage sensor and the intracellular domains makes the PAS domain particularly interesting to study, as it may be responsible for the coupling between the voltage sensor domains and the intracellular gating ring.

      The coupling between the voltage-sensor domain and the gating ring is not fully understood, and the authors aim to shed light on the details of this mechanism. To do so, they use well-established techniques such as site-directed mutagenesis, electrophysiology, biochemistry and mathematical modeling. In the present work, the authors propose a two-open-state model that arises from functional experiments after introducing a deletion in the PAS domain (ΔPASCap) or a point mutation (E600R) in the CNBHD domain. The authors measure a biphasic G-V curve with these mutations and assign each phase to a different open state, one of which is not visible in the WT and is only unveiled after introducing the mutations. The hypothesis proposed by the authors could change the current paradigm for Kv10.1 gating, and it is quite extraordinary; therefore, it requires extraordinary evidence to support it.

      STRENGTHS: The authors use adequate techniques such as electrophysiology and site-directed mutagenesis to address the gating changes introduced by the molecular manipulations. They also use appropriate mathematical modeling to build a Markov model and identify the mechanism behind the gating changes.

      WEAKNESSES: The results presented by the authors do not fully support their conclusions, since they could have alternative explanations. The authors base their primary hypothesis on the biphasic behavior of a calculated G-V curve that does not match the tail behavior. The experimental conditions used in the present manuscript introduce uncertainties, weakening their conclusions and complicating the interpretation of the results. Therefore, their experimental conditions need to be revisited.

      I have some concerns related to the following points:

      (1) Biphasic gating behavior
      The authors use the TEVC technique in oocytes extracted surgically from Xenopus laevis frogs. The method is well established and is adequate to address ion channel behavior. The experiments are performed in chloride-based solutions, which present a handicap when measuring outwardly rectifying currents at very depolarized potentials, because calcium-activated chloride channels are expressed endogenously in the oocytes; these channels will open and conduct chloride into the cell, adding to the outwardly rectifying traces during the test pulse.
      The authors calculate their G-V curves from the test pulse steady-state current instead of using the tail currents. Conductance measurements are normally taken from the 'tail current' because tails are measured at a fixed voltage, which keeps the driving force constant. Calculating the conductance from the traces should not be a problem; however, in the present manuscript, the traces and the tail currents do not agree. The tail traces shown in Fig 1E do not show an increasing current amplitude in the voltage range from +50 mV to +120 mV; they seem to have reached a 'saturation state', suggesting that the traces from the test pulse are contaminated by a chloride current. In addition, the second component, identified by the authors as a second open state, appears after +50 mV and seems never to saturate. Normalizing to the maximum current level during the test pulse exaggerates this second component in the calculated G-V curve. It is worth noting that the ΔPASCap mutant experiments in Fig 5, performed in Mes-based solutions, do not show that second component in the G-V.
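
      The contrast between the two ways of building a G-V curve can be made concrete with a small synthetic sketch. Everything in it is an assumption chosen for illustration, not data or parameters from the manuscript: the reversal potential, the single Boltzmann conductance, and the shape of the contaminating outward current above +50 mV are all hypothetical. It only shows that a contaminating current distorts a G-V computed as I/(V - E_rev) from the test pulse, while a G-V read from tails at a fixed voltage is unaffected.

```python
import numpy as np

def boltzmann(v, v_half, slope):
    """Single-component Boltzmann activation curve (fraction of maximal conductance)."""
    return 1.0 / (1.0 + np.exp(-(v - v_half) / slope))

def gv_from_steady_state(i_ss, v_test, e_rev):
    """Chord conductance from test-pulse current, normalized to its own maximum."""
    g = i_ss / (v_test - e_rev)
    return g / g.max()

def gv_from_tails(i_tail):
    """Tail currents share one driving force, so they scale directly with conductance."""
    return i_tail / i_tail.max()

# Hypothetical recording conditions (assumptions, in mV).
e_rev = -90.0
v_tail = -40.0
v_test = np.arange(-80.0, 121.0, 10.0)

# Hypothetical "true" channel conductance: one Boltzmann component only.
g_true = boltzmann(v_test, v_half=0.0, slope=15.0)

i_channel = g_true * (v_test - e_rev)   # current during the test pulse
i_tail = g_true * (v_tail - e_rev)      # instantaneous tail current at v_tail

# Hypothetical contaminating outward current that grows above +50 mV.
i_contam = 0.002 * np.clip(v_test - 50.0, 0.0, None) ** 2

gv_clean = gv_from_steady_state(i_channel, v_test, e_rev)
gv_contaminated = gv_from_steady_state(i_channel + i_contam, v_test, e_rev)
gv_tail = gv_from_tails(i_tail)
```

      In this toy case the tail-derived curve saturates, whereas the contaminated steady-state curve keeps rising at the most positive voltages; normalizing to that inflated maximum then pushes the rest of the curve down, which is the kind of apparent second component the comment above warns about.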

      Because these results are the foundation for their two-open-state hypothesis, I strongly suggest that the authors repeat all their chloride-based experiments in Mes-based solutions to eliminate the undesired chloride contribution to the mutant currents and clarify the contribution of the mutations to Kv10.1 gating.

      (2) Two-step gating mechanism.
      The authors interpret the results obtained with ΔPASCap and E600R as a two-step gating mechanism containing two open states (O1 and O2), and assign them to the voltage sensor movement and gating ring rotation, respectively. It is not clear, however, how the authors assign the two open states.
      The results show that the first component is conserved among mutations, whereas the second one is not. The authors attribute the second component, and hence the second open state, to the movement of the gating ring. This scenario seems unlikely, since the second component has a clear voltage dependence, which would suggest the involvement of a voltage-sensing process.

      The split channel experiment is interesting but needs more explanation. I assume the authors expressed the two parts of the split channel (1-341 and 342-end). However, Tomczak et al. showed in 2017 that the split channel is constitutively active, with inward currents that are not visible here. This point needs clarification.

      Moreover, the authors assume that the mutations introduced uncover a new open state; however, the traces presented for the mutations suggest that other explanations are possible. Other gating mechanisms, such as inactivation from the closed state, could be introduced by the mutations. The traces presented for ΔPASCap, and especially for E600R, show clear 'hooked tails', a direct indicator of a population of inactivated channels during the test pulse that recover from inactivation upon repolarization (Tristani-Firouzi M, Sanguinetti MC. J Physiol. 1998). The results presented by the authors can alternatively be explained by a change in the equilibrium between the closed-to-inactivated transition and recovery from inactivation to the open state. Finally, the authors state that they do not detect "cumulative inactivation after repeated depolarization", but that considers inactivation only from the open state and ignores the possibility of closed-state inactivation or that, as in hERG, the channel inactivates faster than it activates (Smith PL, Yellen G. J Gen Physiol. 2002).

      (3) Single channel conductance.
      The single-channel experiments are a great way to assess the conductance of individual channel openings; unfortunately, the authors cannot accurately measure different conductances for the two proposed open states. The Markov model built by the authors disagrees with their interpretation of the experimental results, since it assigns exactly the same conductance to the two modeled open states. To interpret the mutant data, WT data are needed for comparison, as are recordings in the presence of specific blockers.

    1. eLife assessment

      This manuscript probes the ways in which a protein tag might influence the structure, dynamics and stability of a covalently attached substrate protein. Such findings are important to several fields, particularly in understanding how these influences control the abundance of proteins within a cell. The evidence provided to support the authors' conclusions is, however, incomplete, and further control experiments are necessary to fully support the proposed model.

    2. Reviewer #1 (Public Review):

      This manuscript by Negi et al. investigates the effects of different ubiquitin and ubiquitin-like modifications on the stability of substrate proteins, seeking to provide mechanistic insights into known effects of these modifications on cellular protein abundance. The authors focus on comparative studies of two modifications, ubiquitin and FAT10 (a protein with two ubiquitin-like domains), on a panel of substrate proteins; prior work had established that FAT10-conjugated proteins had lower stability to proteasomal degradation than Ub-modified counterparts.

      Strengths of the work include its integration of data across diverse approaches, including molecular dynamics simulations, solution NMR spectroscopy, and in vitro and cellular stability assays. From these, the authors provide provocative mechanistic insight into the lower stability of FAT10 on its own, and in FAT10-mediated destabilization of substrate proteins in computational and experimental findings. Notably, such destabilization impacts both the tag and tagged proteins, raising some provocative questions about mechanism. The data here are generally compelling, albeit with minor concerns on presentation in parts. Conclusions from this work will be interesting to scientists in several fields, particularly those interested in cellular proteostasis and in vitro protein design / long-range communication.

      The most substantial weakness of this work from my perspective is the specificity of these destabilization effects. In particular, the technical challenges of producing bona fide Ub- or FAT10-conjugated substrates with native linkages limit the ability to conduct in vitro studies on exactly the same molecules as those studied in cellular environments. Given some discussion in the manuscript about the importance of linkage location for the specificity of certain tag/substrate interactions, this raises an understandable but unfortunate caveat that needs to be considered more fully, both in general and in light of data from other fields (e.g. single-molecule pulling) showing site-dependence of comparable effects. I note that these concerns do not impact the caliber of the conclusions themselves, but perhaps suggest an area for caution as to their potential impact at this time.

    3. Reviewer #2 (Public Review):

      "Plasticity of the proteasome-targeting signal Fat10 enhances substrate degradation" is a nice study where the authors have shown the differences between two protein degradation tags namely, FAT10 and ubiquitin. Even though these tags are closely related in terms of folds, they have differential efficiency in degrading the substrates covalently attached to them. The authors have utilised extensive MD simulations combined with biophysics and cell biology to show the structural dynamics these tags provide for proteasomal degradation.

    1. eLife assessment

      This study presents a valuable finding on the precision conferred by dynamical interpretation of morphogen gradients. The evidence supporting the claims of the authors is convincing, with compelling theoretical analysis and solid yet incomplete experimental data. With the experimental part strengthened, the work could be of interest to the developmental biology and developmental systems biology communities.

    2. Reviewer #1 (Public Review):

      This work focuses on the trade-off between precision and robustness in morphogen gradients of Hedgehog signaling. It presents a framework for how Hedgehog signaling gives rise to precise responses and robust responses. This framework is based on the characteristics of the Hedgehog signaling pathway, and specifically on the characteristics of the dynamical and stationary gradients that it forms in the Drosophila wing disc. On the one hand, the manuscript takes into account known results showing that the Hedgehog stationary gradient is robust due to self-enhanced degradation (via activation of the Patched receptor). On the other hand, it uses the concept of dynamic interpretation of the gradient introduced by the leading author of this manuscript. According to this interpretation, different targets may respond to a single signaling threshold, and what differentiates the targets is whether they respond to the transient gradient, which extends over more cells, or to the stationary gradient. The framework presented in this manuscript takes this prior knowledge and builds on it. It proposes that the responses of different targets will not be equally robust. Specifically, a target that responds to the stationary gradient will have a robust response. Conversely, a target that responds to the gradient while it is being built will be less robust but more precise. This framework is analyzed using mathematical models. Finally, experimental data that partially corroborate this framework are presented, focusing on the col and Dpp targets, which, according to previous results, read the stationary and transient gradients, respectively. The col pattern is more robust to changes in Hh levels than the Dpp pattern. Furthermore, it is shown that this robustness decreases if the Patched receptor is not regulated. Hence, these experimental results confirm that the robustness is target-specific, as predicted by the models. The precision of the Dpp pattern is not tested experimentally.

    3. Reviewer #2 (Public Review):

      This paper presents a modeling analysis of a diffusing morphogen (hh) that patterns the wing disk by controlling the expression of dpp and col. Two modes of gene expression control/interpretation are analyzed and presented, one is a response using a steady state threshold (col), which could be robust (defined as a small spatial shift of the gene expression when hh dosage changes) by a ptch mediated negative feedback mechanism; the other is the "overshoot" where an earlier hh gradient profile pre-steady state is read at a threshold to activate the gene (dpp), which is less robust to dosage changes but has better boundary features. Experimental measurements of pattern widths of col and dpp were performed under different hh dosage to test the models. How these different modes were achieved by each gene was unclear.

      The reviewer found this study presents at best incremental advances to the field. It doesn't provide substantial progress conceptually or experimentally from Eldar et al., 2003, Adleman et al., 2022 and particularly Nahmad and Stathopoulos, 2009. The experimental data and interpretation appear to lack the rigor needed to challenge the model predictions.

      The authors pitched the difference between dpp and col in their response to hh dosage change as a tradeoff between robustness and precision. Specifically, robustness refers to positioning and precision refers to sharpness, which is somewhat arbitrary, as robustness could also refer to maintaining the sharpness of an expression boundary and precision could also refer to the position. This matters particularly for dpp, for which the developmental significance of stripe position and sharpness is not analyzed (for example, disc growth or pSmad; does a sharper but more mislocated dpp domain help the tissue?). The relationship between positioning and sharpness of a pattern in a morphogen system has been extensively discussed by many authors on a theoretical level. The authors' theoretical analysis is clear and simple but not new. Experimental evidence indicates that dpp and col are regulated very differently by hh, particularly in terms of timing of response (Nahmad and Stathopoulos, 2009). No comparison of the GRNs from hh to these two genes was made or experimentally tested. It is difficult to conclude that their behaviors in response to hh dosage change indeed arise from the hh gradient profile. It is also difficult to speculate whether either of these genes (particularly dpp) is facing a true biological tradeoff or tuning back and forth between positioning and sharpness during evolution.

      Methods 4.5: To measure widths of gene expression patterns, the authors used background subtraction, followed by normalization, and then thresholded the boundary at 0.2. Firstly, this approach oversimplifies the expression profile, which could be informative in model testing (e.g., sharpness of dpp?). Secondly, the sequence of analysis steps may introduce larger errors in images with lower signal-to-noise, where the subtraction narrows the pattern more than in images with higher signal-to-noise (e.g., the 18 degree vs 25 degree images, Fig. 6A); this would result in errors in the width measurements. Importantly, disk size and wing size controls are not reported.
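
      The sensitivity of such a pipeline to the background estimate can be sketched in a few lines. This is a minimal illustration, assuming a one-dimensional Gaussian intensity profile and a uniform background; the function name `pattern_width`, the profile shape, and all numerical values other than the 0.2 threshold are hypothetical and are not taken from the authors' methods.

```python
import numpy as np

def pattern_width(profile, background, dx=1.0, threshold=0.2):
    """Width of a 1-D expression profile after background subtraction,
    normalization to the maximum, and thresholding (hypothetical helper)."""
    corrected = np.clip(profile - background, 0.0, None)  # negative values set to zero
    normalized = corrected / corrected.max()
    return dx * np.count_nonzero(normalized >= threshold)

# Hypothetical profile: Gaussian stripe (sigma = 10 positions) on a flat background of 0.1.
x = np.arange(-50.0, 51.0, 1.0)
true_background = 0.1
profile = np.exp(-x**2 / (2 * 10.0**2)) + true_background

width_exact = pattern_width(profile, background=true_background)
# Over-estimating the background (as might happen in a noisier image) narrows the pattern.
width_oversubtracted = pattern_width(profile, background=true_background + 0.2)
```

      Under these assumptions the same underlying stripe yields a smaller measured width when the background is over-estimated, so differences in width between imaging conditions could partly reflect the processing rather than the biology.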

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment 

      fMRI was used to address an important aspect of human cognition - the capacity for structured representations and symbolic processing - in a cross-species comparison with non-human primates (macaques); the experimental design probed implicit symbolic processing through reversal of learned stimulus pairs. The authors present solid evidence in humans that helps elucidate the role of brain networks in symbolic processing, however the evidence from macaques was incomplete (e.g., sample size constraints, potential and hard-to-quantify differences in attention allocation, motivation, and lived experience between species).

      Thank you very much for your assessment. We would like to address the potential issues that you raise point-by-point below.

      We agree that for macaque monkey physiology, sample size is always a constraint, for both financial and ethical reasons. We addressed this concern by combining the results from two different labs, which allowed us to test 4 animals in total, twice as many as is common practice in the field of primate physiology. (We discuss this now on lines 473-478.)

      Interspecies differences in motivation, attention allocation, task strategies etc. could also be limiting factors. Note that we did address the potential lack of attention allocation directly in Experiment 2 using implicit reward association, which was successful as evidenced by the activation of attentional control areas in the prefrontal cortex. We cannot guarantee that the strategies that the two species deploy are identical, but we tentatively suggest that this might be a less important factor in the present study than in other interspecies comparisons that use explicit behavioral reports. In the current study, we directly measured surprise responses in the brain in the absence of any explicit instructions in either species, which allowed us to measure the spontaneous reversal of learned associations, a very basic element of symbolic representation. Our reasoning is that such spontaneous responses should be less dependent on attention allocation and task strategies. (We discuss this now in more detail on lines 478-485.)

      Finally, lived experience could be a major factor. Indeed, obvious differences include a lifetime of open-field experiences and education in our human adult subjects, which was not available to the monkey subjects, and includes a strong bias towards explicit learning of symbolic systems (e.g. words, letters, digits, etc). However, we have previously shown that 5-month-old human infants spontaneously generalize learning to the reversed pairs after a short learning in the lab using EEG (Kabdebon et al, PNAS, 2019). This indicates that also with very limited experience, humans spontaneously reverse learned associations. (We discuss this now in more detail on lines 478-485.) It could be very interesting to investigate whether spontaneous reversal could be present in infant macaque monkeys, as there might be a critical period for this effect. Although neurophysiology in awake infant monkeys is highly challenging, it would be very relevant for future work. (We discuss this in more detail on lines 493-498.)

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Kerkoerle and colleagues present a very interesting comparative fMRI study in humans and monkeys, assessing neural responses to surprise reactions at the reversal of a previously learned association. The implicit nature of this task, assessing how this information is represented without requiring explicit decision-making, is an elegant design. The paper reports that both humans and monkeys show neural responses across a range of areas when presented with incongruous stimulus pairs. Monkeys also show a surprise response when the stimuli are presented in a reversed direction. However, humans show no such surprise response based on this reversal, suggesting that they encode the relationship reversibly and bidirectionally, unlike the monkeys. This has been suggested as a hallmark of symbolic representation, that might be absent in nonhuman animals. 

      I find this experiment and the results quite compelling, and the data do support the hypothesis that humans are somewhat unique in their tendency to form reversible, symbolic associations. I think that an important strength of the results is that the critical finding is the presence of an interaction between congruity and canonicity in macaques, which does not appear in humans. These results go a long way to allay concerns I have about the comparison of many human participants to a very small number of macaques. 

      We thank the reviewer for the positive assessment. We also very much appreciate the point about the interaction effect in macaque monkeys – indeed, we do not report just a negative finding. 

      I understand the impossibility of testing 30+ macaques in an fMRI experiment. However, I think it is important to note that differences necessarily arise in the analysis of such datasets. The authors report that they use '...identical training, stimuli, and whole-brain fMRI measures'. However, the monkeys (in experiment 1) actually required 10 times more training. 

      We agree that this description was imprecise. We have changed it to “identical training stimuli” (line 151), indeed the movies used for training were strictly identical. Furthermore, please note that we do report the fMRI results after the same training duration. In experiment 1, after 3 days of training, the monkeys did not show any significant results, even in the canonical direction. However, in experiment 2, with increased attention and motivation, a significant effect was observed on the first day of scanning after training, as was found in human subjects (see Figure 4 and Table 3).

      More importantly, while the fMRI measures are the same, group analysis over 30+ individuals is inherently different from comparing only 2 macaques (including smoothing and averaging away individual differences that might be more present in the monkeys, due to the much smaller sample size). 

      Thank you for understanding that a limited sample size is intrinsic to macaque monkey physiology. We also agree that data analysis in humans and monkeys is necessarily different. As suggested by the reviewer, we added an analysis to address this, see the corresponding reply to the ‘Recommendations for the authors’ section below.

      Despite this, the results do appear to show that macaques show the predicted interaction effect (even despite the sample size), while humans do not. I think this is quite convincing, although had the results turned out differently (for example an effect in humans that was absent in macaques), I think this difference in sample size would be considerably more concerning. 

      Thank you for noting this. Indeed, the interaction effect is crucial, and the task design was explicitly made to test this precise prediction, described in our manuscript as the “reversibility hypothesis”. The congruity effect in the learned direction served as a control for learning, while the corresponding congruity effect in the reversed direction tested for spontaneous reversal. The reversibility hypothesis stipulates that in humans there should not be a difference between the learned and the reversed direction, while there should be for monkeys. We already wrote about that in the result section of the original manuscript and now also describe this more explicitly in the introduction and beginning of the result section.

      I would also note that while I agree with the authors' conclusions, it is notable to me that the congruity effect observed in humans (red vs blue lines in Fig. 2B) appears to be far more pronounced than any effect observed in the macaques (Fig. 3C-3). Again, this does not challenge the core finding of this paper but does suggest methodological or possibly motivational/attentional differences between the humans and the monkeys (or, for example, that the monkeys had learned the associations less strongly and clearly than the humans). 

      As also explained in response to the eLife assessment above, we expanded the “limitations” section of the discussion, with a deeper description of the possible methodological differences between the two species (see lines 478-485).

      With the same worry in mind, we did increase the attention and motivation of monkeys in experiment 2, and indeed obtained a greater activation to the canonical pairs and their violation, notably in the prefrontal cortex, but crucially still without reversibility.

      In the end, we believe that the striking interspecies difference in size and extent of the violation effect, even for purely canonical stimuli, is an important part of our findings and points to a more efficient species-specific learning system that our experiment tentatively relates to a symbolic competence.

      This is a strong paper with elegant methods and makes a worthwhile contribution to our understanding of the neural systems supporting symbolic representations in humans, as opposed to other animals. 

      We again thank the reviewer for the positive review.

      Reviewer #2 (Public Review): 

      In their article titled "Brain mechanisms of reversible symbolic reference: a potential singularity of the human brain", van Kerkoerle et al address the timely question of whether non-human primates (rhesus macaques) possess the ability for reverse symbolic inference as observed in humans. Through an fMRI experiment in both humans and monkeys, they analyzed the BOLD signal in both species while observing audio-visual and visual-visual stimulus pairs that had been previously learned in a particular direction. Remarkably, the findings pertaining to humans revealed that a broad brain network exhibited increased activity in response to surprises occurring in both the learned and reverse directions. Conversely, in monkeys, the study uncovered that the brain activity within sensory areas only responded to the learned direction but failed to exhibit any discernible response to the reverse direction. These compelling results indicate that the capacity for reversible symbolic inference may be unique to humans.

      In general, the manuscript is skillfully crafted and highly accessible to readers. The experimental design exhibits originality, and the analyses are tailored to effectively address the central question at hand.

      Although the first experiment raised a number of methodological inquiries, the subsequent second experiment thoroughly addresses these concerns and effectively replicates the initial findings, thereby significantly strengthening the overall study. Overall, this article is already of high quality and brings new insight into human cognition. 

      We sincerely thank the reviewer for the positive comments. 

      I identified three weaknesses in the manuscript: 

      - One major issue in the study is the absence of significant results in monkeys. Indeed, authors draw conclusions regarding the lack of significant difference in activity related to surprise in the multidemand network (MDN) in the reverse congruent versus reverse incongruent conditions. Although the results are convincing (especially with the significant interaction between congruency and canonicity), the article could be improved by including additional analyses in a priori ROI for the MDN in monkeys (as well as in humans, for comparison). 

      First, we disagree with the statement about “absence of significant results in monkeys”. We do report a significant interaction which, as noted by the referee, is a crucial positive finding.

      Second, we performed the suggested analysis for experiment 2, using the bilateral ROIs of the putative monkey MDN from previous literature (Mitchell et al., 2016), which are based on the human study by Fedorenko et al. (PNAS, 2013).

      Author response table 1.

      Congruity effect for monkeys in Experiment 2 within the ROIs of the MDN (n=3). Significance was assessed with one-sided one-sample t-tests.

      As can be seen, none of the regions within the monkey MDN showed an FDR-corrected significant difference or interaction. Although the absence of a significant canonical congruity effect makes it difficult to draw strong conclusions, this effect did approach significance at an uncorrected level in the lateral frontal posterior region, similar to the large prefrontal effect we report in Figures 4 and 5. Furthermore, for the reversed congruity effect there was never even a trend at the uncorrected level, and the crucial interaction of canonicity and congruity again approached significance in the lateral prefrontal cortex.
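
      To make the distinction between "uncorrected" and "FDR-corrected" significance concrete, here is a minimal sketch of the Benjamini-Hochberg step-up procedure, one common way of controlling the false discovery rate. It is an illustration only: the response does not state which FDR procedure was used, and the p-values below are invented placeholders, not values from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True if it survives BH FDR correction."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Every test ranked at or below max_rank is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant


# Hypothetical uncorrected p-values for seven ROIs (placeholders, NOT study data).
example_p = [0.04, 0.20, 0.51, 0.03, 0.74, 0.33, 0.09]
uncorrected = [p < 0.05 for p in example_p]
corrected = benjamini_hochberg(example_p)
```

      With these placeholder values, two ROIs pass the uncorrected threshold but none survive correction, which is the kind of pattern described as "approaching significance at an uncorrected level".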

      We also performed an ANOVA in the human participants of the VV experiment on the average betas across the 7 different fronto-parietal ROIs used by Mitchell et al. to define their equivalent in the monkey brain (Fig 1a, right in Mitchell et al. 2016), with congruity, canonicity and hemisphere (except for the anterior cingulate, which is a bilateral ROI) as within-subject factors. We confirmed the results presented in the manuscript (Figure 4C), with notably no significant interaction between congruity and canonicity in any of these ROIs (all F-values (except insula) <1). A significant main effect of congruity was observed in the posterior middle frontal gyrus (MFG) and inferior precentral sulcus at the FDR-corrected level. Analyses restricted to the canonical trials found a congruity effect in these two regions plus the anterior insula and anterior cingulate/presupplementary motor area, whereas no ROIs were significant at an FDR-corrected level for reverse trials. There was a trend in the middle MFG and inferior precentral region for reversed trials. Crucially, there was not even a trend for the interaction between congruity and canonicity at the uncorrected level. The difference in effect size between the canonical and reversed directions can therefore be explained by the larger statistical power due to the larger number of congruent trials (70%, versus 10% for the other trial conditions), not by a significant difference between the canonical and the reversed direction.

      Author response table 2.

      Congruity effect for humans in Experiment 2 within the ROIs of the MDN (n=23).

      These results support our contention that the type of learning of the stimulus pairs was very different in the two species. We thank the reviewer for suggesting these relevant additional analyses.

      - While the authors acknowledge in the discussion that the number of monkeys included in the study is considerably lower compared to humans, it would be informative to know the variability of the results among human participants. 

      We agree that this is an interesting question, although it is also very open-ended. For instance, we could report each subject’s individual whole-brain results, but this would take too much space (and the interested reader will be able to do so from the data that we make available as part of this publication). As a step in this direction, we provide below a figure showing the individual congruity effects, separately for each experiment and for each ROI of Table 5, and for each of the 52 participants for whom an fMRI localizer was available:

      Author response image 1.

      Difference in mean betas between congruent and incongruent conditions in a-priori linguistic and mathematical ROIs (see definition and analyses in Table 5) in both experiments (experiment 1 = AV, left panel; experiment 2 = VV, right panel). Dots correspond to participants (red: canonical trials; green: reversed trials). The boxplot notch is located at the median and the lower and upper box hinges at the 25th and 75th centiles. Whiskers extend to 1.5 inter-quartile ranges on either side of the hinges. ROIs are ranked by the median of the Incongruent-Congruent difference across canonical and reversed order, within a given experiment. For purposes of comparison between the two experiments, we have underlined with colors the top-five common ROIs between the two experiments. N.s.: non-significant congruity effect (p>0.05).

      Several regions show a rather consistent difference across subjects (see, for instance, the posterior STS in experiment 1, left panel). Overall, only 3 of the 52 participants did not show any beta greater than 2 for canonical or reversed trials in any ROI. The consistency is quite striking, given the limited number of test trials (in total only 16 incongruent trials per direction per participant), and the fact that these ROIs were selected for their responses to spoken or written sentences, as part of a subsidiary task quite different from the main task.

      - Some details are missing in the methods.  

      Thank you for these comments, we reply to them point-by-point below.

      Reviewer #3 (Public Review): 

      This study investigates the hypothesis that humans (but not non-human primates) spontaneously learn reversible temporal associations (i.e., learning a B-A association after only being exposed to A-B sequences), which the authors consider to be a foundational property of symbolic cognition. To do so, they expose humans and macaques to 2-item sequences (in a visual-auditory experiment, pairs of images and spoken nonwords, and in a visual-visual experiment, pairs of images and abstract geometric shapes) in a fixed temporal order, then measure the brain response during a test phase to congruent vs. incongruent pairs (relative to the trained associations) in canonical vs. reversed order (relative to the presentation order used in training). The advantage of neuroimaging for this question is that it removes the need for a behavioral test, which non-human primates can fail for reasons unrelated to the cognitive construct being investigated. In humans, the researchers find statistically indistinguishable incongruity effects in both directions (supporting a spontaneous reversible association), whereas in monkeys they only find incongruity effects in the canonical direction (supporting an association but a lack of spontaneous reversal). Although the precise pattern of activation varies by experiment type (visual-auditory vs. visual-visual) in both species, the authors point out that some of the regions involved are also those that are most anatomically different between humans and other primates. The authors interpret their finding to support the hypothesis that reversible associations, and by extension symbolic cognition, is uniquely human. 

      This study is a valuable complement to prior behavioral work on this question. However, I have some concerns about methods and framing. 

      We thank the reviewer for the careful summary of the manuscript, and the positive comments.

      Methods - Design issues: 

      The authors originally planned to use the same training/testing protocol for both species but the monkeys did not learn anything, so they dramatically increased the amount of training and evaluation. By my calculation from the methods section, humans were trained on 96 trials and tested on 176, whereas the monkeys got an additional 3,840 training trials and 1,408 testing trials. The authors are explicit that they continued training the monkeys until they got a congruity effect. On the one hand, it is commendable that they are honest about this in their write-up, given that this detail could easily be framed as deliberate after the fact. On the other hand, it is still a form of p-hacking, given that it's critical for their result that the monkeys learn the canonical association (otherwise, the critical comparison to the non-canonical association is meaningless). 

      Thank you for this comment. 

      Indeed, for experiment 1, the amount of training and testing was not equal for the humans and monkeys, as also mentioned by reviewer 2. We now describe in more detail how many training and imaging days we used for each experiment and each species, as well as the number of blocks per day and the number of trials per block (see lines 572-577). We also added the information on the amount of training received to the legends of all the Tables.

      We are sorry for giving the impression that we trained until the monkeys learned this. This was not the case. Based on previous literature, we actually anticipated that the short training would not be sufficient, and therefore planned additional training in advance. Specifically, Meyer & Olson (2011) had observed pair learning in the inferior temporal cortex of macaque monkeys after 816 exposures per pair. This is similar to the additional training we gave, about 80 blocks with 12 trials per pair per block. This is now explained in more detail (lines 577-580).

      Furthermore, we strongly disagree with the pejorative term p-hacking. The aim of the experiment was not to show a congruency effect in the canonical direction in monkeys, but to track and compare their behavior in the same paradigm as that of humans for the reverse direction. It would have been unwise to stop after human-identical training and only show that humans learn better, which is a given. Instead, we looked at brain activations at both times, at the end of human-identical training and when the monkeys had learned the pairs in the canonical direction. 

      Finally, in experiment 2, monkeys were tested after the same 3 days of training as humans. We wrote: “Using this design, we obtained significant canonical congruity effects in monkeys on the first imaging day after the initial training (24 trials per pair), indicating that the animals had learned the associations” (lines 252-253).

      (2) Between-species comparisons are challenging. In addition to having differences in their DNA, human participants have spent many years living in a very different culture than that of NHPs, including years of formal education. As a result, attributing the observed differences to biology is challenging. One approach that has been adopted in some past studies is to examine either young children or adults from cultures that don't have formal educational structures. This is not the approach the authors take. This major confound needs to minimally be explicitly acknowledged up front. 

      Thank you for raising this important point. We already had a section on “limitations” in the manuscript, which we have now extended (lines 478-485). Indeed, this study follows a previous EEG study in 5-month-old infants, in which we already showed that after learning associations between labels and categories, infants spontaneously generalize learning to the reversed pairs after a short learning period in the lab (Kabdebon et al, PNAS, 2019). We also cited preliminary results of the same paradigm as used in the current study but using EEG in 4-month-old infants (Ekramnia and Dehaene-Lambertz, 2019), where we replicated the results obtained by Kabdebon et al. 2019 showing that preverbal infants spontaneously generalize learning to the reversed pairs.

      Functional MRI in awake infants remains a challenge at this age (but see our own work, Dehaene-Lambertz et al, Science, 2002), especially because the experimental design means only a few trials in the conditions of interest (10%) and thus a long experimental duration that exceeds infants’ capacity to remain quiet and attentive in the noisy MRI environment. (We discuss this on lines 493-496.)

      (3) Humans have big advantages in processing and discriminating spoken stimuli and associating them with visual stimuli (after all, this is what words are in spoken human languages). Experiment 2 ameliorates these concerns to some degree, but still, it is difficult to attribute the failure of NHPs to show reversible associations in Experiment 1 to cognitive differences rather than the relative importance of sound string to meaning associations in the human vs. NHP experiences. 

      As the reviewer wrote, we deliberately performed Experiment 2 with visual shapes to control for various factors that might have explained the monkeys' failure in Experiment 1. 

      (4) More minor: The localizer task (math sentences vs. other sentences) makes sense for math but seems to make less sense for language: why would a language region respond more to sentences that don't describe math vs. ones that do? 

      The referee is correct: our use of the word “reciprocally” was improper (although see Amalric and Dehaene, 2016 for significant differences in both directions when non-mathematical sentences concern specific knowledge). We changed the formulation to clarify this as follows: “In these ROIs, we recovered the subject-specific coordinates of each participant’s 10% best voxels in the following comparisons: sentences vs rest for the 6 language ROIs; reading vs listening for the VWFA; and numerical vs non-numerical sentences for the 8 mathematical ROIs.” (lines 678-680).

      Methods - Analysis issues: 

      (5) The analyses appear to "double dip" by using the same data to define the clusters and to statistically test the average cluster activation (Kriegeskorte et al., 2009). The resulting effect sizes are therefore likely inflated, and the p-values are anticonservative. 

      It is not clear to us which result the reviewer is referring to. In Tables 1-4, we report the values that we found significant in the whole brain analysis, we do not report additional statistical tests for this data. For Table 5, the subject-specific voxels were identified through a separate localizer experiment, which was designed to pinpoint the precise activation areas for each subject in the domains of oral and written language-processing and math. Subsequently, we compared the activation at these voxel locations across different conditions of the main experiment. Thus, the two datasets were distinct, and there was no double dipping. In both interpretations of the comment, we therefore disagree with the reviewer.

      Framing: 

      (6) The framing ("Brain mechanisms of reversible symbolic reference: A potential singularity of the human brain") is bigger than the finding (monkeys don't spontaneously reverse a temporal association but humans do). The title and discussion are full of buzzy terms ("brain mechanisms", "symbolic", and "singularity") that are only connected to the experiments by a debatable chain of assumptions. 

      First, this study shows relatively little about brain "mechanisms" of reversible symbolic associations, which implies insights into how these associations are learned, recognized, and represented. But we're only given standard fMRI analyses that are quite inconsistent across similar experimental paradigms, with purely suggestive connections between these spatial patterns and prior work on comparative brain anatomy. 

      We agree with the referee that the term “mechanism” is ambiguous and, for systems neuroscientists, may suggest more than we are able to do here with functional MRI. We changed the title to “Brain areas for reversible symbolic reference, a potential singularity of the human brain”. This title better describes our specific contribution: mapping out the areas involved in reversibility in humans, and showing that they do not seem to respond similarly in macaque monkeys.

      Second, it's not clear what the relationship is between symbolic cognition and a propensity to spontaneously reverse a temporal association. Certainly, if there are inter-species differences in learning preferences this is important to know about, but why is this construed as a difference in the presence or absence of symbols? Because the associations aren't used in any downstream computation, there is not even any way for participants to know which is the sign and which is the signified: these are merely labels imposed by the researchers on a sequential task. 

      As explained in the introduction, the reversibility test addressed a very minimal core property of symbolic reference. There cannot be a symbol if its attachment doesn’t operate in both directions. Thus, this property is necessary, but we agree that it is not sufficient. Indeed, more tests are needed to establish whether and how the learned symbols are used in further downstream compositional tasks (as discussed in our recent TICS paper, Dehaene et al. 2022). We added a sentence in the introduction to acknowledge this fact:

      “Such reversibility is a core and necessary property of symbols, although we readily acknowledge that it is not sufficient, since genuine symbols present additional referential and compositional properties that will not be tested in the present work.” (lines 89-92).

      Third, the word "singularity" is both problematically ambiguous and not well supported by the results. "Singularity" is a highly loaded word that the authors are simply using to mean "that which is uniquely human". Rather than picking a term with diverse technical meanings across fields and then trying to restrict the definition, it would be better to use a different term. Furthermore, even under the stated definition, this study performed a single pairwise comparison between humans and one other species (macaques), so it is a stretch to then conclude (or insinuate) that the "singularity" has been found (see also pt. 2 above). 

      We have published an extensive review including a description of our use of the term “singularity” (Dehaene et al., TICS 2022). Here is a short excerpt: “Humans are different even in domains such as drawing and geometry that do not involve communicative language. We refer to this observation using the term “human cognitive singularity”, the word singularity being used here in its standard meaning (the condition of being singular) as well as its mathematical sense (a point of sudden change). Hominization was certainly a singularity in biological evolution, so much so that it opened up a new geological age (the Anthropocene). Even if evolution works by small continuous change (and sometimes it doesn’t [4]), it led to a drastic cognitive change in humans.”

      We find the referee’s use of the pejorative term “insinuate” quite inappropriate. From the title on, we are quite nuanced and refer only to a “potential singularity”. Furthermore, as noted above, we explicitly mention in the discussion the limitations of our study, and in particular the fact that only a single non-human species was tested (see lines 486-493). We are working hard to get chimpanzee data, but this is remarkably difficult for us, and we hope that our paper will incite other groups to collect more evidence on this point.

      (7) Related to pt. 6, there is circularity in the framing whereby the authors say they are setting out to find out what is uniquely human, hypothesizing that the uniquely human thing is symbols, and then selecting a defining trait of symbols (spontaneous reversible association) *because* it seems to be uniquely human (see e.g., "Several studies previously found behavioral evidence for a uniquely human ability to spontaneously reverse a learned association (Imai et al., 2021; Kojima, 1984; Lipkens et al., 1988; Medam et al., 2016; Sidman et al., 1982), and such reversibility was therefore proposed as a defining feature of symbol representation reference (Deacon, 1998; Kabdebon and Dehaene-Lambertz, 2019; Nieder, 2009).", line 335). They can't have it both ways. Either "symbol" is an independently motivated construct whose presence can be independently tested in humans and other species, or it is by fiat synonymous with the "singularity". This circularity can be broken by a more modest framing that focuses on the core research question (e.g., "What is uniquely human? One possibility is spontaneous reversal of temporal associations.") and then connects (speculatively) to the bigger conceptual landscape in the discussion ("Spontaneous reversal of temporal associations may be a core ability underlying the acquisition of mental symbols").

      We fail to understand the putative circularity that the referee sees in our introduction. We urge him/her to re-read it, and hope that, with the changes that we introduced, it does boil down to his/her summary, i.e. “What is uniquely human? One possibility is spontaneous reversal of temporal associations."

      Reviewer #1 (Recommendations For The Authors): 

      In general, the manuscript was very clear, easy to read, and compelling. I would recommend the authors carefully check the text for consistency and minor typos. For example: 

      The sample size for the monkeys kept changing throughout the paper. E.g., Experiment 1: n = 2 (line 149); n = 3 (line 205).  

      Thank you for catching this error; we have corrected it. The number of animals was indeed 2 for experiment 1, and 3 for experiment 2. (Animals JD and YS participated in experiment 1, and JD, JC and DN in experiment 2, so only JD participated in both experiments.)

      Similarly, the number of stimulus pairs is reported inconsistently (4 on line 149, 5 pairs later in the paper). 

      We’re sorry that this was unclear. We used 5 sets of 4 audio-visual pairs each. We now clarify this, on line 157 and on lines 514-516.

      At least one case of p>0.0001, rather than p < 0.0001 (I assume). 

      Thank you once again, we now corrected this.

      Reviewer #2 (Recommendations For The Authors): 

      One major issue in the study is the absence of significant results in monkeys. Indeed, the authors draw conclusions regarding the lack of significant difference in activity related to surprise in the multidemand network (MDN) in the reverse congruent versus reverse incongruent conditions. Although the results are convincing (especially with the significant interaction between congruency and canonicity), the article could be improved by including additional analyses in a priori ROI for the MDN in monkeys (as well as in humans, for comparison). In other words: what are the statistics for the MDN regarding congruity, canonicity, and interaction in both species? Since the authors have already performed this type of analysis for language and Math ROIs (table 5), it should be relatively easy for them to extend it to the MDN. Demonstrating that results in monkeys are far from significant could further convince the reader. 

      Furthermore, while the authors acknowledge in the discussion that the number of monkeys included in the study is considerably lower compared to humans, it would be informative to know the variability of the results among human participants. Specifically, it would be valuable to describe the proportion of human participants in which the effects of congruency, canonicity, and their interaction are significant. Additionally, stating the variability of the F-values for each effect would provide reassurance to the reader regarding the distinctiveness of humans in comparison to monkeys. Low variability in the results would serve to mitigate concerns that the observed disparity is merely a consequence of testing a unique subset of monkeys, which may differ from the general population. Indeed, this would be a greater support to the notion that the dissimilarity stems from a genuine distinction between the two species. 

      We responded to both of these points above.

      In terms of methods, details are missing: 

      - How many trials of each condition are there exactly? (10% of 44 trials is 4.4) : 

      We wrote: “In both humans and monkeys, each block started with 4 trials in the learned direction (congruent canonical trials), one trial for each of the 4 pairs (2 O-L and 2 L-O pairs). The rest of the block consisted of 40 trials in which 70% of trials were identical to the training; 10% were incongruent pairs but the direction (O-L or L-O) was correct (incongruent canonical trials), thus testing whether the association was learned; 10% were congruent pairs but the direction within the pairs was reversed relative to the learned pairs (congruent reversed trials) and 10% were incongruent pairs in reverse (incongruent reversed trials).”(See lines 596-600.)

      Thus, each block comprised 4 initial trials, 28 canonical congruent trials, 4 canonical incongruent, 4 reverse congruent and 4 reverse incongruent trials, i.e. 4+28+3x4=44 trials.
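
      As a quick check of this arithmetic, the block composition can be recomputed from the proportions quoted above. This is a minimal sketch; the variable names are ours and purely illustrative.

```python
# Block structure as described in the quoted methods text.
INITIAL_TRIALS = 4      # congruent canonical trials that open each block
REMAINING_TRIALS = 40   # trials split 70/10/10/10 across the four conditions

# Share of the remaining 40 trials per condition, in percent.
percent_by_condition = {
    "congruent canonical": 70,
    "incongruent canonical": 10,
    "congruent reversed": 10,
    "incongruent reversed": 10,
}

# Integer arithmetic avoids floating-point rounding issues.
counts = {
    name: REMAINING_TRIALS * pct // 100
    for name, pct in percent_by_condition.items()
}

total_trials = INITIAL_TRIALS + sum(counts.values())
```

      Here `counts` gives 28, 4, 4 and 4 trials, and `total_trials` is 44, which matches the block length of 44 trials mentioned in the reviewer's question.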

      - How long is one trial? 

      As written in the method section: “In each trial, the first stimulus (label or object) was presented during 700ms, followed by an inter-stimulus-interval of 100ms then the second stimulus during 700ms. The pairs were separated by a variable inter-trial-interval of 3-5 seconds”, i.e. 700+100+700=1500 ms, plus 3 to 4.75 seconds of blank between the trials (see lines 531-533).

      - How are the stimulus presentations jittered? 

      See: “The pairs were separated by a variable inter-trial-interval randomly chosen among eight different durations between 3 and 4.75 seconds (step=250 ms). The series of 8 intervals was randomized again each time it was completed.” (lines 533-535).
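
      The jitter scheme quoted above can be sketched as follows. This is our own illustrative reconstruction from the description, not the actual presentation script; the function name, the seed and the use of Python's `random` module are assumptions.

```python
import random

# Eight inter-trial intervals from 3.00 s to 4.75 s in 250 ms steps.
ITI_VALUES_S = [3.0 + 0.25 * k for k in range(8)]

# Stimulus timing within a pair, in milliseconds (from the quoted methods).
STIMULUS_MS = 700
INTER_STIMULUS_MS = 100
pair_duration_ms = 2 * STIMULUS_MS + INTER_STIMULUS_MS


def iti_sequence(n_trials, rng):
    """Draw n_trials intervals, reshuffling the series of 8 each time it is used up."""
    itis = []
    while len(itis) < n_trials:
        series = ITI_VALUES_S[:]   # fresh copy of the eight durations
        rng.shuffle(series)        # new random order for this pass
        itis.extend(series)
    return itis[:n_trials]


# Example: one block of 44 trials, with a fixed seed for reproducibility.
itis = iti_sequence(44, random.Random(0))
```

      Because each pass uses all eight durations exactly once, the intervals stay balanced over every run of eight trials while their order remains unpredictable.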

      - What is the statistical power achieved for humans? And for monkeys? 

      We know of no standard way to define power for fMRI experiments. Power will depend on so many parameters, including the fMRI signal-to-noise ratio, the attention of the subject, the areas being considered, the type of analysis (whole-brain versus ROIs), etc.

      - Videos are mentioned in the methods, is it the image and sound? It is not clear. 

      We’re sorry that it was unclear. Videos were only used for the training of the human subjects. We have now corrected this in the method section (lines 552-554).

      Reviewer #3 (Recommendations For The Authors): 

      The main recommendations are to adjust the framing (making it less bold and more connected to the empirical evidence) and to ensure independence in the statistical analyses of the fMRI data. 

      See our replies to the reviewer’s comments on “Framing” above. In particular, we changed the title of the paper from “Brain mechanisms of reversible symbolic reference” to “Brain areas for reversible symbolic reference”.

      References cited in this response

      Dehaene, S., Al Roumi, F., Lakretz, Y., Planton, S., & Sablé-Meyer, M. (2022). Symbols and mental programs: A hypothesis about human singularity. Trends in Cognitive Sciences, 26(9), 751-766. https://doi.org/10.1016/j.tics.2022.06.010

      Dehaene-Lambertz, G., Dehaene, S., & Hertz-Pannier, L. (2002). Functional Neuroimaging of Speech Perception in Infants. Science, 298(5600), 2013-2015. https://doi.org/10.1126/science.1077066

      Ekramnia M, Dehaene-Lambertz G. 2019. Investigating bidirectionality of associations in young infants as an approach to the symbolic system. Presented at the CogSci. p. 3449.

      Fedorenko E, Duncan J, Kanwisher N (2013) Broad domain generality in focal regions of frontal and parietal cortex. Proc Natl Acad Sci U S A 110:16616-16621.

      Kabdebon, Claire, and Ghislaine Dehaene-Lambertz. "Symbolic Labeling in 5-Month-Old Human Infants". Proceedings of the National Academy of Sciences 116, no. 12 (2019): 5805-10. https://doi.org/10.1073/pnas.1809144116.

      Mitchell, D. J., Bell, A. H., Buckley, M. J., Mitchell, A. S., Sallet, J., & Duncan, J. (2016). A Putative Multiple-Demand System in the Macaque Brain. Journal of Neuroscience, 36(33), 8574‑8585. https://doi.org/10.1523/JNEUROSCI.0810-16.2016

    2. eLife assessment

      fMRI was used to address an important aspect of human cognition - the capacity for structured representations and symbolic processing - in a cross-species comparison with macaques; the experimental design probed implicit symbolic processing through reversal of learned stimulus pairs. The authors present solid evidence in humans that helps elucidate the role of brain networks in symbolic processing, however the evidence from macaques was necessarily incomplete (e.g., hard-to-quantify differences in learning trajectories and lived experience between species).

    3. Reviewer #1 (Public Review):

      Kerkoerle and colleagues present a very interesting comparative fMRI study in humans and monkeys, assessing neural responses to surprise reactions at the reversal of a previously learned association. The implicit nature of this task, assessing how this information is represented without requiring explicit decision making, is an elegant design. The paper reports that both humans and monkeys show neural responses across a range of areas when presented with incongruous stimulus pairs. Monkeys also show a surprise response when the stimuli are presented in the reversed direction. However, humans show no such surprise response based on this reversal, suggesting that they encode the relationship reversibly and bidirectionally, unlike the monkeys. This has been suggested as a hallmark of symbolic representation, that might be absent in nonhuman animals.

      I find this experiment and the results quite compelling, and the data do support the hypothesis that humans are somewhat unique in their tendency to form reversible, symbolic associations. I think that an important strength of the results is that the critical finding is the presence of an interaction between congruity and canonicity in macaques, which does not appear in humans. These results go a long way to allay concerns I have about the comparison of many human participants to a very small number of macaques.

      The results do appear to show that macaques show the predicted interaction effect (even despite the sample size), while humans do not. I think this is quite convincing. (Although had the results turned out differently (for example an effect in humans that was absent in macaques), I think this difference in sample size would be considerably more concerning.)

      I would also note that while I agree with the authors' conclusions, it is also notable to me that the congruity effect observed in humans (red vs blue lines in Fig. 2B) appears to be far more pronounced than any effect observed in the macaques (Fig. 3C-3). Again, this does not challenge the core finding of this paper but does suggest methodological or possibly motivational/attentional differences between the humans and the monkeys (or, for example, that the monkeys had learned the associations less strongly and clearly than the humans). The authors now discuss this more fully.

      This is a strong paper with elegant methods and makes a worthwhile contribution to our understanding of the neural systems supporting symbolic representations in humans, as opposed to other animals.

    4. Reviewer #2 (Public Review):

      In their article, van Kerkoerle et al. address the timely question of whether non-human primates (rhesus macaques) possess the ability for reverse symbolic inference as observed in humans. Through an fMRI experiment in both humans and monkeys, they analyzed the BOLD signal in both species while the subjects observed audio-visual and visual-visual stimulus pairs that had been previously learned in a particular direction. Remarkably, the findings pertaining to humans revealed that a broad brain network exhibited increased activity in response to surprises occurring in both the learned and reverse directions. Conversely, in monkeys, the study uncovered that brain activity within sensory areas responded only to the learned direction and failed to exhibit any discernible response to the reverse direction. These compelling results indicate that the capacity for reversible symbolic inference may be specific to humans, even though it remains to be tested in other species.

      In general, the manuscript is skillfully crafted and highly accessible to readers. The experimental design exhibits originality, and the analyses are tailored to effectively address the central question at hand. Although the first experiment raised a number of methodological inquiries, the subsequent second experiment thoroughly addresses these concerns and effectively replicates the initial findings, thereby significantly strengthening the overall study. Overall, this article is of high quality and brings new insight into human cognition.

      The main limitation of the studies is the sample size of the non-human primate group (n=2 and n=3). Nevertheless, this limitation is carefully addressed and discussed in the manuscript.

    5. Reviewer #3 (Public Review):

      Original review

      This study investigates the hypothesis that humans (but not non-human primates) spontaneously learn reversible temporal associations (i.e., learning a B-A association after only being exposed to A-B sequences), which the authors consider to be a foundational property of symbolic cognition. To do so, they expose humans and macaques to 2-item sequences (in a visual-auditory experiment, pairs of images and spoken nonwords, and in a visual-visual experiment, pairs of images and abstract geometric shapes) in a fixed temporal order, then measure the brain response during a test phase to congruent vs. incongruent pairs (relative to the trained associations) in canonical vs. reversed order (relative to the presentation order used in training). The advantage of neuroimaging for this question is that it removes the need for a behavioral test, which non-human primates can fail for reasons unrelated to the cognitive construct being investigated. In humans, the researchers find statistically indistinguishable incongruity effects in both directions (supporting a spontaneous reversible association), whereas in monkeys they only find incongruity effects in the canonical direction (supporting an association but a lack of spontaneous reversal). Although the precise pattern of activation varies by experiment type (visual-auditory vs. visual-visual) in both species, the authors point out that some of the regions involved are also those that are most anatomically different between humans and other primates. The authors interpret their findings to support the hypothesis that reversible associations, and by extension symbolic cognition, is uniquely human.
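
      The 2×2 logic of the design described above (congruity × order) can be made concrete with a small sketch. This is only an illustration under stated assumptions: the numbers are arbitrary toy values, not data from the study, and the names `congruity_effect` and `cells` are hypothetical. The interaction is the difference between the incongruity effect in the canonical order and the incongruity effect in the reversed order.

```python
from statistics import mean


def congruity_effect(incongruent, congruent):
    """Mean response to incongruent pairs minus mean response to congruent pairs."""
    return mean(incongruent) - mean(congruent)


# Arbitrary toy responses per design cell (NOT data from the study).
cells = {
    ("canonical", "incongruent"): [2.0, 4.0],
    ("canonical", "congruent"): [1.0, 1.0],
    ("reversed", "incongruent"): [1.0, 3.0],
    ("reversed", "congruent"): [2.0, 2.0],
}

canonical_effect = congruity_effect(
    cells[("canonical", "incongruent")], cells[("canonical", "congruent")]
)
reversed_effect = congruity_effect(
    cells[("reversed", "incongruent")], cells[("reversed", "congruent")]
)

# Interaction contrast: a value near zero means the incongruity effect is
# similar in both orders; a large value means it depends on the order.
interaction = canonical_effect - reversed_effect
```

      In this toy case the incongruity effect exists only in the canonical order, so the interaction is non-zero; equal effects in both orders would give an interaction of zero.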

      This study is a valuable complement to prior behavioral work on this question. However, I have some concerns about methods and framing.

      Methods - Design issues:

      (1) The authors originally planned to use the same training/testing protocol for both species but the monkeys did not learn anything, so they dramatically increased the amount of training and evaluation. By my calculation from the methods section, humans were trained on 96 trials and tested on 176, whereas the monkeys got an additional 3,840 training trials and 1,408 testing trials. The authors are explicit that they continued training the monkeys until they got a congruity effect. On the one hand, it is commendable that they are honest about this in their write-up, given that this detail could easily be framed as deliberate after the fact. On the other hand, it is still a form of p-hacking, given that it's critical for their result that the monkeys learn the canonical association (otherwise, the critical comparison to the non-canonical association is meaningless).

      (2) Between-species comparisons are challenging. In addition to having differences in their DNA, human participants have spent many years living in a very different culture than that of NHPs, including years of formal education. As a result, attributing the observed differences to biology is challenging. One approach that has been adopted in some past studies is to examine either young children or adults from cultures that don't have formal educational structures. This is not the approach the authors take. This major confound needs to minimally be explicitly acknowledged up front.

      (3) Humans have big advantages in processing and discriminating spoken stimuli and associating them to visual stimuli (after all, this is what words are in spoken human languages). Experiment 2 ameliorates these concerns to some degree, but still it is difficult to attribute the failure of NHPs to show reversible associations in Experiment 1 to cognitive differences rather than the relative importance of sound string to meaning associations in the human vs. NHP experiences.

      (4) More minor: The localizer task (math sentences vs. other sentences) makes sense for math but seems to make less sense for language: why would a language region respond more to sentences that don't describe math vs. ones that do?

      Methods - Analysis issues:

      (5) The analyses appear to "double dip" by using the same data to define the clusters and to statistically test the average cluster activation (Kriegeskorte et al., 2009). The resulting effect sizes are therefore likely inflated, and the p-values are anticonservative.

      Framing:

      (6) The framing ("Brain mechanisms of reversible symbolic reference: A potential singularity of the human brain") is bigger than the finding (monkeys don't spontaneously reverse a temporal association but humans do). The title and discussion are full of buzzy terms ("brain mechanisms", "symbolic", and "singularity") that are only connected to the experiments by a debatable chain of assumptions.

      First, this study shows relatively little about brain "mechanisms" of reversible symbolic associations, which implies insights about how these associations are learned, recognized, and represented. But we're only given standard fMRI analyses that are quite inconsistent across similar experimental paradigms, with purely suggestive connections between these spatial patterns and prior work on comparative brain anatomy.

      Second, it's not clear what the relationship is between symbolic cognition and a propensity to spontaneously reverse a temporal association. Certainly if there are inter-species differences in learning preferences this is important to know about, but why is this construed as a difference in the presence or absence of symbols? Because the associations aren't used in any downstream computation, there is not even any way for participants to know which is the sign and which is the signified: these are merely labels imposed by the researchers on a sequential task.

      Third, the word "singularity" is both problematically ambiguous and not well supported by the results. "Singularity" is a highly loaded word that the authors are simply using to mean "that which is uniquely human". Rather than picking a term with diverse technical meanings across fields and then trying to restrict the definition, it would be better to use a different term. Furthermore, even under the stated definition, this study performed a single pairwise comparison between humans and one other species (macaques), so it is a stretch to then conclude (or insinuate) that the "singularity" has been found (see also pt. 2 above).

      (7) Related to pt. 6, there is circularity in the framing whereby the authors say they are setting out to find out what is uniquely human, hypothesizing that the uniquely human thing is symbols, and then selecting a defining trait of symbols (spontaneous reversible association) *because* it seems to be uniquely human (see e.g., "Several studies previously found behavioral evidence for a uniquely human ability to spontaneously reverse a learned association (Imai et al., 2021; Kojima, 1984; Lipkens et al., 1988; Medam et al., 2016; Sidman et al., 1982), and such reversibility was therefore proposed as a defining feature of symbol representation reference (Deacon, 1998; Kabdebon and Dehaene-Lambertz, 2019; Nieder, 2009).", line 335). They can't have it both ways. Either "symbol" is an independently motivated construct whose presence can be independently tested in humans and other species, or it is by fiat synonymous with the "singularity". This circularity can be broken by a more modest framing that focuses on the core research question (e.g., "What is uniquely human? One possibility is spontaneous reversal of temporal associations.") and then connects (speculatively) to the bigger conceptual landscape in the discussion ("Spontaneous reversal of temporal associations may be a core ability underlying the acquisition of mental symbols").

      Comments on revised version:

      I thank the authors for engaging constructively with my comments. I'm convinced by the responses to my original points 1, 2, 3, and 4. I'm also partially convinced by the response to point 6 (with qualifications discussed below). I do want to clear the record on points 1 and 6 (about which the authors expressed offense at aspects of my original comments), and to press on points 5 and 7.

      (1) It's very helpful to know that the plan was always to extend training in Expt 1. The rationale is now clear in the methods, although I'd encourage the authors to also emphasize this if space permits in the vicinity of lines 211-216, which still read as if the extended training was a post hoc decision ("the canonical congruity effect... was not significant... after 3 days of exposure... Thus... monkeys were further exposed..."). The authors have objected to my original use of "p hacking", which I agree was too strong (my apologies). My intention was only to point out that *if it were the case that training duration was conditional on the monkeys' success at learning the canonical association* (which the authors have now clarified was not the case), then this would be steering the study post hoc to achieve a desired outcome. I recognize the authors' point that the canonical direction was a sanity check, not the effect of interest (reversed association), but it's still true that they needed to achieve this sanity check in order for the absence of a reversed effect to be meaningful. This was the source of my original concern. This point is only clarificational (no action is recommended).

      (5) The authors have said they don't understand my concern about "double-dipping" in the statistical analyses, so I will attempt to clarify. First, I should stress that this concern applies only to the whole-brain results (Tables 1-4), not the fROI results. As the authors point out, this was indeed unclear, and I apologize. My concern about Tables 1-4 is that they seem to be derived using the classical technique of thresholding contrasts at some significance level to define clusters and then reporting cluster statistics (in this case, t-values) derived from *the same contrast in the same activation maps*. If this is not what was done (i.e., if orthogonal data and/or contrasts were used to define clusters and quantify contrasts within clusters, as in the fROI analyses), then this point is moot (and clarification in the paper would be helpful). But if this is what was done, then this procedure is known to be distortionary (e.g., Kriegeskorte et al 2009, "Nonindependent selective analysis is incorrect and should not be acceptable in neuroscientific publications").
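
      The selection bias at issue in point 5 can be illustrated with a small simulation. This is a sketch under simplifying assumptions, not a reconstruction of the authors' pipeline: it works on independent "voxels" rather than spatial clusters, the data are pure noise with a true effect of zero, and the function name `simulate`, the threshold, and the sample sizes are all hypothetical choices.

```python
import random
import statistics


def one_sample_t(sample):
    """One-sample t statistic against zero."""
    n = len(sample)
    return statistics.mean(sample) / (statistics.stdev(sample) / n ** 0.5)


def simulate(n_voxels=2000, n_subjects=20, threshold=2.0, seed=0):
    """Compare circular vs. independent estimates on pure-noise data."""
    rng = random.Random(seed)
    select_t, test_t = [], []
    for _ in range(n_voxels):
        # Two independent noise datasets per voxel; the true effect is zero.
        a = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
        select_t.append(one_sample_t(a))
        test_t.append(one_sample_t(b))

    # Select voxels by thresholding the first dataset.
    chosen = [i for i, t in enumerate(select_t) if t > threshold]
    if not chosen:
        raise ValueError("no voxel passed the threshold")

    # Circular: report statistics from the same data used for selection.
    circular = statistics.mean(select_t[i] for i in chosen)
    # Independent: report statistics for the same voxels from held-out data.
    independent = statistics.mean(test_t[i] for i in chosen)
    return len(chosen), circular, independent


n_selected, circular_t, independent_t = simulate()
```

      By construction the circular estimate exceeds the selection threshold even though no true effect exists, whereas the estimate from held-out data stays close to zero. This is the reason for requesting orthogonal data or contrasts when reporting statistics within selected clusters.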

      (6) The authors have objected to my use of the term "insinuate" as pejorative. I don't share this impression (and insult was certainly not my intent) but I'm happy to concede that a less loaded term (e.g., "suggest") would have been a better choice. I apologize. In any case, I stand by my intended original concern that a key idea in this piece (that reversible symbolic inference is a singularity of the human brain) is being advanced rhetorically rather than empirically, by repeatedly supplying it to readers (albeit with qualifiers like "potential") as an interpretive lens through which to view empirical results that only directly support a more modest claim (that macaques spontaneously reverse sequential associations less readily than humans do). To be clear, it is good that the authors don't make this stronger claim outright, and it is fine to motivate a more modest research question (e.g., do species differ in spontaneous reversal of associations) on the grounds that it is a stepping stone to a bigger one (what is the singularity). But by placing the bigger framing front and center in this way, there's a risk that this paper will be received by the community as establishing a conclusion that it does not actually establish.

      (7) The authors have said they don't understand the circularity I'm alleging. Having read the revision, I believe the issue is still there, so I'll make another attempt. The problem is most clearly apparent in the Discussion text quoted in my original comment (lines 347-350 of the revision, emphasis mine): "Several studies previously found behavioural evidence for a *uniquely human* ability to spontaneously reverse a learned association (Imai et al., 2021; Kojima, 1984; Lipkens et al., 1988; Medam et al., 2016; Sidman et al., 1982), and such reversibility was *therefore* proposed as a defining feature of symbol representation reference (Deacon, 1998; Kabdebon and Dehaene-Lambertz, 2019; Nieder, 2009)." In other words, reversal of associations is selected as a defining feature of symbols and targeted by this study *because* it is thought to be uniquely human. This is fine, but it prohibits you from then advocating the hypothesis that symbolic cognition is the singularity (lines 49-52), because "symbol" is being defined such that this is necessarily the case. To minimally paraphrase what I perceive to be the circular logic in the framing, the argument seems to go: "What is uniquely human? Symbols. What are symbols? That which is uniquely human." In my original comment, I suggested a reframing that would fix this issue, namely: "What is uniquely human? Spontaneous reversal of temporal associations." The authors say they don't see the difference between this framing and their own, so I'll try to clarify: the difference is that it sidesteps the notion of "symbol", and in so doing removes the circular definitions of "symbol" and "singularity" in terms of each other. This suggestion was given not as a prescription but as an example to show that the issue can be remedied by revisions to the framing without doing damage to the empirical claims. If the authors prefer a different remedy that avoids circular definitions of terms, that's fine.

    1. Author response:

      We thank the reviewers for their thorough comments on our manuscript. We appreciate their recognition of the strengths in our study, including addressing the significant problem of neonatal sepsis in preterm infants using a preterm piglet model, the robustness of our multi-omics dataset, and our multi-pronged approach to examining the physiological changes under different glucose management regimens.

      This document addresses our initial responses to the main concerns of the 3 reviewers. We will provide more detailed responses to their comments and revise the manuscript at a later date.

      In response to Reviewer #1, we acknowledge the concern about high blood glucose levels in the control group. This work is a follow-up to our previous work (Muk et al., JCI Insight 2022), where we explored different PN glucose regimens. Taken together, our experiments suggest a linear relationship between glucose provision and infection severity, indicating that increased glucose may heighten mortality risk, while radical reduction could reduce mortality due to sepsis but cause hypoglycemia and brain damage. As for the discrepancy in survival rates between Figures 1B and 6B, this is due to a shortened follow-up time in the follow-up experiment. This was done to minimize animal suffering because relevant differences in immune responses were detectable within 12 hours in the primary experiment. As for the relationship between bacterial burdens and glucose, we agree that lower bacterial density in piglets receiving the reduced glucose PN may result from slower bacterial growth. However, we analyzed the relationship between bacterial burdens and mortality and found that the two did not correlate within each of the treatment groups. This finding inspired us to further explore the relationship between bacterial burdens and infection responses in our model, which has resulted in our recent preprint: Wu et al. Regulation of host metabolism and defense strategies to survive neonatal infection. BioRxiv 2024.02.23.581534; doi: https://doi.org/10.1101/2024.02.23.581534

      For Reviewer #2, the distinction between early onset sepsis (EOS) and late onset sepsis (LOS) based on a time cut-off makes sense clinically because they are likely to be caused by different organisms and origins (EOS with maternal origin and LOS with postnatal origin) and therefore require different empirical antibiotic regimens. However, it is also important to acknowledge that the pathophysiology of “sepsis” may be similar regardless of timing and pathogen and depends on the degree of immune activation. Therefore, even though the infection in our model is initiated on the first day after birth, the organism that we use, Staphylococcus epidermidis (the most common bacterium detected in LOS), makes it a better model for LOS. As for neutrophil-specific transcripts, we only collected the whole blood transcriptome during the experiments, which reflects the transcriptomic profile of all the leucocytes. Since we did not do single-cell RNA sequencing during the experiment, there is no possibility of isolating the neutrophil transcriptome at this time. As for the question of a “safe glucose infusion rate”, there likely is none, as the immune responses to glucose intake do not seem binary but increase with glucose intake. Our reduced glucose PN was chosen as it corresponded with the low end of recommended guidelines for PN glucose intake. However, the reduced glucose intervention still resulted in significant morbidity and a 25% mortality within 22 hours. There is therefore still vast room for improvement, but even though further reduction in PN glucose intake would probably provide further protection, it would entail dangerous hypoglycemia. The findings in this paper have prompted us to explore several alternative strategies to both reduce infection-related mortality and maintain glucose homeostasis. Thus, the optimal PN for infected newborns would probably differ from standard PN in all macronutrient compartments and will require much more preclinical and clinical research.

      For Reviewer #3, we acknowledge the variability in data collected from animals at euthanasia. These endpoints represent snapshots of the animals' states at euthanasia, which is a clear limitation of our method. Therefore, we do not know what metabolic processes precede the development of lethal sepsis, although the increases in plasma lactate suggest a higher rate of glycolysis in animals on high glucose PN. However, we believe the data still heavily imply a causal relationship between energy metabolic processes, especially glycolytic breakdown of glucose, and the pro-inflammatory responses leading to sepsis. In our recent preprint mentioned above we further explored the metabolic responses in pigs that succumbed to sepsis, compared to those that survived and found that survival was strongly associated with increases in mitochondrial metabolism and reduction in glycolysis.

      We hope these clarifications and our commitment to further research address your concerns satisfactorily. Thank you for your valuable feedback.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1 (Public Review):

      “but an obvious influencing factor that the authors could investigate in their own data set is the retinal input. In Fig1b, the authors even show these data in the form of gaze and pupil size. In these example data, by eye, it looks like the pupil size is positively correlated with the run speed. This would of course have large consequences on the activity in V1, but the authors do not do anything with these data. The study would improve substantially if the authors would correlate their run speed traces with other factors that they have recorded too, such as pupil size and gaze.”

      Absolutely. We have added a first level of eye movement (and pupil size) analyses to the revised manuscript, resulting in an additional figure. In short, we found that eye movements are unlikely to play a significant role in our primary results, as the patterns of eye movements differed only slightly between running and stationary periods, and the measured impacts of such eye movements were also quantitatively much smaller than the primary effect sizes.

      We also note that, in analyzing the eye movements, we found that pupil size was larger during running than during stationary periods. This is suggestive evidence that running is correlated with increases in arousal. Although more work will be needed to calibrate and quantify how much this factor affects neural responses (and perhaps to dissociate it from running per se), the simple analysis we present suggests that the large differences we observe cannot be explained by a difference in how arousal and running are correlated in the monkey versus the mouse. Instead, it appears that both species have at least qualitatively similar relations between pupil size (a standard proxy for arousal) and running.

      On this issue, we have added extensive discussion of the relevant recent work by Talluri et al. (2023) who attempted a similar cross-species analysis that considered spontaneous body movements and their effect on cortical activity (as well as the possibility that eye movements are a critical mediator in these modulations). Due to delays in revising our manuscript, we regret that our earlier submission had not cited this work, but we now do our best to highlight its importance and the synergy between these two papers. The full citation is listed below:

      Talluri BC, Kang I, Lazere A, Quinn KR, Kaliss N, Yates JL, Butts DA, Nienborg H. Activity in primate visual cortex is minimally driven by spontaneous movements. Nat Neurosci. 2023 Nov;26(11):1953-1959. doi: 10.1038/s41593-023-01459-5.

      There is a finer level of analysis that we hope to do in the future along these lines. It would rely on detailed characterization of each receptive field, building an image-computable model linking those receptive fields to the neural activity, and doing so at a finer time grain that links individual eye movements and changes in the spike train within a stimulus presentation (as opposed to working at the level of spike counts per stimulus presentation). Because these steps need to be accomplished together— and each requires substantial additional work and would go beyond the first-order findings we report in this work— we hope to report on such finer analyses in a standalone paper later. We are working on being able to do this in both marmoset and mouse.

      More generally, we want to emphatically agree that what is missing from this paper is the “why?”! We have done our best to show that a fair comparison reveals quantitatively different phenomena in marmoset and mouse. In the revised discussion, we lay out many lines of work that we hope will gain traction on this deeper mechanistic point. There’s a lot to do, and several of the possibilities are already current topics of exploration in our ongoing work.

      “Looking at the raster plot, however, shows that this strong positive correlation must be due entirely to the lower half of the neurons significantly increasing their firing rate as the mouse starts to run; in fact, the upper 25% or so of the neurons show exactly the opposite (strong suppression of the neurons as the mouse starts running). It would be more balanced if this heterogeneity in the response is at least mentioned somewhere in the text.”

      We are also intrigued by the heterogeneity of effects at the single neuron level. That is why the next section of the paper is dedicated to analyzing effects on a cell-by-cell basis. The fractions of neurons showing either increases or decreases are described separately, to get at this very issue.

      Reviewer 2 (Public Review):

      “For example, it is known that the locomotion gain modulation varies with layer in the mouse visual cortex, with neurons in the infragranular layers expressing a diversity of modulations (Erisken et al. 2014 Current Biology). However, for the marmoset dataset, it was not reported from which cortical layer the neurons are from, leaving this point unanswered.”

      Reviewer 2 called for more consideration of details that have been addressed in the mouse literature, such as the cortical layer of the cells, and related aspects of circuitry. We have greatly re-worked the Discussion to address several of these issues. In short, the manuscript’s set of data were collected without strong traction on layers or cell types, and it will be quite interesting to get a better handle on this using both refinements to our recording procedures as well as new techniques that are now possible in the marmoset for future studies.

      “In this regard, it is worth noting that the authors report an interesting difference between the foveal and peripheral parts of the visual cortex in marmoset. It will be interesting to investigate these differences in more detail in future studies. Likewise, while running might be an important behavioral state for mice, other behavioral states might be more relevant for marmosets and do modulate the activity of the primate visual cortex more profoundly. Future work could leverage the opportunities that the marmoset model system offers to reveal new insights about behavioral-related modulation in the primate brain.”

      Same page! We have expanded the discussion to better emphasize these points and are already deep in follow-up experiments to explore the foveal and peripheral representations.

      Reviewer 3 (Public Review):

      “However, the authors did not take full advantage of the quantity and diversity of the marmoset visual cortex recordings in their analyses. They mention recording and analyzing the activity of peripheral V1 neurons but mainly present results involving foveal V1 neurons. Foveal neurons, with their small receptive fields strongly affected by precise eye position, would seem to be less likely to be comparable to rodent data. If the authors have a reason for not doing so, they should provide an explanation.”

      We agree, and hope the reviewer finds our overall reply, detailed response to Reviewer 1 (who raised a similar issue), and corresponding updates to the manuscript appropriate for this stage of understanding.

      “Given that the marmosets are motivated to run with liquid rewards, the authors should provide more context as to how this may or may not affect marmoset V1 activity. Additionally, the lack of consideration of eye movements or position presents a major absence for the marmoset results, and fails to take advantage of one of the key differences between primate and rodent visual systems - the marmosets have a fovea, and make eye movements that fixate in various locations on the screen during the task.”

      In addition to the response above, we have made edits to the manuscript to speak to issues of arousal and eye movements (also detailed in previous responses). Given the modest decrease in activity we see, the usual concerns about potential increases in neural activity related to eye movements (which we quantify in the revision) and other issues related to motivation are hard to specifically relate to existing literature. But in the revised Discussion we talk more about how future work can/should dissociate these factors, as has been done in the mouse literature.

      “Finally, the model provides a strong basis for comparison at the level of neuronal populations, but some methodological choices are insufficiently described and may have an impact on interpreting the claims.”

      We have also clarified the shared-gain model’s description, which we agree needed additional detail and clarification.

    2. eLife assessment

      This important work advances our understanding of the differences in locomotion-induced modulation in primate and rodent visual cortexes and underlines the significant contribution cross-species comparisons make to investigating brain function. The evidence in support of these differences across species is convincing. This work will be of broad interest to neuroscientists.

    3. Reviewer #1 (Public Review):

      More than ten years ago, it was shown that activity in the primary visual cortex of mice substantially increases when mice are running compared to when they are sitting still. This finding 'revolutionised' our thinking about visual cortex, turning away from it being a passive image processor and highlighting the influence of non-visual factors. The current study now for the first time repeats this experiment in marmosets. The authors find that in contrast to mice, marmoset V1 activity is slightly suppressed during running, and they relate this to differences in gain modulations of V1 activity between the two species.

      Strengths

      - Replication in primates of the original finding in mice took so long partly because of the inherent difficulties of recording from the brain of a running primate. In fact, one recent, highly related study on macaques looked at spontaneous limb movements as the macaque was sitting. The treadmill for the marmosets in the current study is a very elegant solution to the problem of running in primates. It allows for true replication of the 'running vs stationary' experiment and undoubtedly opens up many possibilities for other experiments recording from a head-fixed but active marmoset.
      - In addition to their own data in marmoset, the authors run their analyses on a publicly available data set in mouse. This allows them to directly compare mouse and marmoset findings, which significantly strengthens their conclusions.
      - Marmoset vision is fundamentally different from mouse vision as they have a fovea and make goal-directed eye movements. In this revised version of their paper, the authors acknowledge this and investigate the possible effect of eye movements and pupil size on the differences they find between running and stationary. They conclude that eye input does not explain all these differences.

      Significance

      The paper contributes interesting new evidence to the ongoing discussion about the influence of non-visual factors in general, and running in particular, on visual cortex activity. As such, it helps to move this discussion beyond the predominantly rodent literature and into primate research. The bigger question of *why* there are differences between rodents and primates remains unanswered, but the authors do their best to provide possible explanations. The elegant experimental set-up of the marmoset on a treadmill will certainly add new findings to this issue in the years to come.

    4. Reviewer #2 (Public Review):

      This work aims at answering whether activity in primate visual cortex is modulated by locomotion, as was reported for mouse visual cortex. The finding that the activity in mouse visual cortex is modulated by running has changed the concept of primary sensory cortical areas. However, it was an open question whether this modulation generalizes to primates.

      To answer this fundamental question the authors established a novel paradigm in which a head-fixed marmoset was able to run on a treadmill while watching a visual stimulus on a display. In addition, eye movements and running speed were monitored continuously and extracellular neuronal activity in primary visual cortex was recorded using high-channel-count electrode arrays. This paradigm made it possible to investigate whether locomotion modulates sensory-evoked activity in marmoset visual cortex. Moreover, to directly compare the responses in marmoset visual cortex to responses in mouse visual cortex the authors made use of a publicly available mouse dataset from the Allen Institute. In this dataset the mouse was also running on a treadmill and observing a set of visual stimuli on a display. The authors took extra care to have the marmoset and mouse paradigms as comparable as possible.

      To characterize the visually driven activity the authors present a series of moving gratings and estimate receptive fields with sparse noise. To estimate the gain modulation by running the authors split the dataset into epochs of running and non-running which allowed them to estimate the visually evoked firing rates in both behavioral states.
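      The epoch-splitting step described above can be sketched as follows. This is only an illustration under stated assumptions: the speed threshold, the function names, and the ratio used to summarize modulation are hypothetical and are not taken from the paper, and the numbers are toy values rather than data.

```python
from statistics import mean

def split_rates_by_state(spike_counts, durations_s, speeds, speed_threshold=1.0):
    """Return (running_rate, stationary_rate) in spikes per second.

    speed_threshold is a hypothetical cutoff separating running from
    stationary trials; the value used in the study is not given here.
    Raises statistics.StatisticsError if either state has no trials.
    """
    running, stationary = [], []
    for count, duration, speed in zip(spike_counts, durations_s, speeds):
        rate = count / duration  # firing rate for this trial
        if speed > speed_threshold:
            running.append(rate)
        else:
            stationary.append(rate)
    return mean(running), mean(stationary)

def modulation_ratio(running_rate, stationary_rate):
    """Ratio > 1 means higher activity while running, < 1 means suppression."""
    return running_rate / stationary_rate

# Toy trials for one unit (illustrative numbers only).
counts = [10, 12, 20, 22]
durations = [1.0, 1.0, 1.0, 1.0]
speeds = [0.0, 0.2, 5.0, 6.0]

run_rate, still_rate = split_rates_by_state(counts, durations, speeds)
ratio = modulation_ratio(run_rate, still_rate)
print(run_rate, still_rate, ratio)
```

      In this toy case the ratio exceeds 1, the direction reported for mouse V1; a ratio slightly below 1 would correspond to the modest suppression reported for marmoset V1.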

      Strengths:

      The novel paradigm of head-fixed marmosets running on a treadmill while being presented with a visual stimulus is unique and ideally tailored to answering the question that the authors aimed to answer. Moreover, the authors took extra care to ensure that the paradigm in marmoset matched as closely as possible to the conditions in the mouse experiments such that the results can be directly compared. To directly compare their data the authors re-analyzed publicly available data from visual cortex of mice recorded at the Allen Institute. Such a direct comparison, and reuse of existing datasets, is another strong aspect of the work. Finally, the presented new marmoset dataset appears to be of high quality, the comparison between mouse and marmoset visual cortex is well done and the results and interpretation straightforward.

      Weaknesses:

      It is known that the locomotion gain modulation varies with layer in mouse visual cortex, with neurons in the infragranular layers expressing a diversity of modulations (Erisken et al. 2014 Current Biology). However, for the marmoset dataset the layer information was unfortunately not recorded, leaving this point open for future studies.

      Nonetheless, the aim of comparing the locomotion induced modulation of activity in primate and mouse primary visual cortex was convincingly achieved by the authors. The results shown in the figures support the conclusion that locomotion modulates the activity in primate and mouse visual cortex differently. While mice show a profound gain increase, neurons in primate visual cortex show little modulation or even a reduction in response strength.

      This work will have a strong impact on the field of visual neuroscience but also on neuroscience in general. It revives the debate of whether results obtained in the mouse model system can be simply generalized to other mammalian model systems, such as non-human primates. Based on the presented results, the comparison between the mouse and primate visual cortex is not as straightforward as previously assumed. This will likely trigger more comparative studies between mice and primates in the future, which is important and absolutely needed to advance our understanding of the mammalian brain.

      Moreover, the reported finding that neurons in primary visual cortex of marmosets do not increase their activity during running is intriguing, as it makes you wonder why neurons in the mouse visual cortex do so. The authors discuss a few ideas in the paper which can be addressed in future experiments. In this regard it is worth noting that the authors report an interesting difference between the foveal and peripheral part of the visual cortex in marmoset. It will be interesting to investigate these differences in more detail in future studies. Likewise, while running might be an important behavioral state for mice, other behavioral states might be more relevant for marmosets and do modulate the activity of primate visual cortex more profoundly. Future work could leverage the opportunities that the marmoset model system offers to reveal new insights about behavioral related modulation in the primate brain.

    5. Reviewer #3 (Public Review):

      Prior studies have shown that locomotion (e.g., running) modulates mouse V1 activity to a similar extent as visual stimuli. However, it's unclear if these findings hold in species with more specialized and advanced visual systems such as nonhuman primates. In this work, Liska et al. leverage population and single neuron analyses to investigate potential differences and similarities in how running modulates V1 activity in marmosets and mice. Specifically, they discovered that although a shared gain model could describe the trial-to-trial variations of population-level neural activity well for both species, locomotion more strongly modulated V1 population activity in mice. Furthermore, they found that at the level of individual units, marmoset V1 neurons, unlike mouse V1 neurons, experience suppression of their activity during running.

      A major strength of this work is the introduction and completion of primate electrophysiology recordings during locomotion. Data of this kind were previously limited, and this work moves the field forward in terms of data collection in a domain previously inaccessible in primates. Another core strength of this work is that it adds to a limited collection of cross-species data collection and analysis of neural activity at the single-unit and population level, attempting to standardize analysis and data collection to be able to make inferences across species. In particular, the findings on how the primate peripheral and foveal V1 representations functionally relate to and differ from the mice V1 representations speak to the power of these cross-species comparisons.

      However, there are still some lingering potential extensions to this work, largely acknowledged by the authors. One of these extensions involves more detailed eye movement analysis within species, such as microsaccades in marmosets and the potential impact on marmoset V1 activity. In the mouse data, similar eye-related analyses were not possible, in part due to instability in the eye recordings of many mouse sessions that made it challenging to replicate partnered analyses for the marmosets. We agree with the authors' assessment that these analyses can be targeted in future work and still believe that the marmoset eye-movement findings provide novel insights that will inform future cross-species comparisons of the visual system. Furthermore, another important issue not fully explored is the possible effects of the reward scheme during marmoset locomotion on V1 activity. The authors note that, unlike their mice counterparts, the marmosets were encouraged to run via liquid rewards, given after subjects traversed a specific distance. While the authors discuss the changes in arousal present when marmosets were running, there are still some unanswered questions on how their reward scheme may affect biomarkers (e.g., pupil sizes) and marmoset V1 activity.

      Overall, the methods and data support the work's main claims. Single neuron and population level approaches demonstrate that V1 activity in mice and marmosets is categorically different. Since primate V1 is so diverse and differs from mouse V1, this presents important limitations on direct inferences from mouse V1 to primate V1. This work is a great step forward in the field, especially with the novel methodology of collecting neural activity from running primates.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The current study provided a follow-up analysis using published datasets focused on the individual variability of both the distraction effect (size and direction) and the attribute integration style, as well as the association between the two. The authors tried to answer the question of whether the multiplicative attribute integration style concurs with a more pronounced and positively oriented distraction effect.

      Strengths:

      The analysis extensively examined the impacts of various factors on decision accuracy, with a particular focus on using two-option trials as control trials, following the approach established by Cao & Tsetsos (2022). The statistical significance results were clearly reported.

      The authors meticulously conducted supplementary examinations, incorporating the additional term HV+LV into GLM3. Furthermore, they replaced the utility function from the expected value model with values from the composite model.

      We thank the reviewer for the positive response and are pleased that the reviewer found our report interesting.

      Reviewer #1 Comment 1

      Weaknesses:

      There are several weaknesses in terms of theoretical arguments and statistical analyses.

      First, the manuscript suggests in the abstract and at the beginning of the introduction that the study reconciled the "different claims" about "whether distraction effect operates at the level of options' component attributes rather than at the level of their overall value" (see line 13-14), but the analysis conducted was not for that purpose. Integrating choice attributes in either an additive or multiplicative way only reflects individual differences in combining attributes into the overall value. The authors seemed to assume that the multiplicative way generated the overall value ("Individuals who tended to use a multiplicative approach, and hence focused on overall value", line 20-21), but such implicit assumption is at odds with the statement in line 77-79 that people may use a simpler additive rule to combine attributes, which means overall value can come from the additive rule.

      We thank the reviewer for the comment. We have made adjustments to the manuscript to ensure that the message delivered within this manuscript is consistent. Within this manuscript, our primary focus is on the different methods of value integration in which the overall value is computed (i.e., additive, multiplicative, or both), rather than the interaction at the individual level of attributes. However, we do not exclude the possibility that the distractor effect may occur at multiple levels. Nevertheless, in light of the reviewer’s comment, we agree that we should focus the argument on whether distractors facilitate or impair decision making and downplay the separate argument about the level at which distractor effects operate. We have now revised the abstract:

      “It is widely agreed that people make irrational decisions in the presence of irrelevant distractor options. However, there is little consensus on whether decision making is facilitated or impaired by the presence of a highly rewarding distractor or whether the distraction effect operates at the level of options’ component attributes rather than at the level of their overall value. To reconcile different claims, we argue that it is important to incorporate consideration of the diversity of people’s ways of decision making. We focus on a recent debate over whether people combine choice attributes in an additive or multiplicative way. Employing a multi-laboratory dataset investigating the same decision making paradigm, we demonstrated that people used a mix of both approaches and the extent to which approach was used varied across individuals. Critically, we identified that this variability was correlated with the effect of the distractor on decision making. Individuals who tended to use a multiplicative approach to compute value, showed a positive distractor effect. In contrast, in individuals who tended to use an additive approach, a negative distractor effect (divisive normalisation) was prominent. These findings suggest that the distractor effect is related to how value is constructed, which in turn may be influenced by task and subject specificities. Our work concurs with recent behavioural and neuroscience findings that multiple distractor effects co-exist.” (Lines 12-26)

      Furthermore, we acknowledge that the current description of the additive rule could be interpreted in several ways. The current additive utility model is described as:

      $$U = \gamma X + (1 - \gamma) P$$

      where $U$ is the options’ utility, $X$ is the reward magnitude, $P$ is the probability, and $\gamma$ is the magnitude/probability weighing ratio. If we perform comparison between values according to this model (i.e., HV against LV), we would arrive at the following comparison:

      $$U_{HV} - U_{LV} = \left[\gamma X_{HV} + (1 - \gamma) P_{HV}\right] - \left[\gamma X_{LV} + (1 - \gamma) P_{LV}\right] \quad (1)$$

      If we rearrange (1), we will arrive at:

      $$U_{HV} - U_{LV} = \gamma \left(X_{HV} - X_{LV}\right) + (1 - \gamma) \left(P_{HV} - P_{LV}\right) \quad (2)$$

      While equations (1) and (2) are mathematically equivalent, equation (1) illustrates the interpretation where the comparison of the utilities occurs after value integration and forming an overall value. On the other hand, equation (2) can be broadly interpreted as the comparison of individual attributes in the absence of an overall value estimate for each option. Nonetheless, while we do not exclude the possibility that the distractor effect may occur at multiple levels, we have modified the main manuscript to employ more consistently a terminology referring to different methods of value estimation, while recognizing that our empirical results are compatible with both interpretations.
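      The equivalence of the two readings of the additive rule can be checked numerically. The sketch below assumes that the additive utility is a weighted sum of magnitude and probability with a single weight between 0 and 1; the function names, the name `weight`, and the example values are hypothetical and chosen only for illustration.

```python
import math

def additive_utility(magnitude, probability, weight):
    """Weighted sum of the two attributes (assumed form of the additive rule)."""
    return weight * magnitude + (1.0 - weight) * probability

def compare_after_integration(hv, lv, weight):
    """Reading 1: integrate each option into an overall value, then compare."""
    return additive_utility(*hv, weight) - additive_utility(*lv, weight)

def compare_attribute_wise(hv, lv, weight):
    """Reading 2: compare attribute by attribute, then weight the differences."""
    magnitude_diff = hv[0] - lv[0]
    probability_diff = hv[1] - lv[1]
    return weight * magnitude_diff + (1.0 - weight) * probability_diff

# Each option is a (magnitude, probability) pair; the values are illustrative.
hv = (0.8, 0.6)
lv = (0.5, 0.7)
weight = 0.4

diff_integrated = compare_after_integration(hv, lv, weight)
diff_attribute = compare_attribute_wise(hv, lv, weight)
print(diff_integrated, diff_attribute)
```

      Both functions return the same decision variable, which is why choice data alone cannot separate the two readings unless the attribute-wise comparison involves additional mechanisms.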

      Reviewer #1 Comment 2

      The second weakness is sort of related but is more about the lack of coherent conceptual understanding of the "additive rule", or "distractor effect operates at the attribute level". In an assertive tone (lines 77-80), the manuscript suggests that a weighted sum integration procedure of implementing an "additive rule" is equal to assuming that people compare pairs of attributes separately, without integration. But they are mechanistically distinct. The additive rule (implemented using the weighted sum rule to combine probability and magnitude within each option and then applying the softmax function) assumes value exists before comparing options. In contrast, if people compare pairs of attributes separately, preference forms based on the within-attribute comparisons. Mathematically these two might be equivalent only if no extra mechanisms (such as inhibition, fluctuating attention, evidence accumulation, etc) are included in the within-attribute comparison process, which is hardly true in the three-option decision.

      We thank the reviewer for the comment. As described in our response to Reviewer #1 Comment 1, we are aware and acknowledge that there may be multiple possible interpretations of the additive rule. We also agree with the reviewer that there may be additional mechanisms that are involved in three- or even two-option decisions, but these would require additional studies to tease apart. Another motivation for the approach used here, which does not explicitly model the extra mechanisms the reviewer refers to, was the intention of addressing and integrating findings from previous studies using the same dataset [i.e., (Cao & Tsetsos, 2022; Chau et al., 2020)]. Lastly, regardless of the mechanistic interpretation, our results show a systematic difference in the process of value estimation. Modifications to the manuscript text have been made consistent with our motivation (please refer to the reply and the textual changes proposed in response to the reviewer’s previous comment: Reviewer #1 Comment 1).

      Reviewer #1 Comment 3

      Could the authors comment on the generalizability of the current result? The reward magnitude and probability information are displayed using rectangular bars of different colors and orientations. Would that bias subjects to choose an additive rule instead of the multiplicative rule? Also, could the conclusion be extended to other decision contexts such as quality and price, where a multiplicative rule is hard to formulate?

      We thank the reviewer for the comment. We agree with the observation that the stimulus space, with colour linearly correlated with magnitude, and orientation linearly correlated with probability, may bias subjects towards an additive rule. But that’s indeed the point: in order to maximise reward, subjects should have focused on the outcome space without being driven by the stimulus space. In practice, people are more or less successful in such an endeavour. Nevertheless, we argue that the specific choice of visual stimuli we used is no more biased towards additive space than any other. In fact, as long as two or more pieces of information are provided for each option, as opposed to a single cue whose value was previously learned, there will always be a bias towards an additive heuristic (a linear combination), regardless of whether the cues are shapes, colours, graphs, numbers, or words.

      As the reviewer suggested, the dataset analyzed in the current manuscript suggests that the participants were leaning towards the additive rule. Although there was a general tendency towards using the additive rule while choosing between the rectangular bars, we can still observe a spread of individuals using either, or both, additive and multiplicative rules, suggesting that there was indeed diversity in participants’ decision making strategies in our data.

      In previous studies, it was observed that human and non-human individuals used a mix of multiplicative and additive rules when they were tested on experimental paradigms different from ours (Bongioanni et al., 2021; Farashahi et al., 2019; Scholl et al., 2014). It was also observed that positive and negative distractor effects can be both present in the same data set when human and non-human individuals made decisions about food and social partner (Chang et al., 2019; Louie et al., 2013). It was less clear in the past whether the precise way a distractor affects decision making (i.e., positive/negative distractor effect) is related to the use of decision strategy (i.e., multiplicative/additive rules) and this is exactly what we are trying to address in this manuscript. A follow-up study looking at neural data (such as functional magnetic resonance imaging data) could provide a better understanding of the mechanistic nature of the relationship between distractor effects and decision strategy that we identified here.

      We agree with the reviewer that a multiplicative strategy may not be applicable to some decision contexts. Here it is important to look at the structure of the optimal solution (the one maximizing value in the long run). Factors modulating value (such as probability and temporal delay) require a non-linear (e.g., multiplicative) solution, while factors of the cost-benefit form (such as effort and price) require a linear solution (e.g., subtraction). In the latter scenario the additive heuristic would coincide with the optimal solution, and the effect addressed in this study may not be revealed. Nonetheless, the present data supports the notion of distinct neural mechanisms at least for probabilistic decision-making, and is likely applicable to decision-making in general.

      Our findings, in conjunction with the literature, also suggest that a positive distractor effect could be a general phenomenon in decision mechanisms that involve the medial prefrontal cortex. For example, it has been shown that the positive distractor effect is related to a decision mechanism linked to medial prefrontal cortex [especially the ventromedial prefrontal cortex (Chau et al., 2014; Noonan et al., 2017)]. It is also known that a similar brain region is involved not only when individuals are combining information using a multiplicative strategy (Bongioanni et al., 2021), but also when they are combining information to evaluate new experience or generalize information (Baram et al., 2021; Barron et al., 2013; Park et al., 2021). We have now revised the Discussion to explain this:

      “In contrast, the positive distractor effect is mediated by the mPFC (Chau et al., 2014; Fouragnan et al., 2019). Interestingly, the same or adjacent, interconnected mPFC regions have also been linked to the mechanisms by which representational elements are integrated into new representations (Barron et al., 2013; Klein-Flügge et al., 2022; Law et al., 2023; Papageorgiou et al., 2017; Schwartenbeck et al., 2023). In a number of situations, such as multi-attribute decision making, understanding social relations, and abstract knowledge, the mPFC achieves this by using a spatial map representation characterised by a grid-like response (Constantinescu et al., 2016; Bongioanni et al., 2021; Park et al., 2021) and disrupting mPFC leads to the evaluation of composite choice options as linear functions of their components (Bongioanni et al., 2021). These observations suggest a potential link between positive distractor effects and mechanisms for evaluating multiple component options and this is consistent with the across-participant correlation that we observed between the strength of the positive distractor effect and the strength of non-additive (i.e., multiplicative) evaluation of the composite stimuli we used in the current task. Hence, one direction for model development may involve incorporating the ideas that people vary in their ways of combining choice attributes and each way is susceptible to different types of distractor effect.” (Lines 260-274)

      Reviewer #1 Comment 4

      The authors did careful analyses on quantifying the "distractor effect". While I fully agree that it is important to use the matched two-option trials and examine the interaction terms (DV-HV)T as a control, the interpretation of the results becomes tricky when looking at the effects in each trial type. Figure 2c shows a positive DV-HV effect in two-option trials whereas the DV-HV effect was not significantly stronger in three-option trials. Further in Figure 5b,c, in the Multiplicative group, the effect of DV-HV was absent in the two-option trials and present in the three-option trials. In the Additive group, however, the effect of DV-HV was significantly positive in the two-option trials but was significantly lowered in the three-option trials. Hence, it seems the different distractor effects were driven by the different effects of DV-HV in the two-option trials, rather than the three-option trials?

      We thank the reviewer for the comment. While it may be a bit more difficult to interpret, the current method of examining the (DV−HV)T term rather than (DV−HV) term was used because it was the approach used in a previous study (Cao & Tsetsos, 2022).

      During the design of the original experiments, trials were generated pseudo-randomly until the DV was sufficiently decorrelated from HV−LV. While this method allows for better group-level examination of behaviour, Cao and Tsetsos were concerned that this approach may have introduced unintended confounding covariations to some trials. In theory, one of the unintended covariations could occur between the DV and specific sets of reward magnitude and probability of the HV and LV. The covariation between parameters can lead to an observable positive distractor effect in the DV−HV as a consequence of the attraction effect or an unintended byproduct of using an additive method of integrating attributes [for further elaboration, please refer to Figure 1 in (Cao & Tsetsos, 2022)]. While it may have some limitations, the approach suggested by Cao and Tsetsos has the advantage of leveraging the DV−HV term to absorb any variance contributed by possible confounding factors such that true distractor effects, if any, can be detected using the (DV−HV)T term.
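      The logic of reading the distractor effect from the (DV−HV)T interaction rather than from the DV−HV main term can be illustrated with a reduced regression. The sketch below is not the authors' GLM: it keeps only four regressors, codes T as 1 for three-option trials and 0 for matched two-option trials (an assumption), and uses made-up coefficients.

```python
import math

def design_row(hv, lv, dv, is_ternary):
    """Regressors for one trial (reduced, hypothetical set).

    In a matched two-option trial, dv is the value of the distractor from
    the corresponding three-option trial, which is not actually shown.
    """
    t = 1.0 if is_ternary else 0.0  # assumed coding of the trial-type term T
    return {
        "HV-LV": hv - lv,
        "DV-HV": dv - hv,
        "T": t,
        "(DV-HV)T": (dv - hv) * t,
    }

def choice_log_odds(row, betas, intercept=0.0):
    """Log-odds of choosing HV over LV under a logistic model."""
    return intercept + sum(betas[name] * value for name, value in row.items())

# Illustrative coefficients, not estimates from any dataset.
betas = {"HV-LV": 2.0, "DV-HV": 0.5, "T": 0.0, "(DV-HV)T": 0.3}

def dv_slope(is_ternary, hv=1.0, lv=0.5):
    """Change in log-odds when DV rises by one unit, within one trial type."""
    low = choice_log_odds(design_row(hv, lv, 0.0, is_ternary), betas)
    high = choice_log_odds(design_row(hv, lv, 1.0, is_ternary), betas)
    return high - low

slope_binary = dv_slope(is_ternary=False)
slope_ternary = dv_slope(is_ternary=True)
print(slope_binary, slope_ternary, slope_ternary - slope_binary)
```

      The DV−HV slope in two-option trials reflects whatever covaries with DV in the stimulus set, since no distractor is shown there; only the additional slope in three-option trials, the interaction coefficient, is attributed to the distractor.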

      Reviewer #1 Comment 5

      Note that the pattern described above was different in Supplementary Figure 2, where the effect of DV-HV on the two-option trials was negative for both Multiplicative and Additive groups. I would suggest considering using Supplementary Figure 2 as the main result instead of Figure 5, as it does not rely on multiplicative EV to measure the distraction effect, and it shows the same direction of DV-HV effect on two-option trials, providing a better basis to interpret the (DV-HV)T effect.

      We thank the reviewer for the comments and suggestion. However, as mentioned in the response to Reviewer #1 Comment 4, the current method of analysis adopted in the manuscript and the interpretation of only (DV−HV)T is aimed to address the possibility that the (DV−HV) term may be capturing some confounding effects due to covariation. Given that the debate that is addressed specifically concerns the (DV−HV)T term, we elected to display Figure 5 within the main text and keep the results of the regression after replacing the utility function with the composite model as Supplementary Figure 5 (previously labelled as Supplementary Figure 2).

      Reviewer #2 (Public Review):

      This paper addresses the empirical demonstration of "distractor effects" in multi-attribute decision-making. It continues a debate in the literature on the presence (or not) of these effects, which domains they arise in, and their heterogeneity across subjects. The domain of the study is a particular type of multi-attribute decision-making: choices over risky lotteries. The paper reports a re-analysis of lottery data from multiple experiments run previously by the authors and other laboratories involved in the debate.

      Methodologically, the analysis assumes a number of simple forms for how attributes are aggregated (additively, multiplicatively, or both) and then applies a "reduced form" logistic regression to the choices with a number of interaction terms intended to control for various features of the choice set. One of these interactions, modulated by ternary/binary treatment, is interpreted as a "distractor effect."

      The claimed contribution of the re-analysis is to demonstrate a correlation in the strength/sign of this treatment effect with another estimated parameter: the relative mixture of additive/multiplicative preferences.

      We thank the reviewer for the positive response and are pleased that the reviewer found our report interesting.

      Reviewer #2 Comment 1

      Major Issues

      (1) How to Interpret GLM 1 and 2

      This paper, and others before it, have used a binary logistic regression with a number of interaction terms to attempt to control for various features of the choice set and how they influence choice. It is important to recognize that this modelling approach is not derived from a theoretical claim about the form of the computational model that guides decision-making in this task, nor an explicit test for a distractor effect. This can be seen most clearly in the equations after line 321 and its corresponding log-likelihood after 354, which contain no parameter or test for "distractor effects". Rather the computational model assumes a binary choice probability and then shoehorns the test for distractor effects via a binary/ternary treatment interaction in a separate regression (GLM 1 and 2). This approach has already led to multiple misinterpretations in the literature (see Cao & Tsetsos, 2022; Webb et al., 2020). One of these misinterpretations occurred in the datasets the authors studied, in which the lottery stimuli contained a confound with the interaction that Chau et al. (2014) were interpreting as a distractor effect (GLM 1). Cao & Tsetsos (2022) demonstrated that the interaction was significant in binary choice data from the study, therefore it cannot be caused by a third alternative. This paper attempts to address this issue with a further interaction with the binary/ternary treatment (GLM 2). Therefore the difference in the interaction across the two conditions is claimed to now be the distractor effect. The validity of this claim brings us to what exactly is meant by a "distractor effect."

      The paper begins by noting that "Rationally, choices ought to be unaffected by distractors" (line 33). This is not true. There are many normative models that allow for the value of alternatives (even low-valued "distractors") to influence choices, including a simple random utility model. Since Luce (1959), it has been known that the axiom of "Independence of Irrelevant Alternatives" (that the probability ratio between any two alternatives does not depend on a third) is an extremely strong axiom, and only a sufficiency axiom for a random utility representation (Block and Marschak, 1959). It is not a necessary condition of a utility representation, and if this is our definition of rational (which is highly debatable), not necessary for it either. Countless empirical studies have demonstrated that IIA is falsified, and a large number of models can address it, including a simple random utility model with independent normal errors (i.e. a multivariate Probit model). In fact, it is only the multinomial Logit model that imposes IIA. It is also why so much attention is paid to the asymmetric dominance effect, which is a violation of a necessary condition for random utility (the Regularity axiom).
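
      The IIA property of the multinomial logit can be checked numerically. The sketch below is illustrative only: the option values and function names are hypothetical and are not taken from the manuscript. It shows that under a logit (softmax) choice rule the HV/LV probability ratio is the same whatever value the third option takes.

```python
import math

def logit_choice_probs(values):
    """Multinomial logit (softmax) choice probabilities for a list of option values."""
    exps = [math.exp(v) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def hv_lv_ratio(hv, lv, dv):
    """Ratio P(HV) / P(LV) when a third option with value dv is also available."""
    probs = logit_choice_probs([hv, lv, dv])
    return probs[0] / probs[1]

# Hypothetical option values; only the distractor value changes between the two calls.
ratio_low_distractor = hv_lv_ratio(1.0, 0.5, 0.1)
ratio_high_distractor = hv_lv_ratio(1.0, 0.5, 0.9)

# Under the logit rule both ratios equal exp(HV - LV), independent of the distractor.
print(ratio_low_distractor, ratio_high_distractor, math.exp(0.5))
```

      Only the logit case is shown here; a random utility model with a different error distribution would generally not keep this ratio fixed.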

      So what do the authors even mean by a "distractor effect"? It is true that the form of IIA violations (i.e. their path through the probability simplex as the low-option varies) tells us something about the computational model underlying choice (after all, different models will predict different patterns). However, we do not know how the interaction terms in the binary logit regression relate to the pattern of the violations because there is no formal theory that relates them. Any test for relative value coding is a joint test of the computational model and the form of the stochastic component (Webb et al., 2020). These interaction terms may simply be picking up substitution patterns that can be easily reconciled with some form of random utility. While we cannot check all forms of random utility in these datasets (because the class of such models is large), this paper doesn't even rule any of these models out.

      We thank the reviewer for the comment. In this study, one objective is to address an issue raised by Cao and Tsetsos (2022), suggesting that the distractor effect claimed in the Chau et al. (2014) study was potentially confounded by unintended correlation introduced between the distractor and the chooseable options. They suggested that this could be tested by analyzing the control binary trials and the experimental ternary trials in a single model (i.e., GLM2) and introducing an interaction term (DV−HV)T. The interaction term can partial out any unintended confound and test the distractor effect that was present specifically in the experimental ternary trials. We adopted these procedures in our current studies and employed the interaction term to test the distractor effects. The results showed that overall there was no significant distractor effect in the group. We agree with the reviewer’s comment that if we were only analysing the ternary trials, a multinomial probit model would be suitable because it allows noise correlation between the choices. Alternatively, had a multinomial logistic model been applied, a Hausman-McFadden Test could be run to test whether the data violates the assumption of independence of irrelevant alternatives (IIA). However, in our case, a binomial model is preferred over a multinomial model because of: (1) the inclusion of the binary trials, and (2) the small number of trials in which the distractor was chosen (the median was 4% of all ternary trials).
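
      The logic of pooling binary and ternary trials in one regression can be sketched on simulated data. This is a minimal sketch under stated assumptions, not the authors' GLM2: the regressor set, the 0/1 coding of T, the simulated coefficients, and all function names are hypothetical. With this coding, the (DV−HV) main effect absorbs any effect that is already present on binary trials, and the (DV−HV)T coefficient captures what is added when the distractor is actually shown.

```python
import numpy as np

def zscore(x):
    """Standardise a regressor to zero mean and unit variance."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def build_design(hv, lv, dv, ternary):
    """Columns: intercept, HV-LV, DV-HV, T, (HV-LV)*T, (DV-HV)*T.

    `ternary` is 1 when the distractor was shown and 0 on binary control
    trials (0/1 coding is an assumption made for readability).
    """
    hv, lv, dv = (np.asarray(v, dtype=float) for v in (hv, lv, dv))
    t = np.asarray(ternary, dtype=float)
    diff = zscore(hv - lv)
    dist = zscore(dv - hv)
    return np.column_stack([np.ones_like(t), diff, dist, t, diff * t, dist * t])

def fit_logistic(X, y, n_iter=25, ridge=1e-6):
    """Newton-Raphson fit of a logistic regression with a tiny ridge term."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p) - ridge * beta
        hess = (X * (p * (1.0 - p))[:, None]).T @ X + ridge * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated trials (hypothetical values): a confound of 0.3 on (DV-HV) that is
# present on all trials, plus an extra 0.5 that exists only on ternary trials.
rng = np.random.default_rng(0)
n = 4000
hv = rng.uniform(0.5, 1.0, n)
lv = rng.uniform(0.0, 0.5, n)
dv = rng.uniform(0.0, 1.0, n)
ternary = rng.integers(0, 2, n)
X = build_design(hv, lv, dv, ternary)
true_beta = np.array([0.5, 1.0, 0.3, 0.0, 0.0, 0.5])
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

beta_hat = fit_logistic(X, y)
print("(DV-HV) main effect:", beta_hat[2])
print("(DV-HV)*T interaction:", beta_hat[5])
```

      In this toy setting the interaction coefficient is the quantity of interest, because the main effect would be non-zero even if the distractor had never been presented.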

      However, another main objective of this study is to consider the possibility that the precise distractor effect may vary across individuals. This is exactly why we employed the composite model to estimate each individual’s decision making strategy and investigated how that varied with the precise way the distractor influenced decision making.

      In addition, we think that the reviewer here is raising a profound point and one with which we are in sympathy; it is true that random utility models can predict deviations from the IIA axiom. Central to these approaches is the notion that the representations of the values of choice options are noisy. Thus, when the representation is accessed, it might have a certain value on average but this value might vary from occasion to occasion as if each sample were being drawn from a distribution. As a consequence, the value of a distractor that is “drawn” during a decision between two other options may be larger than the distractor’s average value and may even have a value that is larger than the value drawn from the less valuable choice option’s distribution on the current trial. On such a trial it may become especially clear that the better of the two options has a higher value than the alternative choice option. Our understanding is that Webb, Louie and colleagues (Louie et al., 2013; Webb et al., 2020) suggest an explanation approximately along these lines when they reported a positive distractor effect during some decisions.

      An alternative approach, however, assumes that rather than noise in the representation of the option itself, there is noise in the comparison process when the two options are compared. This is exemplified in many influential decision making models including evidence accumulation models such as drift diffusion models (Shadlen & Shohamy, 2016) and recurrent neural network models of decision making (Wang, 2008). It is this latter type of model that we have used in our previous investigations (Chau et al., 2020; Kohl et al., 2023). However, these two approaches are linked both in their theoretical origin and in the predictions that they make in many situations (Shadlen & Shohamy, 2016). We therefore clarify that this is the case in the revised manuscript as follows:

      “In the current study and in previous work we have used or made reference to models of decision making that assume that a noisy process of choice comparison occurs such as recurrent neural networks and drift diffusion models (Shadlen & Shohamy, 2016; Wang, 2008). Under this approach, positive distractor effects are predicted when the comparison process becomes more accurate because of an impact on the noisy process of choice comparison (Chau et al., 2020; Kohl et al., 2023). However, it is worth noting that another class of models might assume that a choice representation itself is inherently noisy. According to this approach, on any given decision a sample is drawn from a distribution of value estimates in a noisy representation of the option. Thus, when the representation is accessed, it might have a certain value on average but this value might vary from occasion to occasion. As a consequence, the value of a distractor that is “drawn” during decision between two other options may be larger than the distractor’s average value and may even have a value that is larger than the value drawn from the less valuable choice option’s distribution on the current trial. On such a trial it may become especially clear that the better of the two options has a higher value than the alternative choice option. Louie and colleagues (Louie et al., 2013) suggest an explanation approximately along these lines when they reported a positive distractor effect during some decisions. Such different approaches share theoretical origins (Shadlen & Shohamy, 2016) and make related predictions about the impact of distractors on decision making.” (Lines 297-313)

      Reviewer #2 Comment 2

      (2) How to Interpret the Composite (Mixture) model?

      On the other side of the correlation are the results from the mixture model for how decision-makers aggregate attributes. The authors report that most subjects are best represented by a mixture of additive and multiplicative aggregation models. The authors justify this with the proposal that these values are computed in different brain regions and then aggregated (which is reasonable, though raises the question of "where" if not the mPFC). However, an equally reasonable interpretation is that the improved fit of the mixture model simply reflects a misspecification of two extreme aggregation processes (additive and EV), so the log-likelihood is maximized at some point in between them.

      One possibility is a model with utility curvature. How much of this result is just due to curvature in valuation? There are many reasonable theories for why we should expect curvature in utility for human subjects (for example, limited perception: Robson, 2001; Khaw, Li & Woodford, 2019; Netzer et al., 2022) and of course many empirical demonstrations of risk aversion for small stakes lotteries. The mixture model, on the other hand, has parametric flexibility.

      There is also a large literature on testing expected utility jointly with stochastic choice, and the impact of these assumptions on parameter interpretation (Loomes & Sugden, 1998; Apesteguia & Ballester, 2018; Webb, 2019). This relates back to the point above: the mixture may reflect the joint assumption of how choice departs from deterministic EV.

      We thank the reviewer for the comment. They are indeed right to mention the vast literature on curvature in subjective valuation; however, it is important to stress that the predictions of the additive model with linear basis functions are quite distinct from the predictions of a multiplicative model with non-linear basis functions. We have tested the possibility that participants’ behaviour was better explained by the latter and we showed that this was not the case. Specifically, we have added and performed model fitting on an additional model with utility curvature based on prospect theory (Kahneman & Tversky, 1979) with the weighted probability function suggested by Prelec (1998).

      In this model, the reward magnitude and probability (both rescaled to the interval between 0 and 1) are each distorted by their own parameter to give a weighted magnitude and a weighted probability. This prospect theory (PT) model is included along with the four previous models (please refer to Figure 3) in a Bayesian model comparison. Results indicate that the composite model remains the best account of participants’ choice behaviour (exceedance probability = 1.000, estimated model frequency = 0.720). We have now included these results in the main text and Supplementary Figure 2:
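
      A utility of this general kind can be sketched as follows. This is a minimal sketch and not necessarily the exact parameterisation fitted in the manuscript: it assumes a power-law distortion of magnitude and Prelec's one-parameter weighting function, w(p) = exp(−(−ln p)^γ), and the names `alpha`, `gamma` and the function names are illustrative.

```python
import math

def weighted_magnitude(x, alpha):
    """Power-law distortion of a magnitude rescaled to [0, 1] (assumed form)."""
    return x ** alpha

def prelec_weight(p, gamma):
    """Prelec (1998) one-parameter probability weighting: exp(-(-ln p) ** gamma)."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return math.exp(-((-math.log(p)) ** gamma))

def pt_utility(x, p, alpha, gamma):
    """Utility as the product of weighted magnitude and weighted probability."""
    return weighted_magnitude(x, alpha) * prelec_weight(p, gamma)

# With alpha = gamma = 1 the model reduces to expected value (x * p).
print(pt_utility(0.5, 0.4, alpha=1.0, gamma=1.0))
# With gamma < 1 small probabilities are overweighted relative to their true value.
print(prelec_weight(0.05, gamma=0.5))
```

      Setting both distortion parameters to 1 recovers the multiplicative (expected value) model, which is why a model of this form is a natural way to ask whether curvature alone accounts for the data.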

      “Supplementary Figure 2 reports an additional Bayesian model comparison performed while including a model with nonlinear utility functions based on Prospect Theory (Kahneman & Tversky, 1979) with the Prelec formula for probability (Prelec, 1998). Consistent with the above finding, the composite model provides the best account of participants’ choice behaviour (exceedance probability = 1.000, estimated model frequency = 0.720).” (Lines 193-198)

      Reviewer #2 Comment 3

      (3) So then how should we interpret the correlation that the authors report?

      On one side we have the impact of the binary/ternary treatment which demonstrates some impact of the low value alternative on a binary choice probability. This may reflect some deep flaws in existing theories of choice, or it may simply reflect some departure from purely deterministic expected value maximization that existing theories can address. We have no theory to connect it to, so we cannot tell. On the other side of the correlation, we have a mixture between additive and multiplicative preferences over risk. This result may reflect two distinct neural processes at work, or it may simply reflect a misspecification of the manner in which humans perceive and aggregate attributes of a lottery (or even just the stimuli in this experiment) by these two extreme candidates (additive vs. EV). Again, this would entail some departure from purely deterministic expected value maximization that existing theories can address.

      It is entirely possible that the authors are reporting a result that points to the more exciting of these two possibilities. But it is also possible (and perhaps more likely) that the correlation is more mundane. The paper does not guide us to theories that predict such a correlation, nor reject any existing ones. In my opinion, we should be striving for theoretically-driven analyses of datasets, where the interpretation of results is clearer.

      We thank the reviewer for their clear comments. Based on our responses to the previous comments it should be apparent that our results are consistent with several existing theories of choice, so we are not claiming that there are deep flaws in them, but distinct neural processes (additive and multiplicative) are revealed, and this does not reflect a misspecification in the modelling. We have revised our manuscript in the light of the reviewer’s comments in the hope of clarifying the theoretical background which informed both our data analysis and our data interpretation.

      First, we note that there are theoretical reasons to expect a third option might impact on choice valuation. There is a large body of work suggesting that a third option may have an impact on the values of two other options (indeed Reviewer #2 refers to some of this work in their Reviewer #2 Comment 1), but the body of theoretical work originates partly in neuroscience and not just in behavioural economics. In many sensory systems, neural activity changes with the intensity of the stimuli that are sensed. Divisive normalization in sensory systems, however, describes the way in which such neural responses are altered also as a function of other adjacent stimuli (Carandini & Heeger, 2012; Glimcher, 2022; Louie et al., 2011, 2013). The phenomenon has been observed at neural and behavioural levels as a function not just of the physical intensity of the other stimuli but as a function of their associated value (Glimcher, 2014, 2022; Louie et al., 2011, 2015; Noonan et al., 2017; Webb et al., 2020).

      Analogously there is an emerging body of work on the combinatorial processes that describe how multiple representational elements are integrated into new representations (Barron et al., 2013; Papageorgiou et al., 2017; Schwartenbeck et al., 2023). These studies have originated in neuroscience, just as was the case with divisive normalization, but they may have implications for understanding behaviour. For example, they might be linked to behavioural observations that the values assigned to bundles of goods are not necessarily the sum of the values of the individual goods (Hsee, 1998; List, 2002). One neuroscience fact that we know about such processes is that, at an anatomical level, they are linked to the medial frontal cortex (Barron et al., 2013; Fellows, 2006; Hunt et al., 2012; Papageorgiou et al., 2017; Schwartenbeck et al., 2023). A second neuroscientific fact that we know about medial frontal cortex is that it is linked to any positive effects that distractors might have on decision making (Chau et al., 2014; Noonan et al., 2017). Therefore, we might make use of these neuroscientific facts and theories to predict a correlation between positive distractor effects and non-additive mechanisms for determining the integrated value of multi-component choices. This is precisely what we did; we predicted the correlation on the basis of this body of work and when we tested to see if it was present, we found that indeed it was. It may be the case that other behavioural economics theories offer little explanation of the associations and correlations that we find. However, we emphasize that this association is predicted by neuroscientific theory and in the revised manuscript we have attempted to clarify this in the Introduction and Discussion sections:

      “Given the overlap in neuroanatomical bases underlying the different methods of value estimation and the types of distractor effects, we further explored the relationship. Critically, those who employed a more multiplicative style of integrating choice attributes also showed stronger positive distractor effects, whereas those who employed a more additive style showed negative distractor effects. These findings concur with neural data demonstrating that the medial prefrontal cortex (mPFC) computes the overall values of choices in ways that go beyond simply adding their components together, and is the neural site at which positive distractor effects emerge (Barron et al., 2013; Bongioanni et al., 2021; Chau et al., 2014; Fouragnan et al., 2019; Noonan et al., 2017; Papageorgiou et al., 2017), while divisive normalization was previously identified in the posterior parietal cortex (PPC) (Chau et al., 2014; Louie et al., 2011).” (Lines 109-119)

      “At the neuroanatomical level, the negative distractor effect is mediated by the PPC, where signal modulation described by divisive normalization has been previously identified (Chau et al., 2014; Louie et al., 2011). The same region is also crucial for perceptual decision making processes (Shadlen & Shohamy, 2016). The additive heuristics for combining choice attributes are closer to a perceptual evaluation because distances in this subjective value space correspond linearly to differences in physical attributes of the stimuli, whereas normative (multiplicative) value has a non-linear relation with them (cf. Figure 1c). It is well understood that many sensory mechanisms, such as in primates’ visual systems or fruit flies’ olfactory systems, are subject to divisive normalization (Carandini & Heeger, 2012). Hence, the additive heuristics that are more closely based on sensory mechanisms could also be subject to divisive normalization, leading to negative distractor effects in decision making.

      In contrast, the positive distractor effect is mediated by the mPFC (Chau et al., 2014; Fouragnan et al., 2019). Interestingly, the same or adjacent, interconnected mPFC regions have also been linked to the mechanisms by which representational elements are integrated into new representations (Barron et al., 2013; Klein-Flügge et al., 2022; Law et al., 2023; Papageorgiou et al., 2017; Schwartenbeck et al., 2023). In a number of situations, such as multi-attribute decision making, understanding social relations, and abstract knowledge, the mPFC achieves this by using a spatial map representation characterised by a grid-like response (Constantinescu et al., 2016; Bongioanni et al., 2021; Park et al., 2021) and disrupting mPFC leads to the evaluation of composite choice options as linear functions of their components (Bongioanni et al., 2021). These observations suggest a potential link between positive distractor effects and mechanisms for evaluating multiple component options and this is consistent with the across-participant correlation that we observed between the strength of the positive distractor effect and the strength of non-additive (i.e., multiplicative) evaluation of the composite stimuli we used in the current task. Hence, one direction for model development may involve incorporating the ideas that people vary in their ways of combining choice attributes and each way is susceptible to different types of distractor effect.” (Lines 250-274)

      Reviewer #2 Comment 4

      (4) Finally, the results from these experiments might not have external validity for two reasons. First, the normative criterion for multi-attribute decision-making differs depending on whether the attributes are lotteries or not (i.e. multiplicative vs additive). Whether it does so for humans is a matter of debate. Therefore if the result is unique to lotteries, it might not be robust for multi-attribute choice more generally. The paper largely glosses over this difference and mixes literature from both domains. Second, the lottery information was presented visually and there is literature suggesting this form of presentation might differ from numerical attributes. Which is more ecologically valid is also a matter of debate.

      We thank the reviewer for the comment. Indeed, they are right that the correlation we find between value estimation style and distractor effects may not be detected in all contexts of human behaviour. What the reviewer suggests goes along the same lines as our response to Reviewer #1 Comment 3: multi-attribute value estimation may have different structures. In some cases, the optimal solution may require a non-linear (e.g., multiplicative) response, as in probabilistic or delayed decisions, but in other cases (e.g., when estimating the value of a snack based on its taste, size, healthiness, and price) a linear integration would suffice. In the latter kind of scenario, both the optimal and the heuristic solutions may be additive and people’s value estimation “style” may not be teased apart. However, if different neural mechanisms associated with different estimation processes are observed in certain scenarios, it suggests that these mechanisms are always present, even in scenarios where they do not alter the predictions. Probabilistic decision-making is also pervasive in many aspects of daily life and not just limited to the case of lotteries.

      While behaviour has been found to differ depending on whether lottery information is presented graphically or numerically, there is insufficient evidence to suggest biases towards additive or multiplicative evaluation, or towards positive or negative distractor effects. As such, we may expect that the correlation that we reveal in this paper, grounded in distinct neural mechanisms, would still hold even under different circumstances.

      Taking previous literature as examples, similar patterns of behaviour have been observed in humans when making decisions during ternary choice tasks. In studies conducted by Louie and colleagues (Louie et al., 2013; Webb et al., 2020), human participants performed a snack choice task where their behaviour could be modelled by divisive normalization with a biphasic response (i.e., both positive and negative distractor effects). While these two studies only use a single numerical value of price for behavioural modelling, these prices should originate from an internal computation of various attributes related to each snack that are not purely related to lotteries. Expanding towards the social domain, studies of ternary decision making have considered face attractiveness and averageness (Furl, 2016), desirability of hiring (Chang et al., 2019), as well as desirability of candidates during voting (Chang et al., 2019). These choices involve considering various attributes unrelated to lotteries or numbers and yet still display a combination of positive distractor and negative distractor (i.e. divisive normalization) effects, as in the current study. In particular, the experiments carried out by Chang and colleagues (Chang et al., 2019) involved decisions in a social context that resemble real-world situations. These findings suggest that both types of distractor effects can co-exist in other value-based decision making tasks (Li et al., 2018; Louie et al., 2013) as well as decision making tasks in social contexts (Chang et al., 2019; Furl, 2016).

      Reviewer #2 Comment 5

      Minor Issues:

      The definition of EV as a normative choice baseline is problematic. The analysis requires that EV is the normative choice model (this is why the HV-LV gap is analyzed and the distractor effect defined in relation to it). But if the binary/ternary interaction effect can be accounted for by curvature of a value function, this should also change the definition of which lottery is HV or LV for that subject!

      We thank the reviewer for the comment. While the initial part of the paper discussed results that were defined by the EV model, the results shown in Supplementary Figure 2 were generated by replacing the EV-based utilities with values obtained from the composite model. Here, we also redefined HV and LV for each subject according to the updated values generated by the composite model prior to the regression.

      References

      Apesteguia, J. & Ballester, M. Monotone stochastic choice models: The case of risk and time preferences. Journal of Political Economy (2018).

      Block, H. D. & Marschak, J. Random Orderings and Stochastic Theories of Responses. Cowles Foundation Discussion Papers (1959).

      Khaw, M. W., Li, Z. & Woodford, M. Cognitive Imprecision and Small-Stakes Risk Aversion. Rev. Econ. Stud. 88, 1979-2013 (2020).

      Loomes, G. & Sugden, R. Testing Different Stochastic Specifications of Risky Choice. Economica 65, 581-598 (1998).

      Luce, R. D. Individual Choice Behavior. (John Wiley and Sons, Inc., 1959).

      Netzer, N., Robson, A. J., Steiner, J. & Kocourek, P. Endogenous Risk Attitudes. SSRN Electron. J. (2022) doi:10.2139/ssrn.4024773.

      Robson, A. J. Why would nature give individuals utility functions? Journal of Political Economy 109, 900-914 (2001).

      Webb, R. The (Neural) Dynamics of Stochastic Choice. Manage Sci 65, 230-255 (2019).

      Reviewer #3 (Public Review):

      Summary:

      The way an unavailable (distractor) alternative impacts decision quality is of great theoretical importance. Previous work, led by some of the authors of this study, had converged on a nuanced conclusion wherein the distractor can both improve (positive distractor effect) and reduce (negative distractor effect) decision quality, contingent upon the difficulty of the decision problem. In very recent work, Cao and Tsetsos (2022) reanalyzed all relevant previous datasets and showed that once distractor trials are referenced to binary trials (in which the distractor alternative is not shown to participants), distractor effects are absent. Cao and Tsetsos further showed that human participants heavily relied on additive (and not multiplicative) integration of rewards and probabilities.

      The present study by Wong et al. puts forward a novel thesis according to which interindividual differences in the way of combining reward attributes underlie the absence of a detectable distractor effect at the group level. They re-analysed data from 144 human participants and classified participants into a "multiplicative integration" group and an "additive integration" group based on a model parameter, the "integration coefficient", that interpolates between the multiplicative utility and the additive utility in a mixture model. They report that participants in the "multiplicative" group show a positive distractor effect while participants in the "additive" group show a negative distractor effect. These findings are extensively discussed in relation to the potential underlying neural mechanisms.
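
      The interpolation performed by such a mixture model can be sketched in a few lines. This is a minimal sketch under stated assumptions rather than the authors' implementation: the parameter names (`eta` for the integration coefficient, `gamma` for the magnitude/probability weighting in the additive part, `inv_temp` for the choice noise) and the example options are hypothetical.

```python
import math

def composite_utility(x, p, eta, gamma=0.5):
    """Mix multiplicative (x * p) and additive (gamma*x + (1-gamma)*p) utilities.

    eta = 1 gives a purely multiplicative (expected value) evaluation,
    eta = 0 a purely additive one. x and p are assumed rescaled to [0, 1].
    """
    multiplicative = x * p
    additive = gamma * x + (1.0 - gamma) * p
    return eta * multiplicative + (1.0 - eta) * additive

def p_choose_first(u_first, u_second, inv_temp=5.0):
    """Logistic probability of choosing the first of two options."""
    return 1.0 / (1.0 + math.exp(-inv_temp * (u_first - u_second)))

# Two hypothetical options: A is large but unlikely, B is moderate on both attributes.
option_a = (0.9, 0.2)
option_b = (0.5, 0.5)

for eta in (0.0, 0.5, 1.0):
    u_a = composite_utility(*option_a, eta=eta)
    u_b = composite_utility(*option_b, eta=eta)
    print(eta, round(u_a, 3), round(u_b, 3), round(p_choose_first(u_a, u_b), 3))
```

      In this example a purely additive decision-maker prefers option A while a purely multiplicative one prefers option B, which is the kind of divergence that allows an integration coefficient to be estimated from choices. Note that the utilities of the two chooseable options do not depend on any third option, so a model of this form does not by itself generate a distractor effect.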

      Strengths:

      - The study is forward-looking, integrating previous findings well, and offering a novel proposal on how different integration strategies can lead to different choice biases.

      - The authors did an excellent job of connecting their thesis with previous neural findings. This is a very encompassing perspective that is likely to motivate new studies towards a better understanding of how humans and other animals integrate information in decisions under risk and uncertainty.

      - Although some aspects of the paper are very technical, methodological details are well explained and the paper is very well written.

      We thank the reviewer for the positive response and are pleased that the reviewer found our report interesting.

      Reviewer #3 Comment 1

      Weaknesses:

      The authors quantify the distractor variable as "DV - HV", i.e., the relative distractor variable. Do the conclusions hold when the distractor is quantified in absolute terms (as "DV", see also Cao & Tsetsos, 2023)? Similarly, the authors show in Suppl. Figure 1 that the inclusion of a HV + LV regressor does not alter their conclusions. However, the (HV + LV)*T regressor was not included in this analysis. Does including this interaction term alter the conclusions considering there is a high correlation between (HV + LV)*T and (DV - HV)*T? More generally, it will be valuable if the authors assess and discuss the robustness of their findings across different ways of quantifying the distractor effect.

      We thank the reviewer for the comment. In the original manuscript we had already demonstrated that the distractor effect was related to the integration coefficient using a number of complementary analyses. They include Figure 5 based on GLM2, Supplementary Figure 3 based on GLM3 (i.e., adding the HV+LV term to GLM2), and Supplementary Figure 4 based on GLM2 but applying the utility estimate from the composite model instead of expected value (EV). These three sets of analyses produced comparable results. The reason why we elected not to include the (HV+LV)T term in GLM3 (Supplementary Figure 3) was due to the collinearity between the regressors in the GLM. If this term is included in GLM3, the variance inflation factor (VIF) would exceed an acceptable level of 4 for some regressors. In particular, the VIF for the (HV+LV) and (HV+LV)T regressors is 5.420, while the VIF for (DV−HV) and (DV−HV)T is 4.723.
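
      The variance inflation factor for a regressor is 1 / (1 − R²), where R² comes from regressing that regressor on all the others. The sketch below illustrates, on simulated data, why a regressor and its interaction with a trial-type indicator tend to be collinear. The data, the proportion of ternary trials, and the function name are hypothetical and are not the regressors from the manuscript.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (X has no intercept column)."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(len(target)), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        r_squared = 1.0 - resid.var() / target.var()
        factors.append(1.0 / (1.0 - r_squared))
    return np.array(factors)

# Hypothetical regressor `a`, a trial-type indicator `t` (two thirds ternary),
# and their interaction, which shares most of its variance with `a`.
rng = np.random.default_rng(1)
a = rng.normal(size=600)
t = (np.arange(600) % 3 != 0).astype(float)
X = np.column_stack([a, t, a * t])

print(vif(X))
```

      The main term and its interaction both receive inflated values here, while the indicator itself stays close to 1, which mirrors the pattern described for the paired regressors above.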

      Here, however, we consider the additional analysis suggested by the reviewer and test whether similar results are obtained. We constructed GLM4 including the (HV+LV)T term but replacing the relative distractor value (DV-HV) with the absolute distractor value (DV) in the main term and its interactions, as follows:

      GLM4:

      A significant negative (DV)T effect was found for the additive group [t(72)=−2.0253, p=0.0465] while the multiplicative group had a positive trend despite not reaching significance. Between the two groups, the (DV)T term was significantly different [t(142)=2.0434, p=0.0429]. While these findings suggest that the current conclusions could be partially replicated, simply replacing the relative distractor value with the absolute value in the previous analyses resulted in non-significant findings. Taking these results together with the main findings, it is possible to conclude that the positive distractor effect is better captured using the relative DV-HV term rather than the absolute DV term. This would be consistent with the way in which option values are envisaged to interact with one another in the mutual inhibition model (Chau et al., 2014, 2020) that generates the positive distractor effect. The model suggests that evidence is accumulated as the difference between the excitatory input from the option (e.g. the HV option) and the pooled inhibition contributed partly by the distractor. We have now included these results in the manuscript:
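
      The two kinds of test reported above can be sketched as follows, assuming a standard one-sample t-test on per-participant regression weights within a group and a pooled-variance two-sample t-test between groups. The toy numbers and function names are hypothetical. Under these assumptions the reported degrees of freedom, 72 and 142, would correspond to 73 participants in the additive group and 144 participants in total.

```python
import math
from statistics import mean, variance

def one_sample_t(weights):
    """t statistic and degrees of freedom for testing mean(weights) against 0."""
    n = len(weights)
    t_stat = mean(weights) / math.sqrt(variance(weights) / n)
    return t_stat, n - 1

def two_sample_t(weights_a, weights_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(weights_a), len(weights_b)
    pooled = ((n_a - 1) * variance(weights_a) + (n_b - 1) * variance(weights_b)) / (n_a + n_b - 2)
    t_stat = (mean(weights_a) - mean(weights_b)) / math.sqrt(pooled * (1.0 / n_a + 1.0 / n_b))
    return t_stat, n_a + n_b - 2

# Toy per-participant interaction weights for two small hypothetical groups.
group_a = [1.0, 2.0, 3.0]
group_b = [2.0, 3.0, 4.0]
print(one_sample_t(group_a))
print(two_sample_t(group_a, group_b))
```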

      “Finally, we performed three additional analyses that revealed comparable results to those shown in Figure 5. In the first analysis, reported in Supplementary Figure 3, we added an (HV+LV) term to the GLM, because this term was included in some analyses of a previous study that used the same dataset (Chau et al., 2020). In the second analysis, we added an (HV+LV)T term to the GLM. We noticed that this change led to inflation of the collinearity between the regressors and so we also replaced the (DV−HV) term by the DV term to mitigate the collinearity (Supplementary Figure 4). In the third analysis, reported in Supplementary Figure 5, we replaced the utility terms of GLM2. Since the above analyses involved using HV, LV, and DV values defined by the normative Expected Value model, here, we re-defined the values using the composite model prior to applying GLM2. Overall, in the Multiplicative Group a significant positive distractor effect was found in Supplementary Figures 3 and 4. In the Additive Group a significant negative distractor effect was found in Supplementary Figures 3 and 5. Crucially, all three analyses consistently showed that the distractor effects were significantly different between the Multiplicative Group and the Additive Group.” (Lines 225-237)

      Reviewer #3 Comment 2

      The central finding of this study is that participants who integrate reward attributes multiplicatively show a positive distractor effect while participants who integrate additively show a negative distractor effect. This is a very interesting and intriguing observation. However, there is no explanation as to why the integration strategy covaries with the direction of the distractor effect. It is unlikely that the mixture model generates any distractor effect as it combines two "context-independent" models (additive utility and expected value) and is fit to the binary-choice trials. The authors can verify this point by quantifying the distractor effect in the mixture model. If that is the case, it will be important to highlight that the composite model is not explanatory; and defer a mechanistic explanation of this covariation pattern to future studies.

      We thank the reviewer for the comment. Indeed, the main purpose of applying the mixture model was to identify the way each participant combined attributes and, as the reviewer pointed out, the mixture model per se is context independent. While we acknowledge that the mixture model is not a mechanistic explanation, there is a theoretical basis for the observation that these two factors are linked.

      Firstly, studies that have examined the processes involved when humans combine and integrate different elements to form new representations (Barron et al., 2013; Papageorgiou et al., 2017; Schwartenbeck et al., 2023) have implicated the medial frontal cortex as a crucial region (Barron et al., 2013; Fellows, 2006; Hunt et al., 2012; Papageorgiou et al., 2017; Schwartenbeck et al., 2023). Meanwhile, previous studies have also identified that positive distractor effects are linked to the medial frontal cortex (Chau et al., 2014; Noonan et al., 2017). Therefore, the current study utilized these two facts to establish the basis for a correlation between positive distractor effects and non-additive mechanisms for determining the integrated value of multi-component choices. Nevertheless, we agree with the reviewer that it will be an important future direction to look at how the covariation pattern emerges in a computational model. We have revised the manuscript in an attempt to address this issue.

      “At the neuroanatomical level, the negative distractor effect is mediated by the PPC, where signal modulation described by divisive normalization has been previously identified (Chau et al., 2014; Louie et al., 2011). The same region is also crucial for perceptual decision making processes (Shadlen & Shohamy, 2016). The additive heuristics for combining choice attributes are closer to a perceptual evaluation because distances in this subjective value space correspond linearly to differences in physical attributes of the stimuli, whereas normative (multiplicative) value has a non-linear relation with them (cf. Figure 1c). It is well understood that many sensory mechanisms, such as in primates’ visual systems or fruit flies’ olfactory systems, are subject to divisive normalization (Carandini & Heeger, 2012). Hence, the additive heuristics that are more closely based on sensory mechanisms could also be subject to divisive normalization, leading to negative distractor effects in decision making.

      In contrast, the positive distractor effect is mediated by the mPFC (Chau et al., 2014; Fouragnan et al., 2019). Interestingly, the same or adjacent, interconnected mPFC regions have also been linked to the mechanisms by which representational elements are integrated into new representations (Barron et al., 2013; Klein-Flügge et al., 2022; Law et al., 2023; Papageorgiou et al., 2017; Schwartenbeck et al., 2023). In a number of situations, such as multi-attribute decision making, understanding social relations, and abstract knowledge, the mPFC achieves this by using a spatial map representation characterised by a grid-like response (Constantinescu et al., 2016; Bongioanni et al., 2021; Park et al., 2021) and disrupting mPFC leads to the evaluation of composite choice options as linear functions of their components (Bongioanni et al., 2021). These observations suggest a potential link between positive distractor effects and mechanisms for evaluating multiple component options and this is consistent with the across-participant correlation that we observed between the strength of the positive distractor effect and the strength of non-additive (i.e., multiplicative) evaluation of the composite stimuli we used in the current task. Hence, one direction for model development may involve incorporating the ideas that people vary in their ways of combining choice attributes and each way is susceptible to different types of distractor effect.” (Lines 250-274)

      Reviewer #3 Comment 3

      -  Correction for multiple comparisons (e.g., Bonferroni-Holm) was not applied to the regression results. Is the "negative distractor effect in the Additive Group" (Fig. 5c) still significant after such correction? Although this does not affect the stark difference between the distractor effects in the two groups (Fig. 5a), the classification of the distractor effect in each group is important (i.e., should future modelling work try to capture both a negative and a positive effect in the two integration groups? Or just a null and a positive effect?).

      We thank the reviewer for the comment. We have performed Bonferroni-Holm correction and, as the reviewer surmised, the negative distractor effect in the additive group becomes non-significant. However, we have to emphasize that our major claim is that there was a covariation between decision strategy (of combining attributes) and distractor effect (as seen in Figure 4). That analysis does not involve multiple comparisons. The analysis in Figure 5 that splits participants into two groups was mainly designed to illustrate the effects for an easier understanding by a more general audience. In many cases, the precise ways in which participants are divided into subgroups can have a major impact on whether each individual group’s effects are significant or not. It may be possible to identify an optimal way of grouping, but we refrained from taking such a trial-and-error approach, especially for the analysis in Figure 5 that simply supplements the point made in Figure 4. The key notion we would like the readers to take away is that there is a spectrum of distractor effects (ranging from negative to positive) that will vary depending on how the choice attributes were integrated.
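
      For readers unfamiliar with the correction discussed here, the following is a minimal sketch of the Bonferroni-Holm step-down rule. It is not the authors' code, and the p-values in the usage note are made up for illustration. The function name `holm_bonferroni` and the default `alpha` are assumptions.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return one boolean per test: True where the null is rejected.

    Holm's step-down rule: sort the p-values in ascending order, compare the
    k-th smallest (k = 0, 1, ...) with alpha / (m - k), and stop at the first
    failure. Every test after the first failure is also not rejected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        threshold = alpha / (m - rank)
        if p_values[idx] <= threshold:
            reject[idx] = True
        else:
            break  # step-down: all larger p-values fail as well
    return reject


# Hypothetical p-values, not taken from the study.
decisions = holm_bonferroni([0.04, 0.001, 0.03])
```

      With these illustrative inputs, a p-value of 0.04 that would pass an uncorrected 0.05 threshold is no longer rejected, which is the kind of change described in the reply above.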

      Reviewer #1 (Recommendations For The Authors):

      Reviewer #1 Recommendations 1

      Enhancements are necessary for the quality of the scientific writing. Several sentences have been written in a negligent manner and warrant revision to ensure a higher level of rigor. Moreover, a number of sentences lack appropriate citations, including but not restricted to:

      - Line 39-41.

      - Line 349-350 (also please clarify what is meant by "parameter estimate" being very accurate: correlation?).

      We thank the reviewer for the comment. We have made revisions to various parts of the manuscript to address the reviewer’s concerns.

      “Intriguingly, most investigations have considered the interaction between distractors and chooseable options either at the level of their overall utility or at the level of their component attributes, but not both (Chau et al., 2014, 2020; Gluth et al., 2018).” (Lines 40-42)

      “Additional simulations have shown that the fitted parameters can be recovered with high accuracy (i.e., with a high correlation between generative and recovered parameters).” (Lines 414-416)

      Reviewer #1 Recommendations 2

      Some other minor suggestions:

      - Correlative vs. Causality: the manuscript exhibits a lack of attentiveness in drawing causal conclusions from correlative evidence (manuscript title, Line 91, Line 153-155).

      - When displaying effect size on accuracy, there is no need to show the significance of intercept (Figure 2,5, & supplementary figures).

      - Adding some figure titles on Figure 2 so it is clear what each panel stands for.

      - In Figure 3, the dots falling on zero values are not easily seen. Maybe increasing the dot size a little?

      - Line 298: binomial linking function (instead of binomial distribution).

      - Line 100: composite, not compositive.

      - Line 138-139: please improve the sentence, if it's consistent with previous findings, what's the point of "surprisingly"?

      We thank the reviewer for the suggestions. We have made revisions to the title and various parts of the manuscript to address the reviewer’s concerns.

      - Correlative vs. Causality: the manuscript exhibits a lack of attentiveness in drawing causal conclusions from correlative evidence (manuscript title, Line 91, Line 153-155).

      We have now revised the manuscript:

      “Distractor effects in decision making are related to the individual’s style of integrating choice attributes” (title of the manuscript)

      “More particularly, we consider whether individual differences in combination styles could be related to different forms of distractor effect.” (Lines 99-100)

      “While these results may seem to suggest that a distractor effect was not present at an overall group level, we argue that the precise way in which a distractor affects decision making is related to how individuals integrate the attributes.” (Lines 164-167)

      - When displaying effect size on accuracy, there is no need to show the significance of intercept (Figure 2,5, & supplementary figures).

      We have also modified all Figures to remove the intercept.

      - Adding some figure titles on Figure 2 so it is clear what each panel stands for.

      We have added titles accordingly.

      - In Figure 3, the dots falling on zero values are not easily seen. Maybe increasing the dot size a little?

      In conjunction with addressing Reviewer #3 Recommendation 6, we have adapted the violin plots into histograms for a better representation of the values.

      - Line 298: binomial linking function (instead of binomial distribution).

      - Line 100: composite, not compositive.

      - Line 138-139: please improve the sentence, if it's consistent with previous findings, what's the point of "surprisingly"?

      We have made revisions accordingly.

      Reviewer #2 (Recommendations For The Authors):

      Reviewer #2 Recommendations 1

      Line 294. The definition of DV, HV, LV is not sufficient. Presumably, these are the U from the following sections? Or just EV? But this is not explicitly stated, rather they are vaguely referred to as "values." The computational modelling section refers to them as utilities. Are these the same thing?

      We thank the reviewer for the suggestion. We have clarified the exact method for calculating each of the values and updated the section accordingly.

      “where HV, LV, and DV refer to the values of the chooseable higher value option, chooseable lower value option, and distractor, respectively. Here, values (except those in Supplementary Figure 5) are defined as Expected Value (EV), calculated by multiplying magnitude and probability of reward.” (Lines 348-350)
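
      The definition quoted above can be sketched in a few lines. This is an illustration only: the function names, the dictionary keys, and the example numbers are assumptions, and attributes are taken to be already rescaled to the 0-1 interval.

```python
def expected_value(magnitude, probability):
    """Expected Value (EV): reward magnitude multiplied by reward probability."""
    return magnitude * probability


def label_options(option_a, option_b, distractor):
    """Label the two chooseable options as HV and LV and compute value terms.

    Each argument is a (magnitude, probability) pair. The returned keys are
    hypothetical names for the quantities referred to in the text.
    """
    ev_a = expected_value(*option_a)
    ev_b = expected_value(*option_b)
    hv, lv = max(ev_a, ev_b), min(ev_a, ev_b)
    dv = expected_value(*distractor)
    return {"HV": hv, "LV": lv, "DV": dv, "HV-LV": hv - lv, "DV-HV": dv - hv}


# Made-up trial: two chooseable options and one distractor.
trial = label_options((0.8, 0.5), (0.4, 0.5), (0.5, 0.5))
```

      The "DV-HV" entry corresponds to the relative distractor term and "DV" to the absolute one, the two quantifications contrasted elsewhere in this exchange.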

      Reviewer #2 Recommendations 2

      The analysis drops trials in which the distractor was chosen. These trials are informative about the presence (or not) of relative valuation or other factors because they make such choices more (or less) likely. Ignoring them is another example of the analysis being misspecified.

      We thank the reviewer for the suggestion and this is related to Major Issue 1 raised by the same reviewer. In brief, we adopted the same methods implemented by Cao and Tsetsos (Cao and Tsetsos, 2022) and that constrained us to applying a binomial model. Please refer to our reply to Major Issue 1 for more details.

      Reviewer #2 Recommendations 3

      Some questions and suggestions on statistics and computational modeling:

      Have the authors looked at potential collinearity between the regressors in each of the GLMs?

      We thank the reviewer for the comment. For each of the following GLMs, the average variance inflation factor (VIF) has been calculated as follows:

      GLM2 using the Expected Value model:

      Author response table 1.

      GLM2 after replacing the utility function based on the normative Expected Value model with values obtained by using the composite model:

      Author response table 2.

      GLM3:

      Author response table 3.

      As indicated by the average VIF values, none exceeds 4, suggesting that the estimated coefficients were not inflated by collinearity between the regressors in each of the GLMs.
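
      As a reminder of what the VIF measures, the sketch below computes it for a small set of regressors. It relies on the standard identity that the VIF of regressor j, 1/(1 − R_j²), equals the j-th diagonal entry of the inverse correlation matrix. The helper names are assumptions, the code is not the authors' pipeline, and it will raise a `ZeroDivisionError` if a regressor is constant or the regressors are perfectly collinear.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """VIF per regressor: the diagonal of the inverse correlation matrix."""
    k = len(columns)
    corr = [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]
    inverse = invert(corr)
    return [inverse[j][j] for j in range(k)]
```

      Uncorrelated regressors give a VIF of 1, and the value grows without bound as one regressor becomes predictable from the others, which is why a ceiling such as the value of 4 cited above is used as a rule of thumb.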

      Reviewer #2 Recommendations 4

      - Correlation results in Figure 4. What is the regression line displayed on this plot? I suspect the regression line came from Pearson's correlation, which would be inconsistent with the Spearman's correlation reported in the text. A reasonable way would be to transform both x and y axes to the ranked data. However, I wonder why it makes sense to use ranked data for testing the correlation in this case. Those are both scalar values. Also, did the authors assess the influence of the zero integration coefficient on the correlation result? Importantly, did the authors redo the correlation plot after defining the utility function by the composite models?

      We thank the reviewer for the suggestion. The plotted line in Figure 4 was based on the Pearson’s correlation and we have modified the text to also report the Pearson’s correlation result.

      If we exclude the 32 participants with integration coefficients smaller than 1×10⁻⁶ from the analysis, we still observe a significant positive Pearson’s correlation [r(110)=0.202, p=0.0330].

      Author response image 1.

      Figure 4 after excluding 32 participants with integration coefficients smaller than 1×10⁻⁶.

      “As such, we proceeded to explore how the distractor effect (i.e., the effect of (DV−HV)T obtained from GLM2; Figure 2c) was related to the integration coefficient (η) of the optimal model via a Pearson’s correlation (Figure 4). As expected, a significant positive correlation was observed [r(142)=0.282, p=0.000631]. We noticed that there were 32 participants with integration coefficients that were close to zero (below 1×10⁻⁶). The correlation remained significant even after removing these participants [r(110)=0.202, p=0.0330].” (Lines 207-212)

      The last question relates to results already included in Supplementary Figure 5, in which the analyses were conducted using the utility function of the composite model. We notice that although there was a difference in integration coefficient between the multiplicative and additive groups, a correlational analysis did not generate significant results [r(142)=0.124, p=0.138]. It is possible that the relationship became less linear after applying the composite model utility function. However, it is noticeable that in a series of complementary analyses (Figure 5: r(142)=0.282, p=0.000631; Supplementary Figure 3: r(142)=0.278, p=0.000746) comparable results were obtained.
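
      The Pearson versus Spearman question raised by the reviewer can be made concrete with a short sketch. Spearman's coefficient is Pearson's coefficient computed on ranks, and the degrees of freedom reported in brackets are n − 2, so 144 participants give r(142) and 112 participants give r(110). The function names are assumptions and this is not the authors' analysis code.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranked data."""
    return pearson_r(ranks(x), ranks(y))


def t_statistic(r, n):
    """t value and degrees of freedom for testing a correlation against zero."""
    df = n - 2
    return r * sqrt(df / (1 - r * r)), df
```

      Because many participants have integration coefficients near zero, their ranks are tied or nearly tied, which is one reason the two coefficients can diverge on data of this kind.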

      Reviewer #2 Recommendations 5

      - From lines 163-165, were the models tested on only the three-option trials or both two- and three-option trials? It is ambiguous from the description here. It might be worth checking the model comparison based on different trial types, and the current model fitting results do not give an absolute sense of the goodness of fit. I would suggest including the correctly predicted trial proportions in each trial type from different models.

      We thank the reviewer for the suggestion. We have only modeled the two-option trials; the key reason is that the two-option trials can arguably provide a better estimate of participants’ style of integrating attributes, as they are independent of any distractor effects. This was also the reason why Cao and Tsetsos applied the same approach when they were re-analyzing our data (Cao and Tsetsos, 2022). We have clarified the statement accordingly.

      “We fitted these models exclusively to the Two-Option Trial data and not the Distractor Trial data, such that the fitting (especially that of the integration coefficient) was independent of any distractor effects, and tested which model best describes participants’ choice behaviours.” (Lines 175-178)

      Reviewer #2 Recommendations 6

      - Along with displaying the marginal distributions of each parameter estimate, a correlation plot of these model parameters might be useful, given that some model parameters are multiplied in the value functions.

      We thank the reviewer for the suggestion. We have also generated the correlation plot of the model parameters. The Pearson’s correlation between the magnitude/probability weighting and integration coefficient was significant [r(142)=−0.259, p=0.00170]. The Pearson’s correlation between the inverse temperature and integration coefficient was not significant [r(142)=−0.0301, p=0.721]. The Pearson’s correlation between the inverse temperature and magnitude/probability weighting was not significant [r(142)=−0.0715, p=0.394].

      “Our finding that the average integration coefficient η was 0.325 coincides with previous evidence that people were biased towards using an additive, rather than a multiplicative rule. However, it also shows that, rather than being fully additive (η=0) or multiplicative (η=1), people’s choice behaviour is best described as a mixture of both. Supplementary Figure 1 shows the relationships between all the fitted parameters.” (Lines 189-193)
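
      To make the role of the integration coefficient concrete, here is a minimal sketch of a composite utility of the kind described above. The exact functional form used in the manuscript is not reproduced in this response, so the blend below is an assumption that only respects the stated endpoints (η=0 additive, η=1 multiplicative). The names `eta`, `gamma`, and `inverse_temperature` are illustrative, and both attributes are assumed to be rescaled to the 0-1 interval.

```python
from math import exp


def composite_utility(magnitude, probability, eta, gamma):
    """Blend a multiplicative and an additive utility (assumed form).

    eta   : integration coefficient; 1 -> purely multiplicative, 0 -> purely additive.
    gamma : hypothetical weight on magnitude relative to probability in the
            additive part.
    """
    multiplicative = magnitude * probability
    additive = gamma * magnitude + (1 - gamma) * probability
    return eta * multiplicative + (1 - eta) * additive


def choice_probability(u_a, u_b, inverse_temperature):
    """Logistic (softmax) probability of choosing option A over option B."""
    return 1 / (1 + exp(-inverse_temperature * (u_a - u_b)))


# Made-up option: magnitude 0.8, probability 0.5, equal additive weights.
u_mult = composite_utility(0.8, 0.5, eta=1.0, gamma=0.5)
u_add = composite_utility(0.8, 0.5, eta=0.0, gamma=0.5)
u_mixed = composite_utility(0.8, 0.5, eta=0.325, gamma=0.5)
```

      In this sketch an intermediate η yields a utility lying between the additive and multiplicative values, and because the parameters multiply one another inside the utility, checking their pairwise correlations as the reviewer requested is a sensible diagnostic.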

      Reviewer #2 Recommendations 7

      Have the authors tried any functional transformations on amounts or probabilities before applying the weighted sum? The two attributes are on entirely different scales and thus may not be directly summed together.

      We thank the reviewer for the comment. Amounts and probabilities were indeed both rescaled to the 0-1 interval before being summed, as explained in the methods (Line XXX). Additionally, we have now added and performed model fitting on an additional model with utility curvature based on the prospect theory (Kahneman & Tversky, 1979) and a weighted probability function (Prelec, 1998):

      where  and  represent the reward magnitude and probability (both rescaled to the interval between 0 and 1), respectively.  is the weighted magnitude and  is the weighted probability, while  and  are the corresponding distortion parameters. This prospect theory (PT) model was included along with the four previous models (please refer to Figure 3) in a Bayesian model comparison. Results indicate that the composite model remains as the best account of participants’ choice behaviour (exceedance probability = 1.000, estimated model frequency = 0.720).

      “Supplementary Figure 2 reports an additional Bayesian model comparison performed while including a model with nonlinear utility functions based on Prospect Theory (Kahneman & Tversky, 1979) with the Prelec formula for probability (Prelec, 1998). Consistent with the above finding, the composite model provides the best account of participants’ choice behaviour (exceedance probability = 1.000, estimated model frequency = 0.720).” (Lines 193-198)
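
      The equation for the prospect theory model did not survive in the text above, so the sketch below uses the common textbook forms instead: a power-law utility for magnitude and the one-parameter Prelec weighting function for probability. Whether the authors used exactly these forms is an assumption, and the parameter names `alpha` and `gamma` are placeholders, not the symbols from the manuscript.

```python
from math import exp, log


def weighted_magnitude(x, alpha):
    """Power-law utility curvature; alpha < 1 gives a concave curve on [0, 1]."""
    return x ** alpha


def prelec_weight(p, gamma):
    """One-parameter Prelec weighting: w(p) = exp(-(-ln p) ** gamma)."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return exp(-((-log(p)) ** gamma))


def pt_utility(x, p, alpha, gamma):
    """Prospect-theory style utility: weighted magnitude times weighted probability."""
    return weighted_magnitude(x, alpha) * prelec_weight(p, gamma)
```

      With both distortion parameters set to 1 this reduces to Expected Value, and with `gamma` below 1 the Prelec function overweights small probabilities and underweights large ones, crossing the identity line at p = 1/e.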

      Reviewer #3 (Recommendations For The Authors):

      Reviewer #3 Recommendations 1

      - In the Introduction (around line 48), the authors make the case that distractor effects can co-exist in different parts of the decision space, citing Chau et al. (2020). However, if the distractor effect is calculated relative to the binary baseline this is no longer the case.

      - Relating to the above point, it might be useful for the authors to make a distinction between effects being non-monotonic across the decision space (within individuals) and effects varying across individuals due to different strategies adopted. These two scenarios are conceptually distinct.

      We thank the reviewer for the comment. Indeed, the ideas that distractor effects may vary across decision space and across different individuals are slightly different concepts. We have now revised the manuscript to clarify this:

      “However, as has been argued in other contexts, just because one type of distractor effect is present does not preclude another type from existing (Chau et al., 2020; Kohl et al., 2023). Each type of distractor effect can dominate depending on the dynamics between the distractor and the chooseable options. Moreover, the fact that people have diverse ways of making decisions is often overlooked. Therefore, not only may the type of distractor effect that predominates vary as a function of the relative position of the options in the decision space, but also as a function of each individual’s style of decision making.” (Lines 48-54)

      Reviewer #3 Recommendations 2

      - The idea of mixture models/strategies has strong backing from other Cognitive Science domains and will appeal to most readers. It would be very valuable if the authors could further discuss the potential level at which their composite model might operate. Are the additive and EV quantities computed and weighted (as per the integration coefficient) within a trial giving rise to a composite decision variable? Or does the integration coefficient reflect a probabilistic (perhaps competitive) selection of one strategy on a given trial? Perhaps extant neural data can shed light on this question.

      We thank the reviewer for the comment. The idea is related to whether the observed mixture in integration models derives from value being actually computed in a mixed way within each trial, or each trial involves a probabilistic selection between the additive and multiplicative strategies. We agree that this is an interesting question and to address it would require the use of some independent continuous measures to estimate the subjective values in quantitative terms (instead of using the categorical choice data). This could be done by collecting pupil size data or functional magnetic resonance imaging data, as the reviewer has pointed out. Although the empirical work is beyond the scope of the current behavioural study, it is worth bringing up this point in the Discussion:

      “The current finding involves the use of a composite model that arbitrates between the additive and multiplicative strategies. A general question for such composite models is whether people mix two strategies in a consistent manner on every trial or whether there is some form of probabilistic selection occurring between the two strategies on each trial such that only one strategy is used on any given trial while, on average, one strategy is more probable than the other. To test which is the case requires an independent estimation of subjective values in quantitative terms, such as by pupillometry or functional neuroimaging. Further understanding of this problem will also provide important insight into the precise way in which distractor effects operate at the single-trial level.” (Lines 275-282)

      Reviewer #3 Recommendations 3

      Line 80 "compare pairs of attributes separately, without integration". This additive rule (or the within-attribute comparison) implies integration, it is just not multiplicative integration.

      We thank the reviewer for the comment. We have made adjustments to the manuscript to ensure that the message delivered within this manuscript is consistent.

      “For clarity, we stress that the same mathematical formula for additive value can be interpreted as meaning that 1) subjects first estimate the value of each option in an additive way (value integration) and then compare the options, or 2) subjects compare the two magnitudes and separately compare the two probabilities without integrating dimensions into overall values. On the other hand, the mathematical formula for multiplicative value is only compatible with the first interpretation. In this paper we focus on attribute combination styles (multiplicative vs additive) and do not make claims on the order of the operations. More particularly, we consider whether individual differences in combination styles could be related to different forms of distractor effect.” (Lines 92-100)

      Reviewer #3 Recommendations 4

      - Not clear why the header in line 122 is phrased as a question.

      We thank the reviewer for the suggestion. We have modified the header to the following:

      “The distractor effect was absent on average” (Line 129)

      Reviewer #3 Recommendations 5

      - The discussion and integration of key neural findings with the current thesis are outstanding. It might help the readers if certain statements such as "the distractor effect is mediated by the PPC" (line 229) were further unpacked.

      We thank the reviewer for the suggestion. We have made modifications to the original passage to further elaborate the statement.

      “At the neuroanatomical level, the negative distractor effect is mediated by the PPC, where signal modulation described by divisive normalization has been previously identified (Chau et al., 2014; Louie et al., 2011). The same region is also crucial for perceptual decision making processes (Shadlen & Shohamy, 2016).” (Lines 250-253)

      Reviewer #3 Recommendations 6

      - In Fig. 3c, there seem to be many participants having the integration coefficient close to 0 but the present violin plot doesn't seem to best reflect this highly skewed distribution. A histogram would be perhaps better here.

      We thank the reviewer for the suggestion. We have modified the descriptive plots to use histograms instead of violin plots.

      “Figures 3c, d and e show the fitted parameters of the composite model: η, the integration coefficient determining the relative weighting of the additive and multiplicative value ( , ); , the magnitude/probability weighing ratio ( , ); and , the inverse temperature ( , ). Our finding that the average integration coefficient η was 0.325 coincides with previous evidence that people were biased towards using an additive, rather than a multiplicative rule.” (Lines 186-191)

    2. Reviewer #3 (Public Review):

      Summary:

      The way an unavailable (distractor) alternative impacts decision quality is of great theoretical importance. Previous work, led by some of the authors of this study, had converged on a nuanced conclusion wherein the distractor can both improve (positive distractor effect) and reduce (negative distractor effect) decision quality, contingent upon the difficulty of the decision problem. In very recent work, Cao and Tsetsos (2022) reanalyzed all relevant previous datasets and showed that once distractor trials are referenced to binary trials (in which the distractor alternative is not shown to participants), distractor effects are absent. Cao and Tsetsos further showed that human participants heavily relied on additive (and not multiplicative) integration of rewards and probabilities.

      The present study by Wong et al. puts forward a novel thesis according to which interindividual differences in the way of combining reward attributes underlie the absence of a detectable distractor effect at the group level. They re-analysed data from 144 human participants and classified participants into a "multiplicative integration" group and an "additive integration" group based on a model parameter, the "integration coefficient", that interpolates between the multiplicative utility and the additive utility in a mixture model. They report that participants in the "multiplicative" group show a positive distractor effect while participants in the "additive" group show a negative distractor effect. These findings are extensively discussed in relation to the potential underlying neural mechanisms.

      Strengths:

      - The study is forward looking, integrating previous findings well, and offering a novel proposal on how different integration strategies can lead to different choice biases.
      - The authors did an excellent job in connecting their thesis with previous neural findings. This is a very encompassing perspective that is likely to motivate new studies towards better understanding of how humans and other animals integrate information in decisions under risk and uncertainty.
      - Although some aspects of the paper are very technical, methodological details are well explained and the paper is very well written.

      Weaknesses:

      - The authors quantify the distractor variable as "DV - HV", i.e., the relative distractor variable. Conclusions mostly hold when the distractor is quantified in absolute terms (as "DV", see also Cao & Tsetsos, 2023). However, it is not entirely clear why the impact of the distractor alternative is not identical when the distractor variable is quantified in absolute vs. relative terms. Although understanding this nuanced point seems to extend beyond the scope of the paper, it could provide valuable decision-theoretic (and mechanistic) insights.
      - The central finding of this study is that participants who integrate reward attributes multiplicatively show a positive distractor effect while participants who integrate additively show a negative distractor effect. This is a very interesting and intriguing observation. However, it does not explain why the integration strategy covaries with the direction of the distractor effect. As the authors acknowledge, the composite model is not explanatory. Although beyond the scope of this paper, it would be valuable to provide a mechanistic explanation of this covariation pattern.

    3. eLife assessment

      This manuscript provides a valuable demonstration that distractor effects in multi-attribute decision-making correlate with the form of attribute integration (additive vs. multiplicative). The evidence supporting the conclusions is convincing, but there are questions about how to interpret the findings. The manuscript will be interesting to decision-making researchers in neuroscience, psychology, and related fields.

    4. Reviewer #1 (Public Review):

      Summary:

      The current study provided a follow-up analysis using published datasets focused on the individual variability of both the distraction effect (size and direction) and the attribute integration style, as well as the association between the two. The authors tried to answer the question of whether the multiplicative attribute integration style concurs with a more pronounced and positively oriented distraction effect.

      Strengths:

      The analysis extensively examined the impacts of various factors on decision accuracy, with particular focus on using two-option trials as control trials, following the approach established by Cao & Tsetsos (2022). The statistical significance results were clearly reported.

      The authors meticulously conducted supplementary examinations, incorporating the additional term HV+LV into GLM3. Furthermore, they replaced the utility function from the expected value model with values from the composite model.

      Weaknesses:

      The authors did a great job addressing the weaknesses I raised in the previous round of review, except on the generalizability of the current result in the larger context of multi-attribute decision-making. It is not really a weakness of the manuscript but more of a limitation of the studied topic, so I want to keep this comment for public readers.

      The reward magnitude and probability information are displayed using rectangular bars of different colors and orientations. Would that bias subjects to choose an additive rule instead of the multiplicative rule? Also, could the conclusion be extended to other decision contexts such as quality and price, where a multiplicative rule is hard to formulate?

      Overall, the authors have achieved their aims after clarifying that the study was trying to establish a correlation between the integration style and attraction effect. This result may be useful to inspire neuroimaging or neuromodulation studies that investigate multi-attribute decision making.

    5. Reviewer #2 (Public Review):

      This paper addresses the empirical demonstration of "distractor effects" in multi-attribute decision-making. It continues a debate in the literature on the presence (or not) of these effects, which domains they arise in, and their heterogeneity across subjects. The domain of the study is in a particular type of multi-attribute decision-making: choices over risky lotteries. The paper reports a re-analysis of lottery data from multiple experiments run previously by the authors and other labs involved in the debate.

      Methodologically, the analysis assumes a number of simple forms for how attributes are aggregated (additively, or multiplicatively, or both) and then applies a "reduced form" logistic regression to the choices with a number of interaction terms intended to control for various features of the choice set. One of these interactions, modulated by ternary/binary treatment, is interpreted as a "distractor effect."

      The claimed contribution of the re-analysis is to demonstrate correlation in the strength/sign of this treatment effect with another estimated parameter: the relative mixture of additive/multiplicative preferences.

      Major Issues

      (1) How to Interpret GLM 1 and 2

      This paper, and others before it, have used a binary logistic regression with a number of interaction terms to attempt to control for various features of the choice set and how they influence choice. It is important to recognize that this modelling approach is not derived from a theoretical claim about the form of the computational model that guides decision-making in this task, nor an explicit test for a distractor effect. This can be seen most clearly in the equations after line 321 and its corresponding log-likelihood after 354, which contain no parameter or test for "distractor effects". Rather the computational model assumes a binary choice probability, and then shoehorns the test for distractor effects via a binary/ternary treatment interaction in a separate regression (GLM 1 and 2). This approach has already led to multiple misinterpretations in the literature (see Cao & Tsetsos, 2022; Webb et al., 2020). One of these misinterpretations occurred in the datasets the authors study, in which the lottery stimuli contained a confound with the interaction that Chau et al., (2014) were interpreting as a distractor effect (GLM 1). Cao & Tsetsos (2022) demonstrated that the interaction was significant in binary choice data from the study, therefore it can not be caused by a third alternative. This paper attempts to address this issue with a further interaction with the binary/ternary treatment (GLM 2). Therefore the difference in the interaction across the two conditions is claimed to now be the distractor effect. The validity of this claim brings us to what exactly is meant by a "distractor effect."

      The paper begins by noting that "Rationally, choices ought to be unaffected by distractors" (line 33). This is not true. There are many normative models which allow for the value of alternatives (even low-valued "distractors") to influence choices, including a simple random utility model. Since Luce (1959), it has been known that the axiom of "Independence of Irrelevant Alternatives" (that the probability ratio between any two alternatives not depend on a third) is an extremely strong axiom, and only a sufficiency axiom for a random utility representation (Block and Marschak, 1959). It is not a necessary condition of a utility representation, and if this is our definition of rational (which is highly debatable), not necessary for it either. Countless empirical studies have demonstrated that IIA is falsified, and a large number of models can address it, including a simple random utility model with independent normal errors (i.e. a multivariate Probit model). In fact, it is only the multinomial Logit model that imposes IIA. It is also why so much attention is paid to the asymmetric dominance effect, which is a violation of a necessary condition for random utility (the Regularity axiom).
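
The point about the multinomial Logit model can be illustrated with a minimal sketch. It assumes a plain softmax choice rule with hypothetical option values (`hv`, `lv`, `dv` are illustrative names and numbers, not quantities taken from the datasets under review). Under this rule the odds of choosing the high-value over the low-value option do not move as the third option's value changes, which is what IIA imposes.

```python
import math

def softmax_probs(values, temperature=1.0):
    """Multinomial logit choice probabilities for a list of option values."""
    exps = [math.exp(v / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical values for the high-value and low-value options.
hv, lv = 3.0, 2.0

# Vary the value of the third ("distractor") option and record the HV:LV odds.
ratios = []
for dv in (0.0, 1.0, 1.9):
    p_hv, p_lv, p_dv = softmax_probs([hv, lv, dv])
    ratios.append(p_hv / p_lv)

# Under multinomial logit the odds equal exp(hv - lv) whatever dv is.
print(ratios)
```

Any model outside this family (for example, a Probit with independent normal errors) can produce odds that shift with the third option, so a shift of this kind does not by itself separate one computational account from another.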

      So what do the authors even mean by a "distractor effect"? It is true that the form of IIA violations (i.e. their path through the probability simplex as the low-option varies) tells us something about the computational model underlying choice (after all, different models will predict different patterns). But we do not know how the interaction terms in the binary logit regression relate to the pattern of the violations because there is no formal theory that relates them. Any test for relative value coding is a joint test of the computational model and the form of the stochastic component (Webb et al., 2020). These interaction terms may simply be picking up substitution patterns that can be easily reconciled with some form of random utility. While we cannot check all forms of random utility in these datasets (because the class of such models is large), this paper doesn't even rule any of these models out.

      (2) How to Interpret the Composite (Mixture) model?

      On the other side of the correlation are the results from the mixture model for how decision-makers aggregate attributes. The authors report that most subjects are best represented by a mixture between additive and multiplicative aggregation models. The authors justify this with the proposal that these values are computed in different brain regions and then aggregated (which is reasonable, though raises the question of "where" if not the mPFC). But an equally reasonable interpretation is that the improved fit of the mixture model simply reflects a misspecification of two extreme aggregation processes (additive and EV), so the log-likelihood is maximized at some point in between them.

      One possibility is a model with utility curvature. How much of this result is just due to curvature in valuation? There are many reasonable theories for why we should expect curvature in utility for human subjects (for example, limited perception: Robson, 2001; Khaw, Li & Woodford, 2019; Netzer et al., 2022) and of course many empirical demonstrations of risk aversion for small stakes lotteries. The mixture model, on the other hand, has parametric flexibility.

      There is also a large literature on testing expected utility jointly with stochastic choice, and the impact of these assumptions on parameter interpretation (Loomes & Sugden, 1998; Apesteguia & Ballester, 2018; Webb, 2019). This relates back to the point above: the mixture may reflect the joint assumption of how choice departs from deterministic EV.

      (3) So then how should we interpret the correlation that the authors report?

      On one side we have the impact of the binary/ternary treatment which demonstrates some impact of the low value alternative on a binary choice probability. This may reflect some deep flaw in existing theories of choice, or it may simply reflect some departure from purely deterministic expected value maximization that existing theories can address. We have no theory to connect it to, so we cannot tell. On the other side of the correlation we have the mixture between additive and multiplicative preferences over risk. This result may reflect two distinct neural processes at work, or it may simply reflect a misspecification of the manner in which humans perceive and aggregate attributes of a lottery (or even just the stimuli in this experiment) by these two extreme candidates (additive vs. EV). Again, this would entail some departure from purely deterministic expected value maximization that existing theories can address.

      It is entirely possible that the authors are reporting a result that points to the more exciting of these two possibilities. But it is also possible (and perhaps more likely) that the correlation is more mundane. The paper does not guide us to theories that predict such a correlation, nor reject any existing ones. In my opinion, we should be striving for theoretically-driven analyses of datasets, where the interpretation of results is clearer.

      (4) Finally, the results from these experiments might not have external validity for two reasons. First, the normative criterion for multi-attribute decision-making differs depending on whether the attributes are lotteries or not (i.e. multiplicative vs additive). Whether it does so for humans is a matter of debate. Therefore if the result is unique to lotteries, it might not be robust for multi-attribute choice more generally. The paper largely glosses over this difference and mixes literature from both domains. Second, the lottery information was presented visually and there is literature suggesting this form of presentation might differ from numerical attributes. Which is more ecologically valid is also a matter of debate.

      Minor Issues:

      The definition of EV as a normative choice baseline is problematic. The analysis requires that EV is the normative choice model (this is why the HV-LV gap is analyzed and the distractor effect defined in relation to it). But if the binary/ternary interaction effect can be accounted for by curvature of a value function, this should also change the definition of which lottery is HV or LV for that subject!
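
A small worked example may make this concrete. It is only a sketch under stated assumptions: the two lotteries and the power-utility exponent `alpha` are hypothetical numbers chosen for illustration, and power utility is one convenient form of curvature, not the form used in the manuscript.

```python
def expected_value(magnitude, probability):
    """Expected value: reward magnitude times reward probability."""
    return magnitude * probability

def curved_value(magnitude, probability, alpha):
    """Expected utility with power-utility curvature u(x) = x ** alpha."""
    return (magnitude ** alpha) * probability

# Two hypothetical lotteries, each given as (magnitude, probability).
lottery_a = (100.0, 0.3)
lottery_b = (40.0, 0.7)

ev_a = expected_value(*lottery_a)
ev_b = expected_value(*lottery_b)

# Concave utility (alpha < 1), as for a risk-averse subject.
alpha = 0.5
u_a = curved_value(*lottery_a, alpha)
u_b = curved_value(*lottery_b, alpha)

print("EV ranking:     A is HV" if ev_a > ev_b else "EV ranking:     B is HV")
print("Curved ranking: A is HV" if u_a > u_b else "Curved ranking: B is HV")
```

With these numbers, lottery A is the HV option under expected value (30 versus 28), but lottery B is preferred once the value function is concave (about 4.43 versus 3.0), so the HV and LV labels swap for that subject.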

      Comments on latest version: the authors did respond to some of my comments with discussion points in the paper.

      References

      Apesteguia, J. & Ballester, M. Monotone stochastic choice models: The case of risk and time preferences. Journal of Political Economy (2018).

      Block, H. D. & Marschak, J. Random Orderings and Stochastic Theories of Responses. Cowles Foundation Discussion Papers (1959).

      Khaw, M. W., Li, Z. & Woodford, M. Cognitive Imprecision and Small-Stakes Risk Aversion. Rev. Econ. Stud. 88, 1979-2013 (2020).

      Loomes, G. & Sugden, R. Testing Different Stochastic Specifications of Risky Choice. Economica 65, 581-598 (1998).

      Luce, R. D. Individual Choice Behaviour. (John Wiley and Sons, Inc., 1959).

      Netzer, N., Robson, A. J., Steiner, J. & Kocourek, P. Endogenous Risk Attitudes. SSRN Electron. J. (2022) doi:10.2139/ssrn.4024773.

      Robson, A. J. Why would nature give individuals utility functions? Journal of Political Economy 109, 900-914 (2001).

      Webb, R. The (Neural) Dynamics of Stochastic Choice. Manage Sci 65, 230-255 (2019).

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study presents a useful comparison of the dynamic properties of two RNA-binding domains. The data collection and analysis are solid, making excellent use of a suite of NMR methods. However, evidence to support the proposed model linking dynamic behavior to RNA recognition and binding by the tandem domains remains incomplete. The work will be of interest to biophysicists working on RNA-binding proteins.

      We thank eLife for taking the time and effort to review our manuscript. Evidence from the literature and our study shows a close correspondence between the dynamic behavior of dsRBDs and their dsRNA recognition and binding, which led us to propose a fair model. As already mentioned in the manuscript, we have been working on the suggested experiments to support our proposed model further.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In the manuscript entitled "Differential conformational dynamics in two type-A RNA-binding domains drive the double-stranded RNA recognition and binding," Chugh and co-workers utilize a suite of NMR relaxation methods to probe the dynamic landscape of the TAR RNA binding protein (TRBP) double-stranded RNA-binding domain 2 (dsRBD2) and compare these to their previously published results on TRBP dsRBD1. The authors show that, unlike dsRBD1, dsRBD2 is a rigid protein with minimal ps-ns or us-ms time scale dynamics in the absence of RNA. They then show that dsRBD2 binds to canonical A-form dsRNA with a higher affinity compared to dsRBD1 and does so without much alteration in protein dynamics. Using their previously published data, the authors propose a model whereby dsRBD2 recognizes dsRNA first and brings dsRBD1 into proximity to search for RNA bulge and internal loop structures.

      We thank the Reviewer for sending us an encouraging review. We have combined the findings reported in the literature with new ones that led us to propose the dsRNA-binding model by tandem A-form dsRBDs.

      We propose that dsRBD1 can first recognize a variety of sequential and structurally different dsRNAs. dsRBD2 assists the interaction with a higher affinity, thus fortifying the interaction between TRBP and a possible substrate. This may enable the other associated proteins like Dicer and Ago2 to perform critical biological functions.

      However, we feel that a few statements in the comment above are factually incorrect.

      Statement 1. “They then show that dsRBD2 binds to canonical A-form dsRNA with a higher affinity compared to dsRBD1 and does so without much alteration in protein dynamics.”

      We have explicitly shown the perturbation in dsRBD2 dynamics upon RNA binding.

      Statement 2. “Using their previously published data, the authors propose a model whereby dsRBD2 recognizes dsRNA first and brings dsRBD1 into proximity to search for RNA bulge and internal loop structures.”

      Our previously published data suggests that dsRBD1, owing to its high conformational dynamics in solution, is able to recognize a variety of structurally and sequentially different dsRNAs ([Paithankar et al., 2022]). dsRBDs preferably bind to the double-stranded region (minor-major-minor-groove) of an A-form RNA ([Acevedo et al., 2016]; [Vuković et al., 2014]) and do not search for bulge and internal loop structures as a part of the binding event. Even though dsRBDs preferably bind to the double-stranded region, they can still accommodate perturbation in the A-form helix due to mismatch and bulges with decreased binding affinity ([Acevedo et al., 2015]). However, it is a matter of future research to identify how much of a deviation from the A-form structure can be accommodated by the dsRBDs. The diffusion event observed in the literature ([Koh et al., 2013]) also does not show any direct implication for searching for bulge and internal loop structures.

      Strengths:

      The authors expertly use a variety of NMR techniques to probe protein motions over six orders of magnitude in time. Other NMR titration experiments and ITC data support the RNA-binding model.

      Weaknesses:

      The data collection and analysis are sound. The only weakness in the manuscript is the lack of context with the much broader field of RNA-binding proteins. For example, many studies have shown that RNA recognition motif (RRM) domains have similar dynamic characteristics when binding diverse RNA substrates. Furthermore, there was no discussion about the entropy of binding derived from ITC. It might be interesting to compare with dynamics from NMR.

      We understand the reviewer’s point that this study is focused on a dsRNA-binding mechanism rather than addressing the much broader field of RNA-binding. There are multiple challenges in finding a single mechanism that works for all RNA-binding proteins. For instance, RRM is a single-stranded RNA binding domain that is able to read out the substrate base sequence. RRM behaves entirely differently than the dsRBD in terms of target specificity. Besides, several other RNA-binding domains, like the KH-domain, Puf domains, Zinc finger domains, etc., showcase a unique RNA-binding behavior. Thus, it would be really difficult to draw a single rule of thumb for RNA-recognition behavior for all these diverse domains.

      Thank you for pointing out the entropy of binding from ITC. We have now included the entropy of binding discussion in the main text, page 7.

      Reviewer #2 (Public Review):

      Summary:

      Proteins that bind to double-stranded RNA regulate various cellular processes, including gene expression and viral recognition. Such proteins often contain multiple double-stranded RNA-binding domains (dsRBDs) that play an important role in target search and recognition. In this work, Chugh and colleagues have characterized the backbone dynamics of one of the dsRBDs of a protein called TRBP2, which carries two tandem dsRBDs. Using solution NMR spectroscopy, the authors characterize the backbone motions of dsRBD2 in the absence and presence of dsRNA and compare these with their previously published results on dsRBD1. The authors show that dsRBD2 is comparatively more rigid than dsRBD1 and claim that these differences in backbone motions are important for target recognition.

      Strengths:

      The strengths of this study are multiple solution NMR measurements to characterize the backbone motions of dsRBD2. These include 15N-R1, R2, and HetNOE experiments in the absence and presence of RNA and the analysis of these data using an extended-model-free approach; HARD-15N-experiments and their analysis to characterize the kex. The authors also report differences in binding affinities of dsRBD1 and dsRBD2 using ITC and have performed MD simulations to probe the differential flexibility of these two domains.

      Weaknesses:

      While it may be true that dsRBD2 is more rigid than dsRBD1, the manuscript lacks conclusive and decisive proof that such changes in backbone dynamics are responsible for target search and recognition and the diffusion of TRBP2 along the RNA molecule. To conclusively prove the central claim of this manuscript, the authors could have considered a larger construct that carries both RBDs. With such a construct, authors can probe the characteristics of these two tandem domains (e.g., semi-independent tumbling) and their interactions with the RNA. Additionally, mutational experiments may be carried out where specific residues are altered to change the conformational dynamics of these two domains. The corresponding changes in interactions with RNA will provide additional evidence for the model presented in Figure 8 of the manuscript. Finally, there are inconsistencies in the reported data between different figures and tables.

      We thank the reviewer for the comprehensive and insightful review. A larger construct carrying both RBDs was not used because of the multiple challenges pertaining to dynamics study by NMR spectroscopy (intrinsic R2 rates of the dsRBD1-dsRBD2 construct would be high, resulting in broadened peaks) as per our previous experience ([Paithankar et al., 2022]). There would be additional dynamics in that construct coming from domain-domain relative motions, and it is difficult to deconvolute the dynamics information. Further, the dsRNA needed to bind to this construct will be longer, causing further line broadening in NMR.

      Coming to mutational studies, careful design of domain mutants remains a challenge because the conformational dynamics in both domains are distributed throughout the backbone rather than only in the RNA-binding residues. The mutational studies would need an exhaustive number of mutations in the protein as well as the RNA to draw a parallel between the binding and dynamics. Having said that, we are working on making such mutations in the protein (at several locations to freeze the dynamics site-specifically) and the RNA (to change the shape of the dsRNA) to systematically study this mechanism, which is outside the scope of this manuscript.

      The reviewer has rightly pointed out some subtle superficial differences in the reported data between different figures and tables. These superficial differences are present because of the context in which we are describing the data. For example, in Figure S4, we are talking about the average relaxation rates and nOe values for only the common residues we were able to analyze between two magnetic field strengths, 600 and 800 MHz. Whereas in Figure 6, we are comparing the averages of the core (159-227) dsRBD residues at 600 MHz, in the presence and absence of D12RNA. The differences, however, are minute and fall well within the error range.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Suggestions for improved or additional experiments -

      In regards to ITC data, dsRBD1 does not bind canonical A-form RNA with high affinity. What is dsRBD1 and dsRBD2 affinity to the miR-16 RNA?

      We have not performed ITC-based studies with miR-16 RNA for the domains. The study by Acevedo et al. has shown the effect of lengths of Watson-Crick duplex RNAs upon TRBP2 dsRBD binding. In this study, they have compared the ds22 RNA to miRNA/miRNA* duplex. By using EMSA, they show that the Kd,app (μM) for dsRBD1 is 3.5±0.2 and for dsRBD2 is 1.7±0.1, indicating a higher affinity by the latter ([Acevedo et al., 2015]).

      What was the amount of time used for the 1H saturation in the heteronuclear NOE experiment? Based on the average T1 (1/1.44 s-1) = 0.69 s, a recovery delay of >7 s should have been used for this experiment.

      According to Cavanagh et al., a minimum recovery/recycle delay should be greater than 5*1/R1 to make sure that 99% of the 1HN and 15N magnetizations are restored ([Cavanagh, Fairbrother, Palmer & Skelton, Protein NMR Spectroscopy: Principles and Practice, Academic Press, 1995]). In our study, we have used a relaxation delay of 5 s, which is greater than 7*1/R1avg, thus ensuring that at least 99% of the 1HN and 15N magnetization recovers.
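
The arithmetic behind this exchange can be checked with a short sketch. It assumes simple mono-exponential recovery, M(t)/M0 = 1 - exp(-R1 * t), and uses the average R1 of 1.44 s-1 quoted by the reviewer. Whether that single rate also describes 1H recovery in this sample is an assumption of the sketch, not something established here.

```python
import math

r1_avg = 1.44        # s^-1, average R1 quoted in the reviewer's comment
delay_used = 5.0     # s, relaxation delay stated in the author response

t1 = 1.0 / r1_avg    # average T1 in seconds

def recovered_fraction(delay_s, r1):
    """Fraction of equilibrium magnetization recovered after delay_s,
    assuming mono-exponential recovery with rate r1."""
    return 1.0 - math.exp(-delay_s * r1)

five_t1 = 5.0 * t1
seven_t1 = 7.0 * t1
frac_at_5s = recovered_fraction(delay_used, r1_avg)

print(f"T1 = {t1:.2f} s, 5*T1 = {five_t1:.2f} s, 7*T1 = {seven_t1:.2f} s")
print(f"Recovery after {delay_used:.0f} s: {100 * frac_at_5s:.2f}%")
```

Under these assumptions T1 is about 0.69 s, 5*T1 is about 3.5 s and 7*T1 is about 4.9 s, so a 5 s delay exceeds both thresholds and corresponds to more than 99.9% recovery.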

      Recommendations for improving writing and presentation -

      Figure 3 - The legend in panel C is incomplete.

      Figure 3 (Figure 4 in the revised manuscript) has been updated, and the legend now reads complete.

      Figures 3 E and F - The three views can be combined into one as is done in Figures 4 C and D.

      Thanks for the kind suggestion. We have depicted the kex in the three ranges to highlight the difference between the two domains at each range. Since there are three different exchange regimes with different populations, we believe this gives us an uncomplicated picture while classifying and comparing the dynamics between the two. Combining the three views into one becomes too overwhelming to visualize kex and population distribution in the protein.

      Figure 3 - The residues indicated in the text (e.g., R200, L212, and R224) should be indicated in panels E and F.

      We have marked the residues described in the text in Figure 4C (revised Figure 5C), and thus, they are not mentioned in Figures 3E and 3F (revised Figures 4E and 4F).

      The results and discussion put these findings into minimal context. Most comparisons are made between dsRBD1 and dsRBD2. What about other RNA-binding proteins? There is a wealth of structure/dynamics/functional data about RNA recognition motifs, which do exactly the same thing as described here but are missing.

      We understand the reviewer’s point that this study is focused on a dsRNA-binding mechanism rather than addressing the much broader field of RNA-binding. There are multiple challenges in finding a single mechanism that works for all RNA-binding proteins. For instance, RRM is a single-stranded RNA-recognition motif that can read out the substrate base sequence. RRM behaves entirely differently than the dsRBD in terms of sequence specificity. Besides, several other RNA-binding domains, like the KH-domain, Puf domains, Zinc-finger domains, etc., showcase a unique RNA-binding behaviour. Thus, with the current knowledge, it would not be possible to draw a single rule of thumb for RNA-recognition behaviour for all these diverse domains. Hence, the findings of this study are not comparable to those of other RNA-binding domains and are beyond the scope of this study.

      Results, page 8 - I'm not sure that allosteric quenching is appropriately invoked here. The amount of residues showing dynamics in the apo state is small and the number only moderately increases upon RNA binding. The observation that some residues show an increase and a neighboring residue shows a decrease (or vice versa) upon RNA binding could just be random with the small number of observations. This observation would be more convincing if it were happening to larger regions within the protein.

      We agree with the reviewer that the number of residues showing dynamics in the apo-state of the dsRBD2 is small when compared with that of dsRBD1, and the number only moderately increases upon RNA-binding. However, we believe it is quite important to invoke the allosteric quenching as all the new residues where dynamics is induced, do lie in the spatial proximity, as also observed in the dsRBD1 ([Paithankar et al., 2022]). It is a parameter to not only compare the differences and similarities in the two domains but also to highlight the presence of this phenomenon common in both the type-A dsRBDs of TRBP.

      Minor corrections -

      Introduction, page 2 - The order parameter should be defined for non-NMR experts.

      Thank you for the suggestion. The definition of order parameter has now been included on page 2 of the revised manuscript.

      Introduction, page 2 - TRBP should be defined in the main text the first time used.

      We have now defined TRBP on page 2 of the revised manuscript, where it is used in the main text for the first time.

      Results, page 5 - The reference for the HARD experiment should be given earlier in that paragraph.

      Thank you for the suggestion. We have now referenced the HARD experiment earlier in the last paragraph on page 5 of the revised manuscript.

      Results, page 7 - What is the limiting amount of RNA used for the D12-bound dsRBD2 spin relaxation measurements?

      The limiting amount of RNA used for the D12-bound dsRBD2 spin relaxation measurements is 0.05 equivalent (RNA:Protein= 50 mM:1000 mM). It has now been included on page 7 of the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Throughout the manuscript, NMR datasets are not consistent with one another (a few examples are listed below).

      Figures S4, 6, and Table S4: (a) It is unclear why relaxation data for certain residues are missing in Table S4 (e.g., S156, V168, E177, F192, etc.).

      We thank the reviewer for pointing this out. We have now reanalyzed the data for all the above-mentioned residues and other missing residues. In the revised manuscript, we have added the data for residues such as E177 and R189, and for many more N- and C-terminal residues. Unfortunately, for some residues (V168, S184, F192, S209, and L222), we observed severe peak broadening while measuring the R2 rates and/or nOe. Hence, data for V168, S184, F192, S209, and L222 are missing in Table S4. We have explicitly noted the missing data for these residues in the table legend.

      (b) The reported values are not consistent. For example, Figure S4 says that the average 15N-R2 rate is 10.85 +/- 0.36 s-1 whereas Figure 6 says the 15N-R2 rate is 11.02 +/- 0.39 s-1 for the same dataset.

      The superficial differences are present because of the context in which we are describing the data (now mentioned in the methods section on page 13). In Figure S4, we are talking about the average relaxation rates and nOe values for only the common residues we could analyze between two magnetic field strengths, 600 and 800 MHz. Whereas in Figure 6 (revised figure 3), we compare the averages of all the analyzed core dsRBD residues at 600 MHz in the presence and absence of D12RNA. The differences, however, are insignificant, falling well within the error range.

      (c) There is also a discrepancy in reported R2 values (at 600 MHz) in Table S4. It is unclear to me what the reported values are, as most of these are below 1 s-1.

      Thank you very much for pointing out our mistake here. The Table S4 seems to have the wrong values for R2 at 600 MHz. However, the raw data submitted to the BMRB as entry 52077 holds the correct information. We have now updated the Table S4.

      (d) It is also unclear as to why perfectly resolved residues (e.g., L230, A232, D234, etc.) have been omitted from these data (and other datasets such as 15N-CPMGs shown in Figure S6).

      The residues L230, A232, D234, etc., are the C-terminal residues of TRBP-dsRBD2 beyond the core (159-227 aa) fold of dsRBD. They have now been included in the revised figures S6 and S11 for completeness.

      (e) Figure 6 reports a 15N-R2 of 21 s-1 for one of the residues in the absence of RNA. This data point has been omitted from Figure S4.

      In Figure S4, we are talking about relaxation rates and nOe values only for the common residues we could analyze between the two magnetic field strengths, 600 and 800 MHz. Thus, that 15N-R2 value has been omitted.

      The S2 order parameters reported in Figures S5 and S10 are inconsistent with one another, as additional residues are shown in S10 (e.g., N159).

      Thank you for pointing it out. We have now reanalyzed the data for S2 order parameter and Rex by including more residues (e.g., N159, R189, etc) in the core and have updated both Figures S5 and S10. Please see the revised supplementary information.

      Tables S6 and S7 report values for residue R189. This residue has been omitted in every other dataset. Based on the 1H-15N HSQC spectrum shown in Figure S3, this residue gives a well-resolved crosspeak (which lies adjacent to V228). Can the authors explain why they omit data for this residue in Figures S4, 6, and Table S4?

      The reviewer is correct in pointing out that data for R189 is missing in the fast dynamics data, such as Figure S4, Figure 6 (revised figure 3), and Table S4. We have now reanalyzed our raw data and included data for R189 and other missing residues in our updated manuscript. Please see the revised figures S4 and 6 (revised figure 3) and the revised table S4.  

      Moreover, this residue lies in the loop2 region of this domain. Based on the MD simulations (Figure 2), this region is more flexible compared to the rest of the domain. Does the corresponding 15N-relaxation data support this claim?

      Yes, the apo 15N-relaxation data do strongly support this claim. R189 showed a higher than core average R2 rate (R189 = 15.44 +/- 0.69 s-1; core = 10.92 +/- 0.37 s-1) and a lower than core average nOe (R189 = 0.49 +/- 0.05; core = 0.73 +/- 0.03) which indicate a higher flexibility than the rest of the core (updated Figure 3 and Table S4). Additionally, the S2 order parameter for R189 was found to be 0.52 +/- 0.03, slightly lower than the core average of 0.59 +/- 0.03, indicating a more flexible region than the core (updated Table S14). Moreover, the dynamics parameters extracted from HARD experimental data using the geoHARD method for apo TRBP2-dsRBD2 shown in Table S18 depict a high kex value of 31748.72 +/- 955.20 Hz for R189. This supports the claim that this residue is highly flexible with a high exchange rate.

      Figure S9. I was not able to follow this dataset as the data points are not consistent between different residues.

      In Figure S9, the residue-wise peak intensities plotted against the RNA concentration indicate that line broadening was witnessed for all the core residues (irrespective of the initial peak intensity). Another interesting observation is that the terminal residues do not undergo the same line broadening as seen in the core residues.

      It is also unclear why residue G185 is highlighted.

      It is taken as an example and magnified to show the extent of line broadening. This is now explicitly mentioned in the figure caption in the revised supplementary information.

      It is also not clear exactly what the authors are trying to fit, as I see no chemical shift changes upon the addition of RNA (Fig. S8), and the equation used for data fitting (pg. 11) uses chemical shift changes (and not the changes in intensities).

      The same equation can be used to fit the chemical shift perturbation and peak intensity perturbation as a function of ligand concentration. Here, we have tried to fit the intensity perturbation. We have now modified the statement on page 11 in the revised manuscript.
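      The equation from page 11 is not reproduced in this response, so the sketch below rests on an assumption: that a standard 1:1 quadratic binding isotherm is used. Under that assumption the observable being scaled can equally be a chemical shift change or a fractional intensity change, which is the point made above. The function names and the example concentrations are hypothetical.

```python
import math


def fraction_bound(protein_total, ligand_total, kd):
    """Fraction of protein bound for 1:1 binding.

    All three quantities must be in the same concentration unit.
    Assumption: simple 1:1 binding, which may differ from the authors' model.
    """
    if protein_total <= 0 or kd <= 0 or ligand_total < 0:
        raise ValueError("concentrations must be non-negative and Kd positive")
    s = protein_total + ligand_total + kd
    root = math.sqrt(s * s - 4.0 * protein_total * ligand_total)
    return (s - root) / (2.0 * protein_total)


def observed_change(protein_total, ligand_total, kd, max_change):
    """Scale the bound fraction by the maximal change of the chosen observable."""
    return max_change * fraction_bound(protein_total, ligand_total, kd)


# Hypothetical titration: the same function serves either observable.
for ligand in (0.0, 5.0, 25.0, 100.0):
    shift = observed_change(50.0, ligand, 1.0, max_change=0.2)      # e.g. ppm
    intensity = observed_change(50.0, ligand, 1.0, max_change=1.0)  # fractional loss
    print(f"ligand={ligand:6.1f}  shift={shift:.3f}  intensity_loss={intensity:.3f}")
```

      In a real fit, `kd` and `max_change` would be the free parameters optimised against the measured perturbation at each ligand concentration.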

      Table S2: The ITC analysis reports an n value of ~3. Can authors elaborate as to what this means?

      The stoichiometry of ~3 indicates the number of TRBP2-dsRBD2 molecules that can interact with D12 RNA in a single binding event. The minimum binding register for dsRBDs is known to be >8 bp (12 bp for optimal binding) ([Ramos et al., 2000]), and one single domain only covers one-third of the face of the cylindrical RNA ([Masliah et al., 2018]). Hence, 3 dsRBD2 could interact with a 12-mer RNA in solution.

      The reported Kd values between the main text (page 7) and Figure 5 are not consistent with one another (one lists 1.18 uM while the other says 1.11 uM). Table S2 does not list the parameters for interactions between dsRBD1 and D12.

      Figure 5 (revised figure 6) depicts the information of a single isolated experiment out of a total of three, whereas in the main text, we say 1.18 μM as the average Kd value (table S2).

      Figure S4: The red axis should read "211" instead of "111".

      Thank you for your helpful insight. We have now changed it in the revised figure.

      Table S3 lists the structural motifs of the two dsRBDs, which are nearly identical to one another, and yet the manuscript claims that these are different (page 4, paragraph 1).

      We agree with the reviewer that the differences are minute but important, which we have tried to highlight in this paper. In particular, loop 2, critical for dsRNA-binding ([Masliah et al., 2012]), is 1 residue longer in dsRBD2 and has a possible effect in enhanced substrate binding.

      Figure S8 shows severe signal attenuation for many residues upon the addition of 100 uM RNA. The most notable among these are residues M194, T195, and C196. Can the authors explain how they measure 15N-relaxation rates for these residues in the presence of 50 uM D12?

      First, we recorded the 15N-relaxation rates for these residues in the presence of 50 μM D12 (RNA:Protein = 50 μM:1000 μM), corresponding to 0.05 equivalent RNA. The amount of RNA used is less than that used for the HSQC-based titration shown in Figure S8, 0.1 equivalent RNA (RNA:Protein = 5 μM:50 μM), where we witness line broadening for residues like M194, T195, and C196. Second, we increased the overall protein concentration from 50 μM (used in HSQC-based titration) to 1000 μM (used in relaxation measurements) to ensure a better signal-to-noise ratio in all the spectra.
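      The RNA equivalents quoted for the two samples follow directly from the RNA:protein concentration ratios. A minimal sketch of that arithmetic is given here; the function name is hypothetical, and the only requirement is that both concentrations are expressed in the same unit.

```python
def rna_equivalents(rna_conc, protein_conc):
    """Molar equivalents of RNA per protein (both concentrations in the same unit)."""
    if protein_conc <= 0:
        raise ValueError("protein concentration must be positive")
    return rna_conc / protein_conc


relaxation_sample = rna_equivalents(50, 1000)  # ratio used for relaxation measurements
titration_point = rna_equivalents(5, 50)       # ratio at the HSQC titration point

print(f"relaxation sample: {relaxation_sample:.2f} equivalents")
print(f"titration point:   {titration_point:.2f} equivalents")
```

      Because the relaxation sample holds half the RNA equivalents of the titration point, less line broadening would be expected for the affected residues.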

      Use the same coloring scheme for Figures S7 and S8.

      Thank you for the suggestion. We have now edited Figure S8 accordingly.

      Figures are often listed out-of-order, making it difficult to follow the manuscript.

      Thank you for the suggestion. We have now amended the main text to refer to the figures sequentially. While doing so, we have renumbered Figure 6 as Figure 3, Figure 3 as Figure 4, Figure 4 as Figure 5, and Figure 5 as Figure 6.

      Figure captions for the relaxation data should specify the temperature at which these datasets were collected.

      Thanks for the valuable suggestion. We have now added the temperature wherever applicable.

      References

      Acevedo R, Evans D, Penrod KA, Showalter SA. 2016. Binding by TRBP-dsRBD2 Does Not Induce Bending of Double-Stranded RNA. Biophys J 110:2610–2617. doi:10.1016/j.bpj.2016.05.012

      Acevedo R, Orench-Rivera N, Quarles KA, Showalter SA. 2015. Helical Defects in MicroRNA Influence Protein Binding by TAR RNA Binding Protein. PLoS ONE 10:e0116749. doi:10.1371/journal.pone.0116749

      Koh HR, Kidwell MA, Ragunathan K, Doudna JA, Myong S. 2013. ATP-independent diffusion of double-stranded RNA binding proteins.

      Masliah G, Barraud P, Allain FH-T. 2012. RNA recognition by double-stranded RNA binding domains: a matter of shape and sequence. Cell Mol Life Sci 70:1875–1895. doi:10.1007/s00018-012-1119-x

      Masliah G, Maris C, König SL, Yulikov M, Aeschimann F, Malinowska AL, Mabille J, Weiler J, Holla A, Hunziker J, Meisner‐Kober N, Schuler B, Jeschke G, Allain FH. 2018. Structural basis of siRNA recognition by TRBP double‐stranded RNA binding domains. EMBO J 37:e97089. doi:10.15252/embj.201797089

      Paithankar H, Tarang GS, Parvez F, Marathe A, Joshi M, Chugh J. 2022. Inherent conformational plasticity in dsRBDs enables interaction with topologically distinct RNAs. Biophys J 121:1038–1055. doi:10.1016/j.bpj.2022.02.005

      Protein NMR Spectroscopy, Principles and Practice, John Cavanagh, Wayne J. Fairbrother, Arthur G. Palmer III, and Nicholas J. Skelton. Academic Press, San Diego, 1995, 587 pages, $59.95. ISBN: 0-12-164490-1. 1996. J Magn Reson, Ser B 113:277. doi:10.1006/jmrb.1996.0189

      Ramos A, Grünert S, Adams J, Micklem DR, Proctor MR, Freund S, Bycroft M, Johnston DS, Varani G. 2000. RNA recognition by a Staufen double‐stranded RNA‐binding domain. EMBO J 19:997–1009. doi:10.1093/emboj/19.5.997

      Vuković L, Koh HR, Myong S, Schulten K. 2014. Substrate Recognition and Specificity of Double-Stranded RNA Binding Proteins. Biochemistry 53:3457–3466. doi:10.1021/bi500352s

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      Campbell et al investigated the effects of light on the human brain, in particular the subcortical part of the hypothalamus during auditory cognitive tasks. The mechanisms and neuronal circuits underlying light effects in non-image forming responses are so far mostly studied in rodents but are not easily translated in humans. Therefore, this is a fundamental study aiming to establish the impact light illuminance has on the subcortical structures using the high-resolution 7T fMRI. The authors found that parts of the hypothalamus are differently responding to illuminance. In particular, they found that the activity of the posterior hypothalamus increases while the activity of the anterior and ventral parts of the hypothalamus decreases under high illuminance. The authors also report that the performance of the 2-back executive task was significantly better in higher illuminance conditions. However, it seems that the activity of the posterior hypothalamus subpart is negatively related to the performance of the executive task, implying that it is unlikely that this part of the hypothalamus is directly involved in the positive impact of light on performance observed. Interestingly, the activity of the posterior hypothalamus was, however, associated with an increased behavioural response to emotional stimuli. This suggests that the role of this posterior part of the hypothalamus is not as simple regarding light effects on cognitive and emotional responses. This study is a fundamental step towards our better understanding of the mechanisms underlying light effects on cognition and consequently optimising lighting standards. 

      Strengths: 

      While it is still impossible to distinguish individual hypothalamic nuclei, even with the high-resolution fMRI, the authors split the hypothalamus into five areas encompassing five groups of hypothalamic nuclei. This allowed them to reveal that different parts of the hypothalamus respond differently to an increase in illuminance. They found that higher illuminance increased the activity of the posterior part of the hypothalamus encompassing the MB and parts of the LH and TMN, while decreasing the activity of the anterior parts encompassing the SCN and another part of TMN. These findings are somewhat in line with studies in animals. It was shown that parts of the hypothalamus such as SCN, LH, and PVN receive direct retinal input in particular from ipRGCs. Also, acute chemogenetic activation of ipRGCs was shown to induce activation of LH and also increased arousal in mice. 

      Weaknesses: 

      While the light characteristics are well documented and EDI calculated for all of the photoreceptors, it is not very clear why these irradiances and spectra were chosen. It would be helpful if the authors explained the logic behind the four chosen light conditions tested. Also, the lights chosen have cone-opic EDI values in a high correlation with the melanopic EDI, therefore we can't distinguish if the effects seen here are driven by melanopsin and/or other photoreceptors. In order to provide a more mechanistic insight into the light-driven effects on cognition ideally one would use a silent substitution approach to distinguish between different photoreceptors. This may be something to consider when designing the follow-up studies. 

      Reviewer #1 (Recommendations For The Authors): 

      (1) As suggested in the public review more information regarding the reasons behind the chosen light condition is needed. 

      While the light characteristics are well documented and EDI calculated for all of the photoreceptors, it is not very clear why these irradiances and spectra were chosen. It would be helpful if the authors explained the logic behind the four chosen light conditions tested. Also, the lights chosen have cone-opic EDI values in a high correlation with the melanopic EDI, therefore we can't distinguish if the effects seen here are driven by melanopsin or cone opsins. In order to provide a more mechanistic insight into the light-driven effects on cognition ideally one would use a silent substitution approach to distinguish between different photoreceptors. 

      (2) In support of this work, it was shown in mice that acute activation of ipRGCs using chemogenetics induces c-fos in some of the hypothalamic brain areas discussed here including LH (Milosavljevic et al, 2016 Curr Biol). Another study to consider including in the discussion is by Sonoda et al 2020 Science, in which the authors showed that a subset of ipRGCs release GABA. 

      (3) Figure 1 looks squashed, especially the axes. Also, Figure 2 looks somewhat blurry. I would suggest that the authors edit the figures to correct this.

      We thank the reviewer for their positive comments and agree with the weaknesses they pointed out. 

      (1) The explanation regarding the choice of the illuminance is now included in the revised manuscript (PAGE 17): “Blue-enriched light illuminances were set according to the technical characteristics of the light source and to keep the overall photon flux similar to prior 3T MRI studies of our team (between ~10¹² and 10¹⁴ ph/cm²/s) (Vandewalle et al., 2010, 2011). The orange light was introduced as a control visual stimulation for potential secondary whole-brain analyses. For the present region of interest analyses, we discarded colour differences between the light conditions and only considered illuminance as indexed by mel EDI lux. This constitutes a limitation of our study as it does not allow attributing the findings to a particular photoreceptor class.”

      The revised discussion makes clear that these choices limit the interpretation about the photoreceptors involved (PAGES 12-13): “We based our rationale and part of our interpretations on ipRGC projections, which have been demonstrated in rodents to channel the NIF biological impact of light and incorporate the inputs from rods and cones with their intrinsic photosensitivity into a light signal that can impact the brain (Güler et al., 2008; Tri & Do, 2019). Given the polychromatic nature of the light we used, classical photoreceptors and their projections to visual brain areas are, however, very likely to have directly or indirectly contributed to the modulation by light of the regional activity of the hypothalamus.”

      The discussion also points out the promises of silent substitution (PAGE 13): “Future human studies could isolate the contribution of each photoreceptor class to the impact of light on cognitive brain functions by manipulating prior light history (Chellappa et al., 2014) or through the use of silent substitutions between metameric light exposures (Viénot et al., 2012)”.

      (2) We now refer to the studies by Milosavljevic et al. and Sonoda et al. 

      PAGE 9: “Our data may therefore be compatible with an increase in orexin release by the LH with increasing illuminance. In line with this assumption, chemoactivation of ipRGCs led to increased c-fos production, a marker of cellular activation, over several nuclei of the hypothalamus, including the lateral hypothalamus (Milosavljevic et al., 2016). If this initial effect of light we observe over the posterior part of the hypothalamus was maintained over a longer period of exposure, this would stimulate cognition and maintain or increase alertness (Campbell et al., 2023) and may also be part of the mechanisms through which daytime light increases the amplitude in circadian variations of several physiological features (Bano-Otalora et al., 2021; Dijk et al., 2012).”

      PAGE 10: “Chemoactivation of ipRGCs in rodents led to increased activity of the SCN, over the inferior anterior hypothalamus, but had no impact on the activity of the VLPO, over the superior anterior hypothalamus (Milosavljevic et al., 2016). How our findings fit with these fine-grained observations and whether there are species-specific differences in the responses to light over the different parts of the hypothalamus remains to be established.”

      PAGE 10: “In terms of chemical communication, these changes in activity could be the result of an inhibitory signal from a subclass of ipRGCs, potentially through the release of gamma-aminobutyric acid (GABA), as a rodent study found that a subset of ipRGCs release GABA at brain targets including the SCN (and intergeniculate leaflet and ventral lateral geniculate nucleus), leading to a reduction in the ability of light to affect pupil size and circadian photoentrainment (Sonoda et al., 2020). Whatever the signalling of ipRGCs, our finding over the anterior hypothalamus could correspond to a modification of GABA signalling of the SCN which has been reported to have excitatory properties, such that the BOLD signal changes we report may correspond to a reduction in excitation arising in part from the SCN (Albers et al., 2017).”

      (3) Figures 1 and 2 were modified. We hope their quality is now satisfactory. We are willing to provide separate figures prior to publication of the Version of Record.

      Reviewer #2 (Public Review): 

      Summary 

      The interplay between environmental factors and cognitive performance has been a focal point of neuroscientific research, with illuminance emerging as a significant variable of interest. The hypothalamus, a brain region integral to regulating circadian rhythms, sleep, and alertness, has been posited to mediate the effects of light exposure on cognitive functions. Previous studies have illuminated the role of the hypothalamus in orchestrating bodily responses to light, implicating specific neural pathways such as the orexin and histamine systems, which are crucial for maintaining wakefulness and processing environmental cues. Despite advancements in our understanding, the specific mechanisms through which varying levels of light exposure influence hypothalamic activity and, in turn, cognitive performance, remain inadequately explored. This gap in knowledge underscores the need for high-resolution investigations that can dissect the nuanced impacts of illuminance on different hypothalamic regions. Utilizing state-of-the-art 7 Tesla functional magnetic resonance imaging (fMRI), the present study aims to elucidate the differential effects of light on the hypothalamic dynamics and establish a link between regional hypothalamic activity and cognitive outcomes in healthy young adults. By shedding light on these complex interactions, this research endeavours to contribute to the foundational knowledge necessary for developing innovative therapeutic strategies aimed at enhancing cognitive function through environmental modulation. 

      Strengths: 

      (1) Considerable Sample Size and Detailed Analysis: The study leverages a robust sample size and conducts a thorough analysis of hypothalamic dynamics, which enhances the reliability and depth of the findings. 

      (2) Use of High-Resolution Imaging: Utilizing 7 Tesla fMRI to analyze brain activity during cognitive tasks offers high-resolution insights into the differential effects of illuminance on hypothalamic activity, showcasing the methodological rigor of the study. 

      (3) Novel Insights into Illuminance Effects: The manuscript reveals new understandings of how different regions of the hypothalamus respond to varying illuminance levels, contributing valuable knowledge to the field. 

      (4) Exploration of Potential Therapeutic Applications: Discussing the potential therapeutic applications of light modulation based on the findings suggests practical implications and future research directions. 

      Weaknesses: 

      (1) Foundation for Claims about Orexin and Histamine Systems: The manuscript needs to provide a clearer theoretical or empirical foundation for claims regarding the impact of light on the orexin and histamine systems in the abstract. 

      (2) Inclusion of Cortical Correlates: While focused on the hypothalamus, the manuscript may benefit from discussing the role of cortical activation in cognitive performance, suggesting an opportunity to expand the scope of the manuscript. 

      (3) Details of Light Exposure Control: More detailed information about how light exposure was controlled and standardized is needed to ensure the replicability and validity of the experimental conditions. 

      (4) Rationale Behind Different Exposure Protocols: To clarify methodological choices, the manuscript should include more in-depth reasoning behind using different protocols of light exposure for executive and emotional tasks. 

      Reviewer #2 (Recommendations For The Authors): 

      Attention to English language precision and correction of typographical errors, such as "hypothalamic nuclei" instead of "hypothalamus nuclei," is necessary for enhancing the manuscript.

      We thank the reviewer for recognising the interest and strength of our study.

      (1) As detailed in the discussion, we do believe orexin and histamine are excellent candidates for mediating the results we report. As also pointed out, however, we are in no position to know which neurons, nuclei, neurotransmitters and neuromodulators underlie the results. The last sentence of the abstract (PAGE 2) was therefore removed as we agree the statement was too strong. We carefully reconsidered the discussion and believe that no such overstatement was present.

      (2) Hypothalamus nuclei are connected to multiple cortical (and subcortical) structures. The relevance of these projections will vary with the cognitive task considered. In addition, we have not yet considered the cortex in our analyses such that truly integrating cortical structures appears premature. 

      We nevertheless added the following short statement (PAGE 11): “Subcortical structures, and particularly those receiving direct retinal projections, including those of the hypothalamus, are likely to receive light illuminance signal first before passing on the light modulation to the cortical regions involved in the ongoing cognitive process (Campbell et al., 2023).”

      (3) We now include the following as part of the method section (PAGES 16-17): “Illuminance and spectra could not be directly measured within the MRI scanner due to the ferromagnetic nature of measurement systems. The coil of the MRI and the light stand, together with the lighting system, were therefore placed outside of the MR room to reproduce the experimental conditions in a completely dark room. A sensor was placed 2 cm away from the mirror of the coil that is mounted at eye level, i.e. where the eye of the first author of the paper would be positioned, to measure illuminance and spectra. The procedure was repeated 4 times for illuminance and twice for spectra and measurements were averaged. This procedure does not take into account interindividual variation in head size and orbit shape such that the reported illuminance levels may have varied slightly across subjects. The relative differences between illuminance levels are, however, very unlikely to vary substantially across participants such that statistics consisting of tests for the impact of relative differences in illuminance were not affected. The detailed values reported in Supplementary Table 2 were computed combining spectra and illuminance using the Excel calculator associated with a published work (Lucas et al., 2014).”

      (4) The explanation regarding the choice of the illuminance is now included in the revised manuscript (PAGE 17): “Blue-enriched light illuminances were set according to the technical characteristics of the light source and to keep the overall photon flux similar to prior 3T MRI studies of our team (between ~10¹² and 10¹⁴ ph/cm²/s) (Vandewalle et al., 2010, 2011). The orange light was introduced as a control visual stimulation for potential secondary whole-brain analyses. For the present region of interest analyses, we discarded colour differences between the light conditions and only considered illuminance as indexed by mel EDI lux. This constitutes a limitation of our study as it does not allow attributing the findings to a particular photoreceptor class.”

      (5) The manuscript was thoroughly rechecked, and we hope to have spotted all typos and language errors.

      Reviewer #3 (Public Review): 

      Summary: 

      Campbell and colleagues use a combination of high-resolution fMRI, cognitive tasks, and different intensities of light illumination to test the hypothesis that the intensity of illumination differentially impacts hypothalamic substructures that, in turn, promote alterations in arousal that affect cognitive and affective performance. The authors find evidence in support of a posterior-to-anterior gradient of increased blood flow in the hypothalamus during task performance that they later relate to performance on two different tasks. The results provide an enticing link between light levels, hypothalamic activity, and cognitive/affective function, however, clarification of some methodological choices will help to improve confidence in the findings. 

      Strengths: 

      * The authors' focus on the hypothalamus and its relationship to light intensity is an important and understudied question in neuroscience. 

      Weaknesses: 

      (1) I found it challenging to relate the authors' hypotheses, which I found to be quite compelling, to the apparatus used to test the hypotheses - namely, the use of orange light vs. different light intensities; and the specific choice of the executive and emotional tasks, which differed in key features (e.g., block-related vs. event-related designs) that were orthogonal to the psychological constructs being challenged in each task. 

      (4) Given the small size of the hypothalamus and the irregular size of the hypothalamic parcels, I wondered whether a more data-driven examination of the hypothalamic time series would have provided a more parsimonious test of their hypothesis. 

      Reviewer #3 (Recommendations For The Authors): 

      (1) The authors may wish to explain the importance of the orange light condition in the early section of the results -- i.e., when they first present the task structure. As it stands, I don't have a good appreciation of why the orange light was included -- was it a control condition? And if the differences between the light conditions (e.g., the narrow- vs. wide-band of light) were indeed ignored by focussing on the illuminance levels, are there any potential issues that the authors could then mitigate against with further experiments/analyses? 

      (2) Are there other explanations for why illuminance levels might improve cognitive performance? For instance, the capacity to more easily perceive the stimuli in an experiment could plausibly make it easier to complete a given task. If this is the case, can the authors conceptualise a way to rule out this hypothesis? 

      (3) Did the authors control for the differences in the number of voxels in each hypothalamic subregion? Or perhaps consider estimating the variance across voxels within the larger parcels, to determine whether the mean time series was comparable to the time series of the smaller parcels? 

      (4) An alternative strategy that would mitigate against the differences in the size of hypothalamic parcels would be to conduct analyses on the hypothalamus without parcellation, but instead using dimensionality reduction techniques to observe the natural spread of responses across the hypothalamus. From the authors' results, my intuition is that these analyses will lead to similar conclusions, albeit without any of the potential issues with respect to differently-sized parcels. 

      We thank the reviewer for acknowledging the originality and interest of our study. We agree that some methodological choices needed more explanation. We will address the weaknesses they pointed out as follows:

      (1) The explanation regarding the choice of the illuminance is now included in the revised manuscript (PAGE 17): “Blue-enriched light illuminances were set according to the technical characteristics of the light source and to keep the overall photon flux similar to prior 3T MRI studies of our team (between ~10¹² and 10¹⁴ ph/cm²/s) (Vandewalle et al., 2010, 2011). The orange light was introduced as a control visual stimulation for potential secondary whole-brain analyses. For the present region of interest analyses, we discarded colour differences between the light conditions and only considered illuminance as indexed by mel EDI lux. This constitutes a limitation of our study as it does not allow attributing the findings to a particular photoreceptor class.”

      The revised discussion makes clear that these choices limit the interpretation about the photoreceptors involved (PAGE 12-13): “We based our rationale and part of our interpretations on ipRGC projections, which have been demonstrated in rodents to channel the NIF biological impact of light and incorporate the inputs from rods and cones with their intrinsic photosensitivity into a light signal that can impact the brain (Güler et al., 2008; Tri & Do, 2019). Given the polychromatic nature of the light we used, classical photoreceptors and their projections to visual brain areas are, however, very likely to have directly or indirectly contributed to the modulation by light of the regional activity of the hypothalamus.”

      We further mention that (PAGE 13): “Furthermore, we cannot exclude that colour and/or spectral differences between the orange and 3 blue-enriched light conditions may have contributed to our findings. Research in a rodent model demonstrated that variation in the spectral composition of light was perceived by the suprachiasmatic nucleus to set circadian timing (Walmsley et al., 2015). No such demonstration has, however, been reported yet for the acute impact of light on alertness, attention, cognition or affective state.”

      Regarding the choice of tasks, we added the following to the method section (PAGE 18): “Prior work of our team showed that the n-back task and emotional task included in the present protocol were successful probes to demonstrate that light illuminance modulates cognitive activity, including within subcortical structures (though resolution did not allow precise isolation of nuclei or subparts) (e.g. (Vandewalle et al., 2007, 2010)). When taking the step of ultra-high-field imaging, we therefore opted for these tasks as our goal was to show that illuminance affects brain activity across cognitive domains while not testing for task-specific aspects of these domains.”

      We further added to the discussion (PAGE 8): “The pattern of light-induced changes was consistent across an executive and an emotional task, which consisted of a block and an event-related fMRI design, respectively. This suggests that a robust anterior-posterior gradient of activity modulation by illuminance is present in the hypothalamus across cognitive domains.”

      (2) We are unsure what the reviewer refers to when they state that the experiment could make it easier to perceive a stimulus. Aside from the fact that illuminance can increase alertness and attention such that a stimulus may be better or more easily perceived/processed, we do not see how blocks of ambient light, i.e. a long-lasting visual stimulus, may render auditory stimulation (letters or pseudo-words in the present study) easier to perceive. To our knowledge, multimodal or cross-modal integration has been robustly demonstrated for short visual/auditory cues that would precede or accompany auditory/visual stimulation. 

      We are willing to clarify this issue in the text if we receive additional explanation from the reviewer.

      (3) We added subpart size as covariate in the analyses (instead of subpart number) and it did not affect the output of the statistical analyses (Author response table 1). 

      For completeness, we further computed the standard deviation of the activity estimates of the voxels within each parcel for the main analysis of the n-back task and found a main effect of subpart (Author response table 2), indicating that the variability of the estimates varied across subparts. Post hoc contrasts and the display included in Author response image 1 show, however, that the differences were not related to subpart size per se. It is in fact the largest subpart (subpart 4) that shows the largest variability, while one of the smallest subparts (subpart 2) shows the lowest variability. Though it may have contributed, it is therefore unlikely to explain our findings. We consider the analyses reported in Author response tables 1 and 2 and Author response image 1 as very technical and did not include them in the supplementary material for conciseness. If the reviewer judges it essential, we can reconsider our decision.
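      The variability check described above amounts to computing, for each parcel, the spread of the voxel-wise activity estimates and setting it against the parcel size. The sketch below illustrates that idea only; the parcel labels and values are made-up toy numbers, not data from the study, and the function name is hypothetical.

```python
from statistics import stdev


def parcel_variability(voxel_estimates):
    """Map each parcel to (number of voxels, sample SD of its voxel estimates).

    Each parcel needs at least two voxel estimates for the SD to be defined.
    """
    return {
        parcel: (len(values), stdev(values))
        for parcel, values in voxel_estimates.items()
    }


# Toy, made-up activity estimates per parcel (illustration only).
toy_estimates = {
    "subpart_a": [0.10, 0.12, 0.11],
    "subpart_b": [0.05, 0.30, -0.10, 0.22, 0.01],
}

summary = parcel_variability(toy_estimates)
for parcel, (n_voxels, sd) in summary.items():
    print(f"{parcel}: n_voxels={n_voxels}, SD={sd:.3f}")
```

      Comparing the SD column with the voxel counts shows whether variability tracks parcel size; in the response, the largest and one of the smallest subparts sit at opposite ends of the variability range.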

      While computing these analyses, we realized that there were errors in Table 1, which reports the statistical outcomes of the main analyses of the emotional task. The main statistical outputs remain the same except for a nominal main effect of the task (emotional vs. neutral) and the fact that post hoc tests show a consistent difference between the posterior subpart (subpart 3) and all the other subparts, rather than all the other subparts except the superior tubular hypothalamus subpart (p-corrected = 0.09). We apologise for this slight error and were unable to isolate its origin. It does not modify the rest of the analyses (which were also rechecked) or the interpretations.

      Author response table 1.

      Recomputations of the main GLMMs using subpart sizes rather than subpart numbers as covariate of interest.

      Author response image 1.

      Activity estimate variability per hypothalamus subpart and subpart size.  

      Author response table 2.

      Difference in activity estimate standard deviation between hypothalamus subparts during the n-back task.

      Outputs of the generalized linear mixed model (GLMM) with subject as the random factor (intercept and slope), and task and subpart as repeated measures (ar(1) autocorrelation).

      * The corrected p-value for multiple comparisons over 2 tests is p < 0.025.

      # Refer to Fig.2A for correspondence of subpart numbers

      The text referring to Table 1 was modified accordingly (PAGE 5): “A nominal main effect of the task was detected for the emotional task [p = 0.049; Table 1] but not for the n-back task. For both tasks, there was no significant main effect for any of the other covariates and post hoc analyses showed that the index of the illuminance impact was consistently different in the posterior hypothalamus subpart compared to the other subparts [pcorrected ≤ 0.05]”.

      (4) We agree that a data-driven approach could have constituted an alternative means to test our hypothesis. We opted for the approach that we master best, while still allowing us to conclusively test for regional differences in activity across the hypothalamus. Examination of the time series of the very same data would mainly confirm the results of our analyses (an anterior-posterior gradient in the impact of illuminance), although it may yield slight differences in the borders of the subparts of the hypothalamus undergoing decreased or increased activity with increasing illuminance. While the suggested approach might have been envisaged had we faced negative results (i.e. no differences between subparts, potentially because the subparts would not reflect functional differences in response to illuminance change), it would here constitute a circular confirmation of our main findings (i.e. using the same data). While we truly appreciate the suggestion, we do not consider that it would constitute a more parsimonious test of our hypothesis, now that we have successfully applied the GLM/parcellation and GLMM approaches.

      We added the following statement to the discussion to take this comment into account (PAGE 12): “Future research may consider data-driven analyses of hypothalamus voxels time series as an alternative to the parcellation approach we adopted here. This may refine the delineation of the subparts of the hypothalamus undergoing decreased or increased activity with increasing illuminance.”

      Response references

      Albers, H. E., Walton, J. C., Gamble, K. L., McNeill, J. K., & Hummer, D. L. (2017). The dynamics of GABA signaling: Revelations from the circadian pacemaker in the suprachiasmatic nucleus. Frontiers in Neuroendocrinology, 44, 35–82. https://doi.org/10.1016/J.YFRNE.2016.11.003

      Bano-Otalora, B., Martial, F., Harding, C., Bechtold, D. A., Allen, A. E., Brown, T. M., Belle, M. D. C., & Lucas, R. J. (2021). Bright daytime light enhances circadian amplitude in a diurnal mammal. Proceedings of the National Academy of Sciences of the United States of America, 118(22), e2100094118. https://doi.org/10.1073/PNAS.2100094118/SUPPL_FILE/PNAS.2100094118.SAPP.PDF

      Campbell, I., Sharifpour, R., & Vandewalle, G. (2023). Light as a Modulator of Non-Image-Forming Brain Functions: Positive and Negative Impacts of Increasing Light Availability. Clocks & Sleep, 5(1), 116. https://doi.org/10.3390/CLOCKSSLEEP5010012

      Chellappa, S. L., Ly, J. Q. M., Meyer, C., Balteau, E., Degueldre, C., Luxen, A., Phillips, C., Cooper, H. M., & Vandewalle, G. (2014). Photic memory for executive brain responses. Proceedings of the National Academy of Sciences of the United States of America, 111(16), 6087–6091. https://doi.org/10.1073/pnas.1320005111

      Dijk, D. J., Duffy, J. F., Silva, E. J., Shanahan, T. L., Boivin, D. B., & Czeisler, C. A. (2012). Amplitude reduction and phase shifts of melatonin, cortisol and other circadian rhythms after a gradual advance of sleep and light exposure in humans. PloS One, 7(2). https://doi.org/10.1371/JOURNAL.PONE.0030037

      Güler, A. D., Ecker, J. L., Lall, G. S., Haq, S., Altimus, C. M., Liao, H. W., Barnard, A. R., Cahill, H., Badea, T. C., Zhao, H., Hankins, M. W., Berson, D. M., Lucas, R. J., Yau, K. W., & Hattar, S. (2008). Melanopsin cells are the principal conduits for rod-cone input to non-image-forming vision. Nature, 453(7191), 102–105. https://doi.org/10.1038/nature06829

      Lucas, R. J., Peirson, S. N., Berson, D. M., Brown, T. M., Cooper, H. M., Czeisler, C. A., Figueiro, M. G., Gamlin, P. D., Lockley, S. W., O’Hagan, J. B., Price, L. L. A., Provencio, I., Skene, D. J., & Brainard, G. C. (2014). Measuring and using light in the melanopsin age. Trends in Neurosciences, 37(1), 1–9. https://doi.org/10.1016/j.tins.2013.10.004

      Milosavljevic, N., Cehajic-Kapetanovic, J., Procyk, C. A., & Lucas, R. J. (2016). Chemogenetic Activation of Melanopsin Retinal Ganglion Cells Induces Signatures of Arousal and/or Anxiety in Mice. Current Biology, 26(17), 2358–2363. https://doi.org/10.1016/j.cub.2016.06.057

      Sonoda, T., Li, J. Y., Hayes, N. W., Chan, J. C., Okabe, Y., Belin, S., Nawabi, H., & Schmidt, T. M. (2020). A noncanonical inhibitory circuit dampens behavioral sensitivity to light. Science (New York, N.Y.), 368(6490), 527–531. https://doi.org/10.1126/SCIENCE.AAY3152

      Do, M. T. H. (2019). Melanopsin and the Intrinsically Photosensitive Retinal Ganglion Cells: Biophysics to Behavior. Neuron, 104, 205–226. https://doi.org/10.1016/j.neuron.2019.07.016

      Vandewalle, G., Hébert, M., Beaulieu, C., Richard, L., Daneault, V., Garon, M. Lou, Leblanc, J., Grandjean, D., Maquet, P., Schwartz, S., Dumont, M., Doyon, J., & Carrier, J. (2011). Abnormal hypothalamic response to light in seasonal affective disorder. Biological Psychiatry, 70(10), 954–961. https://doi.org/10.1016/j.biopsych.2011.06.022

      Vandewalle, G., Schmidt, C., Albouy, G., Sterpenich, V., Darsaud, A., Rauchs, G., Berken, P. Y., Balteau, E., Degueldre, C., Luxen, A., Maquet, P., & Dijk, D. J. (2007). Brain responses to violet, blue, and green monochromatic light exposures in humans: Prominent role of blue light and the brainstem. PLoS ONE, 2(11), e1247. https://doi.org/10.1371/journal.pone.0001247

      Vandewalle, G., Schwartz, S., Grandjean, D., Wuillaume, C., Balteau, E., Degueldre, C., Schabus, M., Phillips, C., Luxen, A., Dijk, D. J., & Maquet, P. (2010). Spectral quality of light modulates emotional brain responses in humans. Proceedings of the National Academy of Sciences of the United States of America, 107(45), 19549–19554. https://doi.org/10.1073/pnas.1010180107

      Viénot, F., Brettel, H., Dang, T.-V., & Le Rohellec, J. (2012). Domain of metamers exciting intrinsically photosensitive retinal ganglion cells (ipRGCs) and rods. Journal of the Optical Society of America A, 29(2), A366. https://doi.org/10.1364/josaa.29.00a366

      Walmsley, L., Hanna, L., Mouland, J., Martial, F., West, A., Smedley, A. R., Bechtold, D. A., Webb, A. R., Lucas, R. J., & Brown, T. M. (2015). Colour As a Signal for Entraining the Mammalian Circadian Clock. PLOS Biology, 13(4), e1002127. https://doi.org/10.1371/journal.pbio.1002127

    2. eLife assessment

      This fundamental work describes the complex interplay between light exposure, hypothalamic activity, and cognitive function. The evidence supporting the conclusion is compelling with potential therapeutic applications of light modulation. The work will be of broad interest to basic and clinical neuroscientists.

    3. Reviewer #1 (Public Review):

      Summary:

      Campbell et al investigated the effects of light on the human brain, in particular on the hypothalamus, a subcortical structure, during auditory cognitive tasks. The mechanisms and neuronal circuits underlying light effects in non-image forming responses have so far mostly been studied in rodents and are not easily translated to humans. Therefore, this is a fundamental study aiming to establish the impact light illuminance has on subcortical structures using high-resolution 7T fMRI. The authors found that parts of the hypothalamus respond differently to illuminance. In particular, they found that the activity of the posterior hypothalamus increases while the activity of the anterior and ventral parts of the hypothalamus decreases under high illuminance. The authors also report that the performance of the 2-back executive task was significantly better in higher illuminance conditions. However, it seems that the activity of the posterior hypothalamus subpart is negatively related to the performance of the executive task, implying that it is unlikely that this part of the hypothalamus is directly involved in the positive impact of light on performance observed. Interestingly, the activity of the posterior hypothalamus was, however, associated with an increased behavioural response to emotional stimuli. This suggests that the role of this posterior part of the hypothalamus is not as simple regarding light effects on cognitive and emotional responses. This study is a fundamental step towards a better understanding of the mechanisms underlying light effects on cognition and, consequently, towards optimising lighting standards.

      Strengths:

      While it is still impossible to distinguish individual hypothalamic nuclei, even with the high-resolution fMRI, the authors split the hypothalamus into five areas encompassing five groups of hypothalamic nuclei. This allowed them to reveal that different parts of the hypothalamus respond differently to an increase in illuminance. They found that higher illuminance increased the activity of the posterior part of the hypothalamus encompassing the MB and parts of the LH and TMN, while decreasing the activity of the anterior parts encompassing the SCN and another part of TMN. These findings are somewhat in line with studies in animals. It was shown that parts of the hypothalamus such as SCN, LH, and PVN receive direct retinal input in particular from ipRGCs. Also, acute chemogenetic activation of ipRGCs was shown to induce activation of LH and also increased arousal in mice.

      Weaknesses:

      While the light characteristics are well documented and EDI calculated for all of the photoreceptors, it is not very clear why these irradiances and spectra were chosen. It would be helpful if the authors explained the logic behind the four chosen light conditions tested. Also, the lights chosen have cone-opic EDI values that are highly correlated with the melanopic EDI, so we cannot distinguish whether the effects seen here are driven by melanopsin and/or other photoreceptors. To provide more mechanistic insight into the light-driven effects on cognition, one would ideally use a silent substitution approach to distinguish between different photoreceptors. This may be something to consider when designing follow-up studies.

    4. Reviewer #2 (Public Review):

      Summary

      The interplay between environmental factors and cognitive performance has been a focal point of neuroscientific research, with illuminance emerging as a significant variable of interest. The hypothalamus, a brain region integral to regulating circadian rhythms, sleep, and alertness, has been posited to mediate the effects of light exposure on cognitive functions. Previous studies have highlighted the role of the hypothalamus in orchestrating bodily responses to light, implicating specific neural pathways such as the orexin and histamine systems, which are crucial for maintaining wakefulness and processing environmental cues. Despite advancements in our understanding, the specific mechanisms through which varying levels of light exposure influence hypothalamic activity and, in turn, cognitive performance, remain inadequately explored. This gap in knowledge underscores the need for high-resolution investigations that can dissect the nuanced impacts of illuminance on different hypothalamic regions. Utilizing state-of-the-art 7 Tesla functional magnetic resonance imaging (fMRI), the present study aims to elucidate the differential effects of light on hypothalamic dynamics and establish a link between regional hypothalamic activity and cognitive outcomes in healthy young adults. By shedding light on these complex interactions, this research endeavours to contribute to the foundational knowledge necessary for developing innovative therapeutic strategies aimed at enhancing cognitive function through environmental modulation.

      Strengths:

      (1) Considerable Sample Size and Detailed Analysis: The study leverages a robust sample size and conducts a thorough analysis of hypothalamic dynamics, which enhances the reliability and depth of the findings.

      (2) Use of High-Resolution Imaging: Utilizing 7 Tesla fMRI to analyze brain activity during cognitive tasks offers high-resolution insights into the differential effects of illuminance on hypothalamic activity, showcasing the methodological rigour of the study.

      (3) Novel Insights into Illuminance Effects: The manuscript reveals new understandings of how different regions of the hypothalamus respond to varying illuminance levels, contributing valuable knowledge to the field.

      (4) Exploration of Potential Therapeutic Applications: Discussing the potential therapeutic applications of light modulation based on the findings suggests practical implications and future research directions.

      The current version of the manuscript addresses previous weaknesses, including details about the illuminance levels, light spectral characteristics used in the MRI study, and light patterns during behavioural tasks. The authors effectively tackle open questions in the field and provide solid evidence that enhances our understanding of the mechanisms underlying the effects of light on cognition.

    5. Reviewer #3 (Public Review):

      Summary:

      Campbell and colleagues use a combination of high-resolution fMRI, cognitive tasks and different intensities of light illumination to test the hypothesis that the intensity of illumination differentially impacts hypothalamic substructures that, in turn, promote alterations in arousal that affect cognitive and affective performance. The authors find evidence in support of a posterior-to-anterior gradient of increased blood flow in the hypothalamus during task performance that they later relate to performance on two different tasks. The results provide an enticing link between light levels, hypothalamic activity and cognitive/affective function; however, clarification of some methodological choices will help to improve confidence in the findings.

      Strengths:

      * The authors' focus on the hypothalamus and its relationship to light intensity is an important and understudied question in neuroscience.

      Weaknesses:

      * I found it challenging to relate the authors' hypotheses, which I found to be quite compelling, to the apparatus used to test the hypotheses - namely, the use of orange light vs. different light intensities; and the specific choice of the executive and emotional tasks, which differed in key features (e.g., block-related vs. event-related designs) that were orthogonal to the psychological constructs being challenged in each task.

      * Given the small size of the hypothalamus and the irregular size of the hypothalamic parcels, I wondered whether a more data-driven examination of the hypothalamic time series would have provided a more parsimonious test of their hypothesis.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The current study aims to quantify associations between the regular use of proton-pump inhibitors (PPI) - defined as using PPI most days of the week during the last 4 weeks at one cross-section in time - and several respiratory outcomes up to several years later. Six respiratory outcomes are included: risk of influenza, pneumonia, COVID-19, and other respiratory tract infections, as well as COVID-19 severity and mortality.

      Strengths:

      Several sensitivity analyses were performed, including i) estimation of the e-value to assess how strong unmeasured confounders should be to explain observed effects, ii) comparison with another drug with a similar indication to potentially reduce (but not eliminate) confounding by indication.

      We are grateful for your pointing out the strengths in our article, particularly the assessment of e-values and the comparison with another medication to mitigate confounding by indication. We extend our sincere gratitude to the reviewer for identifying multiple concerns and offering constructive feedback to help improve our manuscript. We will incorporate these suggestions into our revisions.

      Weaknesses:

      (1) The main exposure of interest seems to be only measured at one time-point in time (at study enrollment) while patients are considered many years at risk afterwards without knowing their exposure status at the time of experiencing the outcome. As indicated by the authors, PPI are sometimes used for only short amounts of time. It seems biologically implausible that an infection was caused by using PPI for a few weeks many years ago.

      We agree with the reviewer that PPIs are sometimes used for only short amounts of time, as indicated in our manuscript. We acknowledge that it is a limitation of the UK Biobank cohort, and we have discussed this in the discussion section as follows:

      “Given that the PPI exposure was mainly assessed at the baseline recruitment, it was possible that a small proportion of PPI users was misclassified during the follow-up due to the medication discontinuation, which may result in an underestimation of potential risk.” (Page 14, Line 8-10)

      In addition, to alleviate these concerns, we have conducted an effect moderation analysis for the subgroup of potential long-term users, defined as participants with indications for PPI use. This information has been included in the discussion section:

      “In addition, no effect moderation was observed in subgroup analyses for the main outcome among PPI users with indications (more likely to regularly use PPIs for a long period) compared to those without indications, indicating the risks remained increased among long-term PPI users.” (Page 14, Line 12-15)

      We hope that in the future, the concerns highlighted by the reviewer can be resolved by utilizing datasets with close follow-up, especially regarding medication use:

      “Since the follow-up prescription data was lacking in our study to precisely identifying the long-term users, further evaluation using cohorts with close follow-up is needed.” (Page 14, Line 15-17)

      (2) Previous studies have shown that by focusing on prevalent users of drugs, one often induces several biases such as collider stratification bias, selection bias through depletion of susceptible, etc.

      Because of the limitations of data from the UK Biobank, such as the absence of details on initiation of medications and regular monitoring, we were restricted to using a prevalent user design to assess the associations between PPI use and respiratory outcomes. We have discussed it in the limitation section:

      “Given that the PPI exposure was mainly assessed at the baseline recruitment, it was possible that a small proportion of PPI users was misclassified during the follow-up due to the medication discontinuation, which may result in an underestimation of potential risk. However, the prevalent user design could underestimate the actual risks of PPI use for respiratory infections, which indicates the real effect might be stronger [38]……Since the follow-up prescription data was lacking in our study to precisely identifying the long-term users, further evaluation using cohorts with close follow-up is needed.” (Page 14, Line 8-17)

      (3) It seems Kaplan Meier curves are not adjusted for confounding through e.g. inverse probability weighting. As such the KM curves are currently not informative (or the authors need to make clearer that curves are actually adjusted for measured confounding).

      Your kind suggestions are greatly appreciated. We have plotted Kaplan Meier curves adjusted for confounding by inverse probability weighting with the measured confounders according to the reviewer’s advice. The methods and results are demonstrated as follows:

      “The event-free probabilities were compared by Kaplan-Meier survival curves with inverse probability weights adjusting for the measured covariates.” (Page 8, Line 13-15)

      “Regular PPI users had lower event-free probabilities for influenza and pneumonia compared to those of non-users (Supplementary Figure 2 A-B).” (Page 9, Line 21-23)

      “PPI users had lower event-free probabilities for COVID-19 severity and mortality, but not COVID-19 positivity compared to those of non-users (Supplementary Figure 2 C-E).” (Page 10, Line 9-10)
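      For readers unfamiliar with the approach, the following is a minimal sketch of a Kaplan-Meier estimator with inverse probability weights. It is not the analysis code used for the manuscript: the function names and toy data are hypothetical, and the propensity scores are assumed to have been estimated beforehand from the measured covariates.

```python
from collections import defaultdict


def ipw_weights(treated, propensity):
    """Inverse probability weights: 1/p for users, 1/(1-p) for non-users.

    propensity holds pre-estimated probabilities of being a user
    (how they are estimated is outside the scope of this sketch).
    """
    return [1.0 / p if z else 1.0 / (1.0 - p) for z, p in zip(treated, propensity)]


def weighted_kaplan_meier(times, events, weights):
    """Return [(time, event-free probability)] at each distinct event time.

    events[i] is 1 if subject i had the outcome at times[i], 0 if censored.
    Event counts and risk sets are sums of weights instead of head counts.
    """
    weighted_events = defaultdict(float)
    weighted_exits = defaultdict(float)  # events plus censorings
    for t, e, w in zip(times, events, weights):
        weighted_exits[t] += w
        if e:
            weighted_events[t] += w

    at_risk = float(sum(weights))
    survival = 1.0
    curve = []
    for t in sorted(weighted_exits):
        if weighted_events[t] > 0:
            survival *= 1.0 - weighted_events[t] / at_risk
            curve.append((t, survival))
        # Everyone who had an event or was censored at t leaves the risk set.
        at_risk -= weighted_exits[t]
    return curve


# Toy check: with unit weights this reduces to the ordinary estimator.
unweighted = weighted_kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1], [1.0] * 4)
doubled = weighted_kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1], [2.0] * 4)
weights = ipw_weights([1, 0], [0.25, 0.25])
```

      In practice one curve is estimated per exposure group, each using the weights of that group's members, so that the two curves are compared in a pseudo-population balanced on the measured covariates.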

      (4) Throughout the manuscript the authors seem to misuse the term multivariate (using one model with e.g. correlated error terms to assess multiple outcomes at once) when they seem to mean multivariable.

      We apologize for confusing the terms “multivariate” and “multivariable” in our previous manuscript. We have corrected the misused terms throughout the manuscript:

      “Univariate and multivariable Cox proportional hazards regression models were utilized to assess the association between regular use of PPIs and the selected outcomes.” (Page 7, Line 19-20)

      “The remaining imbalanced covariates (standardized mean difference ≥ 0.1) after propensity score matching were further adjusted by multivariable Cox regression models to calculate HRs and 95% CIs.” (Page 8, Line 23-25)
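      As an illustration of the balance criterion quoted above, a standardized mean difference for a continuous covariate can be sketched as follows. This is a generic textbook form with a pooled standard deviation, not necessarily the exact variant used in the manuscript, and the function name and toy ages are hypothetical.

```python
import math
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """Absolute SMD for a continuous covariate.

    Uses the square root of the average of the two sample variances
    as the denominator; binary covariates need a different formula.
    """
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2.0)
    if pooled_sd == 0:
        return 0.0
    return abs(mean(treated) - mean(control)) / pooled_sd


# Toy data: ages in matched users and non-users (made up for the example).
smd = standardized_mean_difference([60, 62, 58, 65, 61], [59, 61, 57, 64, 60])
imbalanced = smd >= 0.1  # the threshold quoted in the text
```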

      (5) Given multiple outcomes are assessed there is a clear argument for accounting for multiple testing, which following the logic of the authors used in terms of claiming there is no association when results are not significant may change their conclusions. More high-level, the authors should avoid the pitfall of stating there is evidence of absence if there is only an absence of evidence in a better way (no statistically significant association doesn't mean no relationship exists).

      Following the reviewer’s advice, we have revised our interpretation of the results, particularly those without a statistically significant association, and clearly recognize that the conclusions should be interpreted with caution:

      “In contrast, the risk of COVID-19 infection was not significant with regular PPI use…” (Page 2, Line 11-12)

      “PPI users were associated with a higher risk of influenza (HR 1.74, 95%CI 1.19-2.54), but the risks with pneumonia or COVID-19-related outcomes were not evident.” (Page 2, Line 14-16)

      “…while the effects on pneumonia or COVID-19-related outcomes under PPI use were attenuated when compared to the use of H2RAs.” (Page 2, Line 18-19, in the Abstract)

      “…while their association with pneumonia and COVID-19-related outcomes is diminished after comparison with H2RA use and remains to be further explored.” (Page 15, Line 21-22, in the Conclusion)
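      To make the reviewer's multiple-testing point concrete, the sketch below shows a Bonferroni threshold and the Holm step-down procedure for a family of six outcomes. It is illustrative only: the p-values are invented, and neither correction is reported as having been applied in the manuscript.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure.

    Returns a list of booleans in the original order, True where the
    null hypothesis is rejected while controlling the family-wise error.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The smallest p-value is compared with alpha/m, the next with
        # alpha/(m-1), and so on; stop at the first failure.
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break
    return reject


# Hypothetical p-values for six outcomes (not taken from the study).
p_values = [0.001, 0.04, 0.03, 0.2, 0.008, 0.5]
threshold = bonferroni_threshold(0.05, len(p_values))
decisions = holm_reject(p_values)
```

      In this toy example, two of the three nominally significant results (p < 0.05) no longer count as significant after correction, which is the kind of change in conclusions the reviewer describes.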

      (6) While the authors claim that the quantitative bias analysis does show results are robust to unmeasured confounding, I would disagree with this. The e-values are around 2 and it is clearly not implausible that there are one or more unmeasured risk factors that together or alone would have such an effect size. Furthermore, if one would use the same (significance) criteria as used by the authors for determining whether an association exists, the required effect size for an unmeasured confounder to render effects 'statistically non-significant' would be even smaller.

      We agree with the reviewer that there might still exist one or more unmeasured risk factors that have effect sizes larger than 2. Hence, we cannot affirm that the findings are robust to unmeasured confounding in the current analysis, which is a limitation of our study. We have deleted the previous statement, and added more discussion in the limitation section:

      “Moreover, patients with exacerbations of respiratory disorders (e.g., asthma, COPD) might suffer from a wide range of gastrointestinal symptoms that lead to the use of PPIs [38]. Due to the lack of data for respiratory severity and close follow-up for medication use, residual confounding might still exist due to the observational nature.” (Page 14, Line 23-27)
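      For reference, the E-value discussed in this comment can be computed with the closed-form expression of VanderWeele and Ding, E = RR + sqrt(RR × (RR − 1)). The sketch below is illustrative rather than the authors' code; applying the formula directly to a hazard ratio assumes the outcome is rare enough for the HR to approximate a risk ratio, and the function names are hypothetical.

```python
import math


def e_value(rr):
    """E-value for a risk ratio point estimate."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr  # protective estimates are inverted first
    return rr + math.sqrt(rr * (rr - 1.0))


def e_value_for_ci(lower, upper):
    """E-value for the confidence limit closest to the null.

    Returns 1.0 when the interval already includes 1.
    """
    if lower <= 1.0 <= upper:
        return 1.0
    return e_value(lower if lower > 1.0 else upper)


# HR 1.74 (95% CI 1.19-2.54) is the influenza estimate quoted above.
point = e_value(1.74)
limit = e_value_for_ci(1.19, 2.54)
```

      Under these assumptions, the estimate quoted above gives an E-value of about 2.87 for the point estimate and about 1.67 for the lower confidence limit, which illustrates the reviewer's point that the value needed to remove statistical significance is smaller than the value needed to explain away the point estimate.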

      (7) Some patients are excluded due to the absence of follow-up, but it is unclear how that is determined. Is there potentially some selection bias underlying this where those who are less healthy stop participating in the UK biobank?

      Thank you for your question. The reasons for the absence of follow-up are mainly classified into five categories, including: (1) Death reported to UK Biobank by a relative; (2) NHS records indicate they are lost to follow-up; (3) NHS records indicate they have left the UK; (4) UK Biobank sources report they have left the UK; (5) Participant has withdrawn consent for future linkage. According to the data from UK Biobank (https://biobank.ndph.ox.ac.uk/showcase/field.cgi?id=190), the major reason for the loss of follow-up among participants is their departure from the UK (84.7% of participants who were lost to follow-up). In addition, not including those who were less healthy in the study might also underestimate the risk, leading to lower estimated effects of PPIs for respiratory infections. We have supplemented this in our revised manuscript:

      “Among them, 1,297 participants without follow-up, which were mainly determined by reported death, departure from the UK, or withdrawn consent, had been removed after initial exclusion.” (Page 4, Line 25-27)

      (8) Given that the exposure is based on self-report how certain can we be that patients e.g. do know that their branded over-the-counter drugs are PPI (e.g. guardium tablets)? Some discussion around this potential issue is lacking.

      Thank you for your concerns. In the data collection by the UK Biobank, the participants can enter the generic or trade name of the treatment on the touchscreen to match the medications they used. We have added this important information to the method section:

      “The exposure of interest was regular use of PPIs. The participants could enter the generic or trade name of the treatment on the touchscreen to match the medications they used (Supplementary Table S1).” (Page 5, Line 6-8)

      We acknowledge that specific information on prescribed or over-the-counter use of medications is lacking in the UK Biobank. We have discussed it in the limitation section:

      “Limitations exist in our study. Information on dose and duration of PPI use, discrimination between prescription and over-the-counter use of PPIs, health-seeking behavior, different types of pneumonia, and pneumococcus vaccination is currently not available from the UK Biobank.” (Page 14, Line 5-8)

      (9) Details about the deprivation index are needed in the main text as this is a UK-specific variable that will be unfamiliar to most readers.

      Thank you for your question on the definition of the deprivation index. We have provided the details about the deprivation index in the manuscript:

      “…socioeconomic status (deprivation index, which was defined using national census information on car ownership, household overcrowding, owner occupation, and unemployment combined for postcode areas of residence)…” (Page 6, Line 14-17)

      (10) It is unclear how variables were coded/incorporated from the main text. More details are required, e.g. was age included as a continuous variable and if so was non-linearity considered and how?

      We apologize for not elucidating how variables were incorporated into the main text. Previously, the linearity between continuous variables and outcomes was assessed by Martingale residuals plots, while the variables detected with non-linearity were regarded as categorical variables for further analyses. For example, after evaluation with the Martingale residuals plot, age demonstrated non-linearity, and we incorporated it as a categorical variable for the analysis of COVID-related mortality.

      We have supplemented the information in the method section:

      “The linearity between continuous variables and outcomes was assessed by Martingale residuals plots, while the variables detected with non-linearity were regarded as categorical variables for further analyses.” (Page 6, Line 28 to Page 7, Line 1)

      (11) The authors state that Schoenfeld residuals were tested, but don't report the test statistics. Could they please provide these, e.g. it would already be informative if they report that all p-values are above a certain value.

      We are sorry for not providing the statistics about the Schoenfeld residual in our previous manuscript. We have supplemented the information in our revisions:

      “Schoenfeld residuals tests were used to evaluate the proportional hazards assumptions, while no violation of the assumption was detected (Supplementary Table S3).” (Page 7, Line 27 to Page 8, Line 1)

      (12) The authors would ideally extend their discussion around unmeasured confounding, e.g. using the DAGs provided in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7832226/, in particular (but not limited to) around severity and not just presence/absence of comorbidities.

      Thank you for your insightful suggestion that the discussion about unmeasured confounding should be extended. We agree with the reviewer that, in addition to the comorbidities themselves, their severity could also have an important impact on the use of PPIs. We have added this discussion to the limitation section, citing the article (PMC7832226):

      “Moreover, patients with exacerbations of comorbid disorders (e.g., diabetes, asthma, COPD) might suffer from a wide range of gastrointestinal symptoms that lead to the use of PPIs [38] (Supplementary Figure S4). Due to the lack of data for respiratory severity and close follow-up for medication use, residual confounding might still exist due to the observational nature.” (Page 14, Line 23-27)

      (13) The UK biobank is known to be highly selected for a range of genetic, behavioural, cardiovascular, demographic, and anthropometric traits. The potential problems this might create in terms of collider stratification bias - as highlighted here for example: https://www.nature.com/articles/s41467-020-19478-2 - should be discussed in greater detail and also appreciated more when providing conclusions.

      We acknowledge the reviewer's point about the UK Biobank's highly selective nature potentially leading to collider stratification bias in the evaluation of COVID-19-related outcomes. We have discussed this in detail and are cautious when generating conclusions.

      “Furthermore, the highly selective nature of the UK Biobank might create collider stratification bias for the evaluation of COVID-19-related outcomes, and thus the conclusions should be interpreted with cautions [39].” (Page 15, Line 2-4)

      Reviewer #2 (Public Review):

      Summary:

      Zeng et al investigate in an observational population-based cohort study whether the use of proton pump inhibitors (PPIs) is associated with an increased risk of several respiratory infections among which are influenza, pneumonia, and COVID-19. They conclude that compared to non-users, people regularly taking PPIs have increased susceptibility to influenza, pneumonia, as well as COVID-19 severity and mortality. By performing several different statistical analyses, they try to reduce bias as much as possible, to end up with robust estimates of the association.

      Strengths:

      The study comprehensively adjusts for a variety of critical covariates and by using different statistical analyses, including propensity-score-matched analyses and quantitative bias analysis, the estimates of the associations can be considered robust.

      We are grateful to the reviewer for pointing out the merits of our articles, which include adjusting for a wide range of covariates, employing diverse statistical analyses, and using robust data. We will revise our manuscript further based on the reviewer's suggestions.

      Weaknesses:

      As it is an observational cohort study there still might be bias. Information on the dose or duration of acid suppressant use was not available, but might be of influence on the results. The outcome of interest was obtained from primary care data, suggesting that only infections as diagnosed by a physician are taken into account. Due to the self-limiting nature of the outcome, differences in health-seeking behavior might affect the results.

      Thank you for raising the questions of the dose and duration of acid suppressant use, the source of diagnosis, and the health-seeking behavior of participants. In the UK Biobank, the dose or duration of acid suppressant use was not available because this information was not collected at baseline or follow-up. In addition, the outcome of interest was also retrieved from the hospital ICD diagnosis. We apologize for not clarifying this in our previous manuscript. Moreover, we agree with the reviewer that health-seeking behavior could have an impact on the analyses, but the relevant data are not available from the UK Biobank. We have discussed these points in the methods and limitations sections:

      “Briefly, the first reported occurrences of respiratory system-related conditions within primary care data, and hospital inpatient data defined by the International Classification of Diseases (ICD)-10 codes were categorized by the UK Biobank.” (Page 5, Line 21-25)

      “Limitations exist in our study. Information on dose and duration of PPI use, discrimination between prescription and over-the-counter use of PPIs, health-seeking behavior, different types of pneumonia, and pneumococcus vaccination is currently not available from the UK Biobank.” (Page 14, Line 5-8)

      Reviewer #1 (Recommendations For The Authors):

      Analysis code should be made available.

      Thank you for your question. We have provided the sources of the analysis code we used for this study in our revised manuscript:

      “The codes used in this study can be found at: https://epirhandbook.com/en/ and https://cran.r-project.org/doc/contrib/Epicalc_Book.pdf.” (Page 16, Line 21-22)

      Reviewer #2 (Recommendations For The Authors):

      It might be interesting to study whether including self-reported infections changes the results, as people using PPI may more easily consult their GP even for a self-limiting disease such as influenza and therefore are more likely diagnosed/confirmed with such a respiratory infection.

      Thank you for your insightful suggestion to conduct analyses including self-reported infections. We have included the self-reported cases in sensitivity analyses, and the results were not significantly altered, which supports the robustness of our results:

      “Self-reported infections, except for COVID-19-related outcomes due to the lack of data, were also included for the outcomes as sensitivity analyses. The self-reported cases were reported at the baseline or subsequent UK Biobank assessment center visit.” (Page 8, Line 17-19)

      “Inclusion of the self-reported cases did not significantly alter the results (Supplementary Table S4).” (Page 9, Line 17-18)

      Moreover, to address the above-mentioned, sub-analyses differentiating between over-the-counter and prescribed medication might be interesting.

      Thank you for your suggestion on differentiating between over-the-counter and prescribed medication. We have thoroughly searched the data provided by the UK Biobank, but unfortunately this information is not available. We have discussed this in the limitations section:

      “Information on dose and duration of PPI use, discrimination between prescription and over-the-counter use of PPIs, health-seeking behavior, different types of pneumonia, and pneumococcus vaccination is currently not available from the UK Biobank.” (Page 14, Line 5-8)

    2. eLife assessment

      This useful study aimed to quantify associations between regular use of proton-pump inhibitors (PPI) with the occurrence of respiratory infections, such as influenza, pneumonia, COVID-19, and others over a period of several years. PPI use was associated with increased risks of influenza, pneumonia, but not of COVID-19, although severity and mortality of COVID-19 infections were higher in PPI users. There are inevitable weaknesses of the study design used, such as the fact that PPI use was only measured at one time-point whereas infections were assessed over a long time period, but these are appropriately highlighted in the discussion. Weaknesses are highlighted in the discussion and the study presents convincing evidence for the conclusions overall.

    3. Reviewer #1 (Public Review):

      Summary:

      The current study aims to quantify associations between regular use of proton-pump inhibitors (PPI) - defined as using PPI most days of the week during the last 4 weeks at one cross-section in time - with several respiratory outcomes (6 in total: risk of influenza, pneumonia, COVID-19, other respiratory tract infections, as well as COVID-19 severity and mortality) up to several years later in time.

      Strengths:

      Several sensitivity analyses were performed, including i) estimation of the e-value to assess how strong unmeasured confounders should be to explain observed effects, ii) comparison with another drug with a similar indication to potentially reduce (but not eliminate) confounding by indication, iii)

      Weaknesses:

      While the original submission had several weaknesses, the authors have appropriately addressed all issues raised. The weaknesses that inevitably remain are appropriately highlighted in the discussion. They include the fact that the main exposure of interest is only measured at one time-point whereas outcomes are assessed over a long time period, the inclusion of prevalent users leading to potential bias (e.g. those experiencing bad outcomes already stopping because of side-effects before inclusion in the study), the possibility of unmeasured confounding explaining observations (e.g. severity of underlying comorbidities leading to PPI prescriptions combined with the absence of information about comorbidity severity), and potential selection bias.

    1. Reviewer #2 (Public Review):

      This paper examined how the activity of neurons in the entopeduncular nucleus (EPN) of mice relates to kinematics, value, and reward. The authors recorded neural activity during an auditory-cued two-alternative choice task, allowing them to examine how neuronal firing relates to specific movements like licking or paw movements, as well as how contextual factors like task stage or proximity to a goal influence the coding of kinematic and spatiotemporal features. The data shows that the firing of individual neurons is linked to kinematic features such as lick or step cycles. However, the majority of neurons exhibited activity related to both movement types, suggesting that EPN neuronal activity does not merely reflect muscle-level representations. This contradicts what would be expected from traditional action selection or action specification models of the basal ganglia.

      The authors also show that spatiotemporal variables account for more variability compared to kinematic features alone. Using demixed Principal Component Analysis, they reveal that at the population level, the three principal components explaining the most variance were related to specific temporal or spatial features of the task, such as ramping activity as mice approached reward ports, rather than trial outcome or specific actions. Notably, this activity was present in neurons whose firing was also modulated by kinematic features, demonstrating that individual EPN neurons integrate multiple features. A weakness is that what the spatiotemporal activity reflects is not well specified. The authors suggest some may relate to action value due to greater modulation when approaching a reward port, but acknowledge action value is not well parametrized or separated from variables like reward expectation.

      A key goal was to determine whether activity related to expected value and reward delivery arose from a distinct population of EPN neurons or was also present in neurons modulated by kinematic and spatiotemporal features. In contrast to previous studies (Hong & Hikosaka 2008 and Stephenson-Jones et al., 2016), the current data reveals that individual neurons can exhibit modulation by both reward and kinematic parameters. Two potential differences may explain this discrepancy: First, the previous studies used head-fixed recordings, where it may have been easier to isolate movement versus reward-related responses. Second, those studies observed prominent phasic responses to the delivery or omission of expected rewards - responses largely absent in the current paper. This absence suggests a possibility that neurons exhibiting such phasic "reward" responses were not sampled, which is plausible since in both primates and rodents, these neurons tend to be located in restricted topographic regions. Alternatively, in the head-fixed recordings, kinematic/spatial coding may have gone undetected due to the forced immobility.

      Overall, this paper offers needed insight into how the basal ganglia output encodes behavior. The EPN recordings from freely moving mice clearly demonstrate that individual neurons integrate reward, kinematic, and spatiotemporal features, challenging traditional models. However, the specific relationship between spatiotemporal activity and factors like action value remains unclear.

    2. eLife assessment

      This valuable study reports on electrophysiological recording of the spiking activity of single neurons in the entopeduncular nucleus (EPN) in freely-moving mice performing an auditory discrimination task. The data show that the activity of single EPN neurons is modulated by reward and movement kinematics, with the latter further affected by task contexts (e.g. movement toward or away from a reward location). The results provide solid evidence for the conclusions. Reviewer enthusiasm was reduced by the lack of investigations separating confounding factors and ambiguity as to whether the data contain the population of EPN neurons characterized in previous studies that obtained different results. The work will be of interest to those that study how the basal ganglia contribute to behavior, or the mechanisms of learning and/or movement more broadly.

    3. Reviewer #1 (Public Review):

      The authors in this paper investigate the nature of the activity in the rodent EPN during a simple freely moving cue-reward association task. Given that primate literature suggests movement coding whereas other primate and rodent studies suggest mainly reward outcome coding in the EPNs, it is important to try to tease apart the two views. Through careful analysis of behavior kinematics, position, and neural activity in the EPNs, the authors reveal an interesting and complex relationship between the EPN and mouse behavior.

      Strengths:

      (1) The authors use a novel freely moving task to study EPN activity, which displays rich movement trajectories and kinematics. Given that previous studies have mostly looked at reward coding during head-fixed behavior, this study adds a valuable dataset to the literature.

      (2) The neural analysis is rich and thorough. Both single neuron level and population level (i.e. PCA) analysis are employed to reveal what EPN encodes.

      Weaknesses:

      (1) One major weakness in this paper is the way the authors define the EPN neurons. Without a clear method of delineating EPN vs other surrounding regions, it is not convincing enough to call these neurons EPNs solely from looking at the electrode cannula track from Figure 2B. Indeed, EPN is a very small nucleus and previous studies like Stephenson-Jones et al (2016) have used opto-tagging of Vglut2 neurons to precisely label EPN single neurons. Wallace et al (2017) have also shown the existence of SOM and PV-positive neurons in the EPN. By not using transgenic lines and cell-type specific approaches to label these EPN neurons, the authors miss the opportunity to claim that the neurons recorded in this study do indeed come from EPN. The authors should at least consider showing an analysis of neurons slightly above or below EPN and show that these neurons display different waveforms or firing patterns.

      (2) The authors fail to replicate the main finding about EPN neurons which is that they encode outcome in a negative manner. Both Stephenson-Jones et al (2016) and Hong and Hikosaka (2008) show a reward response during the outcome period where firing goes down during reward and up during neutral or aversive outcome. However, Figure 2 G top panel shows that the mean population is higher during correct trials and lower during incorrect trials. This could be interesting given that the authors might try recording from another part of EPN that has not been studied before. However, without convincing evidence that the neurons recorded are from EPN in the first place (point 1), it is hard to interpret these results and reconcile them with previous studies.

      (3) The authors say that: 'reward and kinematic coding are not mutually exclusive, challenging the notion of distinct pathways and movement processing'. However, it is not clear whether the data presented in this work support this statement. First, the authors have not attempted to record from the entire EPN. Thus it is possible that the coding might be more segregated in other parts of EPN. Second, EPNs have previously been shown to display positive firing for negative outcomes and vice versa, something which the authors do not find here. It is possible that those neurons might not encode kinematic and movement variables. Thus, the authors should point out in the main text the possibility that the EPN activity recorded might be missing some parts of the whole EPN.

      (4) The authors use an IR beam system to record licks and make a strong claim about the nature of lick encoding in the EPN. However, the authors should note that an IR beam system is not the most accurate way of detecting licks, given that any object blocking the path (paw or jaw-dropping) will be detected as a lick event. Capacitance-based detection, closed-loop detection, or video capture is better suited to detecting individual licks. Given that the authors are interested in the kinematics of licking, this is important. The authors should either point this out in the main text or verify in the system whether the IR beam is correctly detecting licks using a combination of those methods.

    1. eLife assessment

      The aim of this important study is to functionally characterize neuronal circuits underlying the escape behavior in Drosophila larvae. Upon detection of a noxious stimulus, larvae follow a series of stereotyped movements that include bending of their body, rolling and crawling away. This paper combines quantitative behavioral analyses, cell-type specific manipulations, optogenetics, calcium imaging, immunostaining, and connectomic analysis to provide convincing evidence of an inhibitory descending pathway that controls the switch from rolling to fast crawling behaviors of the larval escape response.

    2. Reviewer #1 (Public Review):

      Summary:

      Zhu et al. set out to better understand the neural mechanisms underlying Drosophila larval escape behavior. The escape behavior comprises several sequenced movements, including a lateral roll motion followed by fast crawling. The authors specifically were looking to identify neurons important for the roll-to-crawl transition.

      Strengths:

      This paper is clearly written, and the experiments are logical and complementary. They support the authors' main claim that SeIN128 is a type of descending neuron that is both necessary and sufficient to modulate the termination of rolling. In general, the rigor is high.

      Weaknesses:

      - This manuscript is narrowly focused on Drosophila larval escape behavior. It would be more accessible to a broader audience if this work were put into a larger context of descending control.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors have addressed the majority of my comments, and I believe the revised manuscript has improved significantly.

      The escape behavior of Drosophila larvae includes rolling followed by fast crawling, but the neural mechanism of this sequence was unclear. The authors determined the function of SeIN128, a group of descending neurons that terminate rolling and shorten crawling latency. SeIN128 receives inputs from Basin-2 and A00c neurons, which facilitate rolling, and makes reciprocal inhibitory synapses onto Basin-2 and A00c. SeIN128 shows a delayed activity peak upon Basins or A00c stimulation. Gad staining indicates that SeIN128 neurons are GABAergic, and blocking of SeIN128 function caused increased rolling probability and prolonged rolling. RNAi knockdown of GABA receptors in Basins suggests that several GABA receptors, especially GABA-A-R, mediate the SeIN128 to Basins inhibition. Among Basins subtypes, both Basin-2 and Basin-4 facilitate rolling but SeIN128 specifically terminates rolling elicited by Basin-2 activation. Overall, SeIN128 forms a feedback inhibition ensemble with Basin-2 and A00c that terminates rolling and shifts the animal to crawling.

      Overall, this study discovered a neural mechanism that serves as a switch from rolling to fast crawling behaviors in Drosophila larvae. It addressed important open questions of how neural circuits determine the sequence of locomotor behaviors and how animals switch from one behavior to another. Its results support the conclusions and are backed up with proper control experiments.

      Strengths:

      - The question (i.e., the neural circuitry of action selection) addressed by this study is important.
      - Larval and adult Drosophila is a powerful model system in neuroscience study, with rich genetic tools, diverse behaviors, and well-studied nervous systems. This study makes good use of them.
      - The experiments, analyses, and results are rigorous and support the major claims. This study combined multiple innovative approaches, such as automated, machine-learning-based behavioral assays, EM reconstruction of larval CNS neurons, and genetic manipulation of specific neurons. A wide range of control experiments enhanced the credibility of the results.
      - The graphical representations are clear and mindfully arranged.

      Weaknesses:

      I believe "Corkscrew-like rolling" is not an accurate term for larval rolling. The neuromuscular basis of rolling was recently studied by Cooney et al., showing that rolling is the circumferential propagation of muscle activity where all segments contract similarly and synchronously. So using another term instead of "Corkscrew-like rolling" may help.

    4. Reviewer #3 (Public Review):

      Summary:

      Combining the behavioral assays with optogenetics, imaging, and connectome approaches, this meticulous study characterizes the underlying neuronal mechanisms of escape behavior in Drosophila larvae. The authors identify the neurons and provide convincing evidence to support their function in the roll-to-crawl locomotor transition.

      Strengths:

      It is a very thorough characterization of locomotor sequences in terms of underlying neural circuits. The findings shed light on investigating the analogous behaviors in other systems.

      Weaknesses:

      None. The authors have revised the article to improve the presentation and clarity.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      In my opinion, the three most important controls (hopefully easy):

      (1) Include no ATR controls for optogenetic activation experiments (not all, just one or two, e.g., Figure 4B, C, or D, for the highest activation condition). The concern is that it can be quite hard to use light to both monitor neural responses while also using light to activate the function of other neurons.

      We thank the reviewer for the suggestions. We use a 2-photon 910-nm laser (which does not activate Chrimson) for imaging of GCaMP and a 624-nm LED (which does not activate GFP) for Chrimson activation. Calcium (GCaMP) signals are detected by PMT during Chrimson activation. With this setup, we are able to image GCaMP signals without crosstalk during activation of Chrimson.

      We performed calcium imaging in animals that were not fed ATR and found that SS04185 showed no response to LED stimulation at the strongest intensity (µW/mm) (New Figure 4 – figure supplement 1B).

      (2) Demonstrate that their RNAi constructs do indeed knock down the intended target gene. They showed nicely in Figure 5A that SeIN128 expresses GABA. Presumably, these neurons also express VGAT. Is it possible to check the expression of VGAT after RNAi knockdown? The concern is that using only a single RNAi introduces the possibility of off-target effects. Using multiple RNAi lines for VGAT or other parts of the pathway would also alleviate this (minor concern).

      We thank the reviewer for raising this point. We agree that using only one RNAi line (HMS02355) for VGAT in Figure 5A is a weakness. 

      Accordingly, we have performed additional experiments to quantify the effect of RNAi knockdown of VGAT using HMS02355 in all neurons, followed by subsequent immunostaining against GABA or VGAT. We found that both VGAT and GABA were significantly reduced in the neuropil (Figure 5 – figure supplement 1C and D). These data strongly suggest that HMS02355 knocks down VGAT and reduces GABA at axon terminals. We note that HMS02355 has been used previously for knocking down GABA signaling in the following studies.

      (1) Kallman BR, Kim H, Scott K (2015). Excitation and inhibition onto central courtship neurons biases Drosophila mate choice. eLife 4:e11188. https://doi.org/10.7554/eLife.11188

      (2) Zhao W, Zhou P, Gong C et al. (2019). A disinhibitory mechanism biases Drosophila innate light preference. Nat Commun 10, 124. https://doi.org/10.1038/s41467-018-07929-w

      (3) Yamagata N, Ezaki T, Takahashi T, Wu H, Tanimoto H (2021). Presynaptic inhibition of dopamine neurons controls optimistic bias. eLife 10:e64907. https://doi.org/10.7554/eLife.6490

      (3) Include genetic controls for their driver line.

      In Figure 1, it would be nice to see one half or the other half of their split GAL4 line in their manipulations. The concern is that perhaps the phenotype is coming from something unexpected in the genetic background.

      We thank the reviewer for the suggestion. We have added the split-GAL4 half lines (AD only or DBD only) as controls (New Figure 1 – figure supplement 2). We found that SS04185 reduced rolling, whereas the AD-only and DBD-only half lines (split controls) did not.

      In the discussion:

      It seems that activation of SS04185 has additional effects beyond what the authors have quantified. Specifically, larvae do not appear to re-initiate rolling in the same manner as Basin activation alone. Also, there appears to be an off-response, turning.

      We appreciate the reviewer’s comments. We have included a section in the discussion to consider the different patterns of rolling observed during joint stimulation of Basins and SS04185 and during stimulation of Basins alone, as well as the increase in turning following the offset of joint stimulation of Basins and SS04185 compared with stimulation of Basins alone (lines 464 to 481). Although the reasons for these differences are beyond the scope of the paper, we have added Figure 2 – figure supplement 1K, which shows that co-activation of SS04185-MB and Basins is sufficient to evoke turning following the offset of stimulation, suggesting that the increased turning may be due to the activation of SS04185-MB neurons and independent of SS04185-DN neurons.

      The labeling of the Figure panels could be improved. In many places, it is not clear that Basins are being stimulated in the background, whereas in nearby panels, it is clearly labeled. This is confusing for the reader.

      We thank the reviewer for the constructive suggestions. We have modified all relevant figures to read “Basins>Chrimson” above the pink line indicating the period of optogenetic activation.

      Reviewer #2 (Recommendations For The Authors):

      Claims, rigorousness, repeatability, and accuracy of terms.

      (1) In line 254, the authors suggest that the slow response of SeIN128 neurons is due to the input they receive from SEZ, but in line 453, they suggest it is due to axo-axonal connections. However, their evidence does not support one factor over the other. Overall, only the axo-axonal connection was strongly suggested in the discussion. The authors could clarify that the delay of SeIN128 activity may also be caused by multisynaptic connections involving SEZ or other neurons in the last section of the Discussion.

      Although SeIN128 primarily receives inputs from the SEZ, it also receives inputs within the VNC from Basin-2 (Figure 4 – figure supplement 2). Specifically, in the VNC, the axons of SeIN128 make inhibitory synaptic contacts onto the axon of Basin-2, which in turn makes reciprocal excitatory contacts onto the axon of SeIN128, thereby forming a feedback loop. However, when we wrote the original discussion, we inadvertently focused on the potential of the negative feedback loop formed by these axo-axonal synapses in the VNC to mediate the slow response of SeIN128, overlooking the possibility that other as yet unidentified pathways could convey Basin or A00c activity indirectly to SeIN128 dendrites in the SEZ. Therefore, we have revised the original text, which read “These data suggest that the main synaptic inputs onto SeIN128 neurons in the SEZ mediate the slow responses upon activation of Basins or A00c neurons” to “These data suggest that the delay of SeIN128 activity may be caused by multi-synaptic connections involving the SEZ or a feedback loop involving axo-axonal connections between SeIN128 and Basin-2 or A00c” (revised, Lines 259 and 261). Accordingly, we have also adjusted the relevant discussion section to be consistent with this change (Lines 460 and 466).

      (2) Please clarify the following: How does the algorithm define rolling and crawling? Healthy larvae complete 360° rolls, in each roll they rotate from dorsal up to dorsal up. It is possible that a larva rolls for an incomplete cycle and straightens up. Does the algorithm simply label individual frames as “roll”, “non-roll”, or “unknown”, and defines rolling by the existence of “roll” frames? If so, then larvae that rolled for 90° and straightened would be counted as “rolling” though they failed to complete a full rolling bout. Also, how were “hunch” “turn” and “back” identified? Lastly, is there any manual quality control involved? Address this and related issues in the methods:

      a)  Expand the description of the classifier algorithm.

      b)  How are rolling and non-rolling animals defined in the "rolling%" assay? Were all "rolling" animals able to do at least one 360° roll?

      c)  How are "rolling duration" and "end of 1st rolling" defined? Is the algorithm able to distinguish different rolling bouts? In these two assays, were the animals that rolled for <1 second (in total or in their "first roll") able to complete a 360° roll?

      The Multi-worm Tracker (MWT) records only the contours of animals (no real video image data). Thus, the data fed into the classifier algorithm only include features based on contour time-series data. The algorithm uses movement perpendicular to the body axis (the characteristic feature of larval rolling) to classify rollers and non-rollers. Although the algorithm cannot determine whether a rolling event involves a rotation of more than 360 degrees, we ensure that rolling events are at least 360 degrees by removing any events that are shorter than 0.2 s (the minimum time to complete a 360-degree roll).

      We have accordingly revised the section of “Behavior detection” relating to the behavior classification algorithm in the methods section as follows (Lines 600 to 620).

      “After extracting behavioral parameters from Choreography, we used an unsupervised machine learning behavior classification algorithm to detect and quantify the following behaviors: hunching (Hunch), headbending (Turn), stopping (Stop), and peristaltic crawling (Crawl) as previously reported (Masson et al., 2020). Escape rolling (Roll) was detected with a classifier developed using the Janelia Automatic Animal Behavior Annotator (JAABA) platform (Kabra et al., 2013; Ohyama et al., 2015). JAABA transforms the MWT tracking data into a collection of ‘per-frame’ behavioral parameters and regenerates 2D dorsal-view videos of the tracked larvae. Based on such videos, we defined rolling as a rotation around the body while the larva maintains a C-shape, which results in a movement perpendicular to larval body axis (Supplementary videos 1 and 2). Using this definition, we trained the algorithm in the JAABA platform by labeling ~10,000 randomly chosen frames as rolling or non-rolling to develop the rolling classifier. If a larva did not curl into a C-shape or move sideways, it was labeled as a “non-roller.” Every animal with at least one rolling event longer than 0.2 s in a given period was labeled as a “roller” (i.e., it was assumed to have rolled at least 360 degrees), based on the observation that when the start and end of rolling events were precisely measured, the algorithm could identify rolling events completed in 0.2 s.

      The rejection of false positives, especially at the beginning and the end of each rolling bout, enhanced accuracy. The algorithm integrated these training labels and parameters generated with Choreography in a time series, such as speed, crabspeed, and body curvature, to generate a score for rolling detection. Above a certain threshold, the classifier labeled the frame as rolling. This classifier, which has false negative and false positive rates of 7.4% and 7.8%, respectively (n = 102), was utilized to detect rolling in this paper.”
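
      The thresholding and minimum-duration rule described above can be illustrated with a minimal Python sketch. This is not the JAABA classifier or the authors' code: the per-frame scores, the threshold value, the frame rate, and the function names `rolling_events` and `is_roller` are all hypothetical, and treating events of exactly 0.2 s as valid is an assumption. Only the 0.2 s cutoff and the "at least one rolling event" rule come from the text.

```python
from typing import List, Sequence, Tuple


def rolling_events(
    scores: Sequence[float],
    threshold: float,
    fps: float,
    min_duration: float = 0.2,  # seconds; cutoff quoted in the methods text
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices (end exclusive) of rolling events.

    A frame counts as rolling when its score exceeds the threshold.
    Consecutive rolling frames form one event, and events shorter than
    min_duration are discarded.
    """
    events: List[Tuple[int, int]] = []
    start = None
    for i, score in enumerate(scores):
        if score > threshold:
            if start is None:
                start = i  # a new run of rolling frames begins
        elif start is not None:
            if (i - start) / fps >= min_duration:
                events.append((start, i))
            start = None
    # Close a run that lasts until the final frame.
    if start is not None and (len(scores) - start) / fps >= min_duration:
        events.append((start, len(scores)))
    return events


def is_roller(
    scores: Sequence[float],
    threshold: float,
    fps: float,
    min_duration: float = 0.2,
) -> bool:
    """An animal is a roller if it has at least one surviving rolling event."""
    return len(rolling_events(scores, threshold, fps, min_duration)) > 0
```

      With hypothetical scores sampled at 10 frames per second, a single above-threshold frame (0.1 s) would be discarded, whereas three consecutive above-threshold frames (0.3 s) would count as one rolling event and label the animal a roller.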

      Readability of text

      (1) I suggest giving the SS04185 line and SeIN128 neuron common names that are easier to remember and follow (after mentioning their full name once).

      We acknowledge the reviewer’s concerns. However, because SS04185 was initially named using the Janelia split-line pipeline, and SeIN128 was named independently in a more recent study (Ohyama et al., 2015), we have retained these designations in the present manuscript.

      Figures and figure legends

      (1) It would help if the authors could put visual representations of rolling and crawling, such as a cartoon larva performing the rolling-crawling switch, and still frames of rolling and crawling of real larvae, especially in Figure 1. Also, please consider including a video of rolling and crawling in real larvae (preferably comparing control and experimental groups).

      We appreciate the reviewer’s suggestion. We have added a cartoon of the behavioral sequence in Figure 1A, as well as a Figure 1 supplement video based on MWT data, which shows rolling followed by crawling. 

      (2) To give the reader a take-home message, it would help if the authors could make a simplified version of Figure 4A and put it at the end of the paper.

      We thank the reviewer for the suggestion. To assist the reader, we have added schematics depicting how the circuit may function in panel I of Figure 8.

      (3) In Figure 1A, add the text "activation " after the neuron names.

      We have added “Chrimson” following “Basins>” to the new Figure 1B (old Figure 1A) and other figures (Figure 1C and D, Figure 5A, Figure 6A, and figure supplements).

      (4) Figure 1G: a data point is misaligned (at the top of the graph). 

      We have aligned the data point accordingly.

      (5) Figure 1B can benefit from a better design. If possible, please separate the crawling speed into an independent graph (or at least use a different line shape to code for crawling speed and indicate it on the in-graph legend). Is the speed of Basin/SS04185 co-activation studied?

      We appreciate the reviewer’s suggestion. We have separated the plots for rolling and crawling speed into different panels (Figure 1C and D). As shown in Figure 1D, the crawling speed observed during coactivation of Basins and SS04185 was similar to that during activation of Basins alone.

      (6) Figure S1 uses a different color-coding scheme from Figure 1. I suggest making the color coding consistent between figures.

      We are grateful for the reviewer’s suggestion. We have adjusted the color-coding scheme accordingly.

      (7) Line 692 (Figure 2 legend), "Killer Zipper" is misspelled as "Kipper Zipper". Out of curiosity, is there a way to remove or reduce SS04185-DN expression in the same manner as SS04185-MB reduction?

      We have corrected the text in the legend for Figure 2. As for the reviewer’s question, we did attempt to reduce or abolish SS04185-DN expression with tsh-LexA and LexAop-Kip+ but found no effect. Other identified LexA constructs with SeIN128 expression, however, all showed SS04185-MB expression. Consequently, we could not use these constructs because they inhibit both SeIN128 and SS04185-DN.

      (8) The color coding of Figure 2 (especially in D) makes it hard to distinguish between the brown and red groups.

      We thank the reviewer for the suggestion. Accordingly, we have changed the color for the brown group to orange.

      (9) In line 926 (Figure S2 legends), the description of F and G seems inverted.

      We thank the reviewer for pointing out the error. We have revised the text from “(F) has only SS04185-MB expression, and (G) has both SS04185-DN and SS04185-MB expression” to “(F) has both SS04185-DN and SS04185-MB expression, and (G) has only SS04185-MB expression.”

      (10) Figure 7B: which line does the top group of asterisks belong to?

      The top group of asterisks indicates that each experimental group differs significantly (p < 0.001) from the control group. We have revised the figure to clarify the comparisons indicated by the asterisks in Figure 7B, as well as the figure legend below (Line 890-894).

      “(B) Cumulative plot of rolling duration. Statistics: Kruskal-Wallis test: H = 69.52, p < 0.001; Bonferroni-corrected Mann-Whitney test, p < 0.001 between control and the GABA-B-R11, GABA-B-R12 and GABA-B-R2 RNAi groups, p < 0.001 between GABA-A-R and all other experimental RNAi groups. Sample size for the colored bars from top (control, black) to bottom (GABA-A-R, red); n = 520, 488, 387, 582, 306.”

      (11) Figure S8 D and F: indicate Basin-2 or Basin-4 activation on graph.

      We have revised Figure 8 – figure supplement D and F accordingly.

      Reviewer #3 (Recommendations For The Authors):

      (1) Lines 86-87: Text needs to be rewritten for clarity. Also, include the genotype in the corresponding figure legend (Figure 1B).

      We thank the reviewer for pointing this out. We have clarified the text accordingly and included the genotype in the figure legend (lines 86 and 87). Specifically, we have revised Figure 1B (New Figure 1C and D) and adjusted the legend accordingly as follows. 

      Lines 86 and 87: Crawling speed during the activation of all Basins following rolling was ~1.5 times that of the crawling speed at baseline (Figure 1D).

      (2) Include the protocol for heat shock-FLP out experiments

      We have added the following paragraph to the Methods section describing the heat shock-FlpOut experiments (lines 537 to 546).

      “Heat shock FlpOut mosaic expression

      First instar Drosophila larvae were exposed to heat shock in a water bath at 37°C for 12 min as previously described (Nern et al., 2015). With precise temporal and thermal control of heat shock, larvae with genotype w+, hs(KDRT.stop)FLP/13xLexAop2-IVS-CsChrimson::tdTomato; R54B01-Gal4.AD/72F11LexA;20xUAS-(FRT.stop)-CsChrimson::mVenus/R46E07-Gal4.DBD showed sporadic CsChrimson::mVenus expression driven by SS04185 split GAL4. As a result, the ratio of the larvae with SS04185-DN and SS04185-MB expression to those with only SS04185-MB expression was 1:1. Each larva was individually examined with optogenetic stimulation and behavior analysis. After behavioral experiments, mVenus expression in CNS was confirmed under the fluorescence microscope.”

      (3) In the immunohistochemistry, the authors exclude the steps for washings. Recommend the authors to cite the previous literature. Similar to the other protocols detailed in the methods.

      We have added a brief description of the steps involved in washing (lines 641 and 648). We have also provided a citation with similar immunohistology protocols (Patel, 1994).

      (4) Keeping the same Y-axis scale for similar graphical representation would be helpful to compare across different experimental conditions and genotypes-for example, 2E and 2H for the start of the first crawl.

      As suggested by the reviewer, we have adjusted the y-axis scales for Figure 2E and H to be identical.

      (5) The color schematics used for the graph make it hard to visualize the data. The author might reconsider the better presentation of the data by avoiding darker colors.

      We thank the reviewer for the constructive suggestion. We have lightened the shading of all violin plots. We have also modified the shading for the middle group in Figure 2C and E from dark brown to orange.

      (6) Co-activation of the SS04185 and Basins in the figures represented as Basins+SS04185 (Figure 1A) and SS04185 (rest of the figures). Authors might reconsider this terminology to define and distinguish the coactivation of SS04185 and Basins neurons from the activation of SS04185 or Basins alone. It needs to be clarified in the figures.

      We have adjusted the terminology by including “Basins>Chrimson” in all panels in which Basin neurons are optogenetically activated to trigger rolling in the background for all groups. Additionally, we have labeled the control group as “Control” and the experimental group as “SS04185”.

      (7) Figure 4A, summarizes the synaptic connection and strength between different neurons - SeIN128, Basins, A00c and mdIV. However, the nature of these synaptic connections - excitatory and inhibitory- is not represented. Based on the previous and current studies, the authors consider providing the schematic for circuit mechanisms of escape behavior sequences in larvae. Also, discussing these findings in light of the downstream output circuit and motor regulation might be informative (See Cooney et al. 2023, PNAS).

      As the reviewer correctly points out, the diagram of the connectome shown in Figure 4A does not indicate whether the connections are excitatory or inhibitory. Accordingly, we have added a new summary panel (Figure 8I) based on the results of examining GABAergic synapses (Figure 5A). The schematics in Figure 8I depict how the joint activity of inhibitory and excitatory synapses (indicated by arrowheads and blunt ends, respectively) may lead to rolling or fast crawling.

      We have also added a section to the Discussion on the premotor circuits for crawling and rolling (Lines 512–519).

      (8) Percentage rolling present in figure 5B and 6A correspond to the control larvae 13xLexAop2-IVS-CsChrimson::mVenus; R72F11-lexA/+; HMS02355/+ and 13xLexAop2-IVS- Cs-Chrimson::mVenus; R72F11-lexA/+; UAS-TeTxLC.tnt/+. How does the author interpret the observed variability across the experiments? The author might consider discussing the genetic background effect on the observed behaviors, if any.

      As pointed out by the reviewer, we noticed that rolling probability varied depending on genetic background. We have revised the text accordingly (Lines 277 to 280).

      (9) Recheck the arrowheads in Figure 5A.

      We have confirmed the positions of the arrowheads in Figure 5A and modified the figures by outlining the cells with dotted lines.

      (10) Lines 295-298: Data presented in the supplementary figure and p-values in the text (p=0.11) suggest that the first crawl's onset is comparable to controls. Rewrite this text for clarity and include the statistical values in the supplemental figure 6.

      We have revised the text as follows (Lines 302 to 305).

      “Although the duration of each rolling bout, time to onset of the first rolling bout, and time to onset of the first crawling bout did not differ from those of controls (Figure 6–figure supplement 1D, E and G), the time to offset of the first rolling bout was delayed relative to controls (p = 0.013 for Figure 6–figure supplement 1F).”

      (11) Lines 263-264: Data provide evidence for SS04185 receiving inputs Basin-2 and A00c neurons. SS04185, which provides inputs to other neurons, specifically A00c neurons, but still needs clarification.

      We have revised the text as follows (Lines 264 to 266).

      The results thus far indicate that activation of SeIN128 neurons inhibits rolling (Figure 1A–C); SeIN128 neurons receive functional inputs from Basin-2 and A00c (Figure 4A–C); and SeIN128 neurons make anatomical connections onto Basin-2 and A00c (Figure 4A).

      (12) In the table that lists the genotypes, instead of '-' or the blank space in the label column, the author might consider using 'control,' consistent with the figures.

      In accord with the reviewer’s suggestion, we have revised the notation of ‘-’ or the blank space, to ‘control’ for all figures.

      (13) Check the typographical errors throughout the manuscript. Some below:

      We have revised the text accordingly as suggested below.

      a.  Lines 100, 142: SS4185 should be SS04185

      b.  Line 230: A00C should be A00c

      c.  Line 180: Expand VNC

      d.  10xUAS-IVS-mry::GFP should be 10xUAS-IVS-myr::GFP

      e.  Lines 444, 449: drosophila should be Drosophila

    1. eLife assessment

      This important paper shows that the anti-gremlin-1 (GREM1) antibody is not effective at treating liver inflammation or fibrosis. Critically, the evidence also challenges existing data on the detection of GREM1 by ELISA in serum or plasma by demonstrating that high-affinity binding of GREM1 to heparin would lead to localisation of GREM1 in the ECM or at the plasma membrane of cells. The conclusions are supported by a convincing, well-controlled set of experiments.

    2. Reviewer #1 (Public Review):

      Summary:

      Horn and colleagues present data suggesting that the targeting of GREM1 has little impact on a mouse model of metabolic dysfunction-associated steatohepatitis. Importantly, they also challenge existing data on the detection of GREM1 by ELISA in serum or plasma by demonstrating that high-affinity binding of GREM1 to heparin would lead to localisation of GREM1 in the ECM or at the plasma membrane of cells.

      Strengths:

      This is an impressive tour-de-force study around the potential of targeting GREM1 in MASH.

      This paper will challenge many existing papers in the field around our ability to detect GREM1 in circulation, at least using antibody-mediated detection.

      Well-controlled, detailed studies like this are critically important in order to challenge less vigorous studies in the literature.

      The impressive volume of high-level, well-controlled data using an impressive range of in vitro biochemical techniques, rodent models, and human liver slices.

      Weaknesses: only minor.

      (1) The authors clearly show that heparin can limit the diffusion of GREM1 into the circulation. However, in a setting where GREM1 is produced in excess (e.g. cancer), could this "saturate" the available heparin and allow GREM1 to "escape" into the circulation?

      (2) Secondly, have the authors considered that GREM1 may be circulating bound to a chaperone protein like albumin, which would reduce its reactivity with GREM1 detection antibodies?

      (3) Statistics: there is no mention of blinding of samples. I assume this was done prior to analysis?

      (4) Line 211: I suggest adding the Figure reference at the end of this sentence to direct the reader to the relevant data.

      (5) Figure 1E: the Y-axis units are a little hard to interpret. Can integers be used?

      (6) Did the authors attempt to detect GREM1 protein by IHC? There are published methods for this using the R&D Systems mouse antibody (PMID 31384391).

      (7) Did the authors ever observe GREM1 internalisation using their Atto-532 labelled GREM1?

      (8) Did the authors complete GREM1 ISH in the rat CDAA-HFD model? Was GREM1 upregulated, and if so, where?

      (9) Supplementary Figure 4C: why does the GFP level decrease in the GREM1 transgenic compared to the control GFP mouse? No such change is observed in Supplementary Figure 4E.

    3. Reviewer #2 (Public Review):

      It is controversial whether liver gremlin-1 expression correlates with liver fibrosis in metabolic dysfunction-associated steatohepatitis (MASH). Horn et al. developed an anti-Gremlin-1 antibody in-house and tested its ability to neutralize gremlin-1 and treat liver fibrosis. This article has the advantage of testing its hypothesis with different animal and human liver fibrosis models and using a variety of research methodologies.

      The experimental design and results support the conclusion that the anti-gremlin-1 antibody had no therapeutic effect on treating liver fibrosis, so there are no other suggestions for new experiments:

      (1) The authors used RNAscope in situ hybridization to establish the correlation between Gremlin-1 expression and MASH livers or cell lines.

      (2) A luminescent oxygen channelling immunoassay was used to measure circulating Gremlin-1 concentration. They found that Gremlin-1 binds to heparin very efficiently, preventing Gremlin-1 from entering circulation, and restricting Gremlin-1's ability to mediate organ cross-communication.

      (3) The authors developed a suitable rat model of MASH, in which rats are fed a choline-deficient, L-amino acid-defined, high-fat diet with 1% cholesterol (CDAA-HFD), and created a selective, heparin-displacing anti-Gremlin-1 antibody (0030:HD). They also used human cirrhotic precision-cut liver slices to test their hypotheses. They demonstrated that neutralization of Gremlin-1 activity with monoclonal therapeutic antibodies does not reduce liver inflammation or liver fibrosis.

      One concern is that several reagents and assays are made in-house without external validation. Also, will those in-house reagents and assays be available to the science community?

      Overall this manuscript provides useful information that gremlin-1 has a limited role in liver fibrosis pathogenesis and treatment.

    4. Author response:

      Reviewer #1 (Public Review):

      Summary:

      Horn and colleagues present data suggesting that the targeting of GREM1 has little impact on a mouse model of metabolic dysfunction-associated steatohepatitis. Importantly, they also challenge existing data on the detection of GREM1 by ELISA in serum or plasma by demonstrating that high-affinity binding of GREM1 to heparin would lead to localisation of GREM1 in the ECM or at the plasma membrane of cells.

      Strengths:

      This is an impressive tour-de-force study around the potential of targeting GREM1 in MASH.

      This paper will challenge many existing papers in the field around our ability to detect GREM1 in circulation, at least using antibody-mediated detection.

      Well-controlled, detailed studies like this are critically important in order to challenge less vigorous studies in the literature.

      The impressive volume of high-level, well-controlled data using an impressive range of in vitro biochemical techniques, rodent models, and human liver slices.

      We thank the reviewer for their time in assessing our manuscript and are very grateful for the positive response. Below, we give a point-by-point response to the reviewer’s comments and indicate where we plan to adjust the manuscript.

      Weaknesses: only minor.

      (1) The authors clearly show that heparin can limit the diffusion of GREM1 into the circulation. However, in a setting where GREM1 is produced in excess (e.g. cancer), could this "saturate" the available heparin and allow GREM1 to "escape" into the circulation?

      We thank the reviewer for their question. Indeed, in theory, if the production of Gremlin-1 exceeds the capacity of heparin to immobilise Gremlin-1, the protein may be released into solution and thus may enter the circulation. Whilst we have not addressed this possibility in our studies, we agree that it may be a mechanism worth exploring in future studies.

      (2) Secondly, have the authors considered that GREM1 may be circulating bound to a chaperone protein like albumin, which would reduce its reactivity with GREM1 detection antibodies?

      We have considered the possibility that Gremlin-1 binds other proteins, such as BMPs, and thereby masks the epitopes recognised by the assay antibodies. To minimise this possibility, we used antibody pairs which bind different epitopes. We also used LC-MS for Gremlin-1 detection (data not shown in the manuscript), a method that is not affected by epitope masking. With the LC-MS analysis we did not pick up any Gremlin-1 signal in plasma. We will mention the LC-MS data in the updated manuscript.

      Also, we were able to detect circulating Gremlin-1 after treatment with anti-Gremlin-1 antibodies. As these were the same antibodies that were used in our assays, we should not have been able to detect Gremlin-1 if there had been a masking interaction with circulating, highly abundant plasma proteins such as albumin.

      Finally, we believe that the assay antibodies would outcompete binding of any other proteins because of their high affinity and the very high concentrations used in the assays.

      In summary, we are very confident that Gremlin-1 is not present in circulation. We will, however, make some minor adjustments to the manuscript in order to stress this important point.

      (3) Statistics: there is no mention of blinding of samples. I assume this was done prior to analysis?

      All reported results were derived from hard quantitative readouts obtained through assays that are not liable to subjective interpretation. This also applies to immunohistochemistry and RNAscope histologic quantification, using Visiopharm Integrator System software ver. 8.4 or HALO v3.5.3577 (Area Quantification v2.4.2 module), respectively. Therefore, no blinding was necessary prior to analysis.

      (4) Line 211: I suggest adding the Figure reference at the end of this sentence to direct the reader to the relevant data.

      We thank the reviewer for the suggestion and will add a reference to Figure 1F here.

      (5) Figure 1E: the Y-axis units are a little hard to interpret. Can integers be used?

      As the y axis in Figure 1E is on the logarithmic scale, integer numbers would be very hard to read because of the large range of numbers. As we acknowledge that the notation used may be difficult to read, we will change it to superscript scientific notation.

      (6) Did the authors attempt to detect GREM1 protein by IHC? There are published methods for this using the R&D Systems mouse antibody (PMID 31384391).

      Parallel to the work described in PMID 31384391 (Dutton et al., Oncotarget, 10: 4630-4639, 2019), we have tested a whole range of commercial and in-house gremlin-1 antibodies. We independently arrived at the same conclusion as Dutton et al., namely that the goat anti-gremlin antibody R&D Systems AF956 can stain the muscularis layer and the crypts/lower part of the villi of the mouse or rat intestine in FFPE sections. Like Dutton et al., we corroborated this IHC staining by RNAscope: the mRNA was restricted to the muscularis and the connective tissue just below the crypts, suggesting that Gremlin-1 partially diffuses away from the cells that produce it. In contrast, none of the other commercial or in-house gremlin antibodies that we tested provided any useful staining on FFPE sections.

      We also used the R&D Systems AF956 antibody on several rat MASH liver samples. We saw little or no staining in livers from chow-fed rats, with only occasional weak staining around portal areas. Depending on the rat model, staining ranged from little or none to, at most, weak staining in portal and fibrotic areas. Among the various models tested, we observed the strongest staining in the rat CDAA-HFD+cholesterol model, in line with the ISH data.

      However, we were unable to establish IHC on human MASH liver samples using the R&D Systems AF956 antibody (or any other antibody) despite 98% sequence identity at the amino acid level between human and rat gremlin-1. Considering the results in Dutton et al. on rodent intestines, we tested the antibody on some human intestine samples, but the results on the available samples (inflamed appendices) were inconclusive.

      We will include representative IHC staining images for Gremlin-1 protein on rat livers as a Supplementary Figure and mention in the manuscript that IHC for human Gremlin-1 did not work with the available antibodies.

      (7) Did the authors ever observe GREM1 internalisation using their Atto-532 labelled GREM1?

      The Atto-532 Gremlin-1 cell association assay was mainly intended to visualise the association of Gremlin-1 with cell surface proteoglycans and how this interaction is affected by heparin-displacing and non-displacing antibodies. We observed a possible, but inconclusive intracellular association of Atto-532 Gremlin-1. However, this assay was not specifically designed for this purpose, and we did not follow up on this. Therefore, we cannot draw any conclusions on whether cell surface bound Gremlin-1 can be internalised. However, we appreciate that internalisation of Gremlin-1 would be an interesting biological mechanism worth following up in future studies.

      (8) Did the authors complete GREM1 ISH in the rat CDAA-HFD model? Was GREM1 upregulated, and if so, where?

      We have performed Grem1 ISH in the rat CDAA-HFD model and representative images of this are shown in Figure 1F. In chow-fed animals, Grem1 was expressed in a few cells in the portal tract, whereas after CDAA-HFD, Grem1 positive cells became more abundant in the portal tract and were also detectable in the fibrotic septa, as described in the respective results section. However, we performed no co-staining with other markers as we did for human liver samples.

      (9) Supplementary Figure 4C: why does the GFP level decrease in the GREM1 transgenic compared to the control GFP mouse? No such change is observed in Supplementary Figure 4E.

      In Supplementary Figure 4C we show expression of GFP mRNA and GREM1 mRNA in lysates of GFP-control and GREM1-GFP overexpressing LX-2 cells. The x-axis labels indicate the different lentiviruses. Therefore, the right panel in Supplementary Figure 4C shows that GREM1 overexpressing LX-2 cells expressed more GREM1 compared to GFP-control transduced LX-2, while GFP mRNA expression was comparable between the two.

      The results in Supplementary Figure 4E look different because – as can also be seen from the % of GFP+ cells in Supplementary Figure 4D – the GREM1 lentivirus here was more effective in transducing the cells, which is why both GFP and GREM1 mRNA were increased with GREM1 lentivirus compared to the GFP-only control. Unlike LX-2, the lentivirally transduced HHSC were not sorted on GFP positive cells prior to qPCR, which may explain the differences in GFP mRNA expression pattern between the two cell types.

      We acknowledge that the figure may be difficult to interpret and will adjust the figure annotation to improve on this.

      Reviewer #2 (Public Review):

      It is controversial whether liver gremlin-1 expression correlates with liver fibrosis in metabolic dysfunction-associated steatohepatitis (MASH). Horn et al. developed an anti-Gremlin-1 antibody in-house and tested its ability to neutralize gremlin-1 and treat liver fibrosis. This article has the advantage of testing its hypothesis with different animal and human liver fibrosis models and using a variety of research methodologies.

      The experimental design and results support the conclusion that the anti-gremlin-1 antibody had no therapeutic effect on treating liver fibrosis, so there are no other suggestions for new experiments:

      (1) The authors used RNAscope in situ hybridization to establish the correlation between Gremlin-1 expression and MASH livers or cell lines.

      (2) A luminescent oxygen channelling immunoassay was used to measure circulating Gremlin-1 concentration. They found that Gremlin-1 binds to heparin very efficiently, preventing Gremlin-1 from entering circulation, and restricting Gremlin-1's ability to mediate organ cross-communication.

      (3) The authors developed a suitable rat model of MASH, in which rats are fed a choline-deficient, L-amino acid-defined, high-fat diet with 1% cholesterol (CDAA-HFD), and created a selective, heparin-displacing anti-Gremlin-1 antibody (0030:HD). They also used human cirrhotic precision-cut liver slices to test their hypotheses. They demonstrated that neutralization of Gremlin-1 activity with monoclonal therapeutic antibodies does not reduce liver inflammation or liver fibrosis.

      One concern is that several reagents and assays are made in-house without external validation. Also, will those in-house reagents and assays be available to the science community?

      Overall this manuscript provides useful information that gremlin-1 has a limited role in liver fibrosis pathogenesis and treatment.

      We thank the reviewer for their time in assessing our manuscript and are very grateful for the positive response. We acknowledge that most of our results were derived from assays using in-house generated reagents, which will therefore be hard to reproduce externally. Whilst for legal reasons we cannot share the sequences of the monoclonal antibodies, we will be able to share aliquots with fellow scientists upon request. We will add a sentence to this effect to the data availability statement.

    1. eLife assessment

      The study by Asabuki et al. is a valuable contribution to understanding how cortical neural networks encode internal models into spontaneous activity. It uses a recurrent network of spiking neurons subject to predictive learning principles and provides a novel mechanism to learn the spontaneous replay of probabilistic sensory experiences. While promising in its ability to explain spontaneous network dynamics, the manuscript is incomplete in terms of the strength of support for its main findings. The difference of the proposed sampling dynamics from Markovian types of sampling is unclear and the use of non-negative synaptic strengths is applied in a non-biological manner.

    2. Reviewer #1 (Public Review):

      In their manuscript, the authors propose a learning scheme to enable spiking neurons to learn the appearance probability of inputs to the network. To this end, the neurons rely on error-based plasticity rules for feedforward and recurrent connections. The authors show that this enables the networks to spontaneously sample assembly activations according to the occurrence probability of the input patterns they respond to. They also show that the learning scheme could explain biases in decision-making, as observed in monkey experiments. While the task of neural sampling has been solved before in other models, the novelty here is the proposal that the main drivers of sampling are within-assembly connections, and not between-assembly (Markov chains) connections as in previous models. This could provide a new understanding of how spontaneous activity in the cortex is shaped by synaptic plasticity.

      The manuscript is well written and the results are presented in a clear and understandable way. The main results are convincing, concerning the spontaneous firing rate dependence of assemblies on input probability, as well as the replication of biases in the decision-making experiment. Nevertheless, the manuscript and model leave open several important questions. The main problem is the unclarity, both in theory and intuitively, of how the sampling exactly works. This also makes it difficult to assess the claims of novelty the authors make, as it is not clear how their work relates to previous models of neural sampling.

      Regarding the unclarity of the sampling mechanism, the authors state that within-assembly excitatory connections are responsible for activating the neurons according to stimulus probability. However, the intuition for this process is not made clear anywhere in the manuscript. How do the recurrent connections lead to the observed effect of sampling? How exactly do assemblies form from feedforward plasticity? This intuitive unclarity is accompanied by a lack of formal justification for the plasticity rules. The authors refer to a previous publication from the same lab, but it is difficult to connect these previous results and derivations to the current manuscript. The manuscript should include a clear derivation of the learning rules, as well as an (ideally formal) intuition of how this leads to the sampling dynamics in the simulation.

      Some of the model details should furthermore be cleared up. First, recurrent connections transmit signals instantaneously, which is implausible. Is this required, would the network dynamics change significantly if, e.g., excitation arrives slightly delayed? Second, why is the homeostasis on h required for replay? The authors show that without it the probabilities of sampling are not matched, but it is not clear why, nor how homeostasis prevents this. Third, G and M have the same plasticity rule except for G being confined to positive values, but there is no formal justification given for this quite unusual rule. The authors should clearly justify (ideally formally) the introduction of these inhibitory weights G, which is also where the manuscript deviates from their previous 2020 work. My feeling is that inhibitory weights have to be constrained in the current model because they have a different goal (decorrelation, not prediction) and thus should operate with a completely different plasticity mechanism. The current manuscript doesn't address this, as there is no overall formal justification for the learning algorithm.

      Finally, the authors should make the relation to previous models of sampling and error-based plasticity more clear. Since there is no formal derivation of the sampling dynamics, it is difficult to assess how they differ exactly from previous (Markov-based) approaches, which should be made more precise. Especially, it would be important to have concrete (ideally experimentally testable) predictions on how these two ideas differ. As a side note, especially in the introduction (line 90), this unclarity about the sampling made it difficult to understand the contrast to Markovian transition models.

      There are also several related models that have not been mentioned and should be discussed. In 663 ff. the authors discuss the contributions of their model which they claim are novel, but in Kappel et al (STDP Installs in Winner-Take-All Circuits an Online Approximation to Hidden Markov Model Learning) similar elements seem to exist as well, and the difference should be clarified. There is also a range of other models with lateral inhibition that make use of error-based plasticity (most recently reviewed in Mikulasch et al, Where is the error? Hierarchical predictive coding through dendritic error computation), and it should be discussed how the proposed model differs from these.

    3. Reviewer #2 (Public Review):

      Summary:

      The paper considers a recurrent network with neurons driven by external input. During the external stimulation, predictive synaptic plasticity adapts the forward and recurrent weights. It is shown that after the presentation of constant stimuli, the network spontaneously samples the states imposed by these stimuli. The probability of sampling stimulus x^(i) is proportional to the relative frequency of presenting stimulus x^(i) among all stimuli i=1,..., 5.

      Methods:

      Neuronal dynamics:

      For the main simulation (Figure 3), the network had 500 neurons, and 5 non-overlapping stimuli, each activating 100 different neurons, were presented. The voltage u of the neurons is driven by the forward weights W via input rates x. The inhibitory recurrent weights G are restricted to non-negative values (Dale's law), and the other recurrent weights M have no sign restrictions. Neurons were spiking with an instantaneous Poisson firing rate, and each spike triggered an exponentially decaying postsynaptic voltage deflection. Neglecting time constants of the postsynaptic responses, the expected postsynaptic voltage reads (in vectorial form) as

      u = W x + (M - G) f (Eq. 5)

      where f ≈ phi(u) represents the instantaneous Poisson rate, and phi a sigmoidal nonlinearity. The rate f is only an approximation (symbolized by ≈) of phi(u) since an additional regularization variable h enters (taken up in Point 4 below). The initialisation of W and M is Gaussian with mean 0 and variance 1/sqrt(N), N the number of neurons in the network. The initial entries of G are all set to 1/sqrt(N).

      Predictive synaptic plasticity:

      The 3 types of synapses were each adapted so that they individually predict the postsynaptic firing rate f, in matrix form

      ΔW ≈ (f - phi( W x ) ) x^T
      ΔM ≈ (f - phi( M f ) ) f^T
      ΔG ≈ (f - phi( M f ) ) f^T but confined to non-negative values of G (Dale's law).

      The ^T tells us to take the transpose, and the ≈ again refers to the fact that the ϕ entering in the learning rule is not exactly the ϕ determining the rate, only up to the regularization (see Point 4).

      Main formal result:

      As the authors explain, the forward weight W and the unconstrained weight M develop such that, in expectations,

      f ≈ phi( W x ) ≈ phi( M f ) ≈ phi( G f ) ,

      consistent with the above plasticity rules. Some elements of M remain negative. In this final state, the network displays the behaviour as explained in the summary.

      Major issues:

      Point 1: Conceptual inconsistency

      The main results seem to arise from unilaterally applying Dale's law only to the inhibitory recurrent synapses G, but not to the excitatory recurrent synapses M.

      In fact, if the same non-negativity restriction were also imposed on M (as it is on G), then their learning rules would become identical, likely leading to M=G. But in this case, the network becomes purely forward, u = W x, and no spontaneous recall would arise. Of course, this should be checked in simulations.

      Because Dale's law was only applied to G, however, M and G cannot become equal, and the remaining differences seem to cause the effect.
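
      The argument of Point 1 can be illustrated with a scalar toy sketch. It is not the authors' model: the postsynaptic rate f is clamped to a fixed value, phi is assumed to be a logistic function with a hypothetical threshold theta, and each weight is assumed to predict f on its own (as in the steady-state condition f ≈ phi(M f) ≈ phi(G f)). Under these assumptions, two non-negative weights that follow the same rule converge to the same value, so the net recurrence M - G vanishes.

```python
import math

def phi(u, theta=2.0):
    """Logistic nonlinearity; the threshold theta is an assumed value."""
    return 1.0 / (1.0 + math.exp(-(u - theta)))

def learn_weight(w0, f, eta=0.5, steps=5000):
    """Scalar predictive rule dw = eta * (f - phi(w * f)) * f, clipped at 0."""
    w = w0
    for _ in range(steps):
        w += eta * (f - phi(w * f)) * f
        w = max(w, 0.0)  # non-negativity constraint (Dale's law)
    return w

f_target = 0.8                    # clamped postsynaptic rate (assumed)
m = learn_weight(0.5, f_target)   # stands in for M, small initial value
g = learn_weight(3.0, f_target)   # stands in for G, larger initial value
net_recurrence = m - g            # vanishes once both weights agree

print(m, g, net_recurrence)
```

      Because phi is monotonic, the condition phi(w f) = f has a single solution for w, which is why both weights end up at the same value regardless of their initial values in this toy setting. Whether the same happens in the full recurrent spiking network would have to be checked in simulations.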

      Predictive learning rules are certainly powerful, and it is reasonable to consider the same type of error-correcting predictive learning rule, for instance for different dendritic branches that both should predict the somatic activity. Or one may postulate the same type of error-correcting predictive plasticity for inhibitory and excitatory synapses, but then the presynaptic neurons should not be identical, as it is assumed here. Both these types of error-correcting and error-forming learning rules for same-branches and inhibitory/excitatory inputs have been considered already (but with inhibitory input being itself restricted to local input, for instance).

      Point 2: Main result as an artefact of an inconsistently applied Dale's law?

      The main result shows that the probability of a spontaneous recall for the 5 non-overlapping stimuli is proportional to the relative time the stimulus was presented. This is roughly explained as follows: each stimulus pushes the activity from 0 up towards f ≈ phi( W x ) by the learning rule (roughly). Because the mean weights W are initialized to 0, a stimulus that is presented longer will have more time to push W up so that positive firing rates are reached (assuming x is non-negative). The recurrent weights M learn to reproduce these firing rates too, while the plasticity in G tries to prevent that (by its negative sign, but with the restriction to non-negative values). Stimuli that are presented more often, on average, will have more time to reach the positive target and hence will form a stronger and wider attractor. In spontaneous recall, the size of the attractor reflects the time of the stimulus presentation. This mechanism so far is fine, but the only problem is that it is based on restricting G, but not M, to non-negative values.

      Point 3: Comparison of rates between stimulation and recall.

      The firing rates with external stimulations will be considerably larger than during replay (unless the rates are saturated).

      This is a prediction that should be tested in simulations. In fact, since the voltage roughly reads as

      u = W x + (M - G) f,

      and the learning rules are such that eventually M ≈ G, the recurrences roughly cancel and the voltage is mainly driven by the external input x. In the state of spontaneous activity without external drive, one has

      u = (M - G) f ,

      and this should generate considerably smaller instantaneous rates f ≈ phi(u) than in the case of the feedforward drive (unless f is in both cases at the upper or lower ceiling of phi). This is a prediction that can also be tested.

      Because the figures mostly show activity ratios or normalized activities, it was not possible for me to check this hypothesis with the current figures. So please show non-normalized activities for comparing stimulation and recall for the same patterns.
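
      The prediction of Point 3 can be made concrete with a scalar rate sketch. This is a toy under stated assumptions, not the authors' simulation: phi is taken to be a logistic function with a hypothetical threshold theta, the drive W x is collapsed into a single number, and the recurrent weights are set to M = G so that the recurrence cancels exactly.

```python
import math

def phi(u, theta=2.0):
    """Logistic nonlinearity; the threshold theta is an assumed value."""
    return 1.0 / (1.0 + math.exp(-(u - theta)))

def steady_rate(drive, m, g, iters=200):
    """Damped fixed-point iteration of f = phi(drive + (m - g) * f)."""
    f = 0.0
    for _ in range(iters):
        u = drive + (m - g) * f          # scalar version of u = W x + (M - G) f
        f = 0.5 * f + 0.5 * phi(u)       # damping keeps the iteration stable
    return f

stim_rate = steady_rate(drive=4.0, m=1.0, g=1.0)    # with external input
replay_rate = steady_rate(drive=0.0, m=1.0, g=1.0)  # spontaneous, no input

print(stim_rate, replay_rate)
```

      With M = G the rate during stimulation settles at phi(W x), while the spontaneous rate settles at phi(0), which is lower whenever the drive is positive and phi is not saturated. Plotting non-normalized rates for both conditions in the full model would show whether this holds there.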

      Point 4: Unclear definition of the variable h.

      The formal definition of h = h_i is given by (suppressing here the neuron index i and the h-index of tau)

      tau dh/dt = -h if h>u, (Eq. 10)
      h = u otherwise.

      But if it is only Equation 10 (nothing else is said), h will always become equal to u, or will vanish, i.e. either h=u or h=0 after some initial transient. In fact, as soon as h>u, h is decaying to 0 according to the first line. If u is >0, then it stops at u=h according to the second line. No reason to change h=u further. If u<=0 while h>u, then h is converging to 0 according to the first line and will stay there. I guess the authors had issues with the recurrent spiking simulations and tried to fix this with some regularization. However, as presented, it does not become clear how their regularization works.
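
      The reading of Equation 10 given above can be checked with a short Euler integration. This is a sketch of the equation as literally stated, for a constant voltage u only; the time constant, step size, and initial values are assumptions, and a time-varying u as in the full model may behave differently.

```python
def simulate_h(u, h0, tau=10.0, dt=0.1, steps=5000):
    """Euler integration of Eq. 10 as stated, with the voltage u held constant."""
    h = h0
    for _ in range(steps):
        if h > u:
            h += dt * (-h / tau)  # first line: tau dh/dt = -h while h > u
        else:
            h = u                 # second line: h = u otherwise
    return h

h_positive_u = simulate_h(u=0.5, h0=2.0)    # decays until it reaches u, then stays
h_negative_u = simulate_h(u=-1.0, h0=2.0)   # keeps decaying towards 0
h_below_u = simulate_h(u=1.0, h0=0.2)       # set to u immediately

print(h_positive_u, h_negative_u, h_below_u)
```

      For constant u the variable h indeed ends either at u or near 0 in this sketch, which is the behaviour described in Point 4. Any regularizing effect would therefore have to come from the time course of u, which the authors should spell out.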

      BTW: In Eq. 11 the authors set the gain beta to beta = beta0/h which could become infinite and, putatively more problematic, negative, depending on the value of h. Maybe some remark would convince a reader that no issues emerge from this.

      Added from discussions with the editor and the other reviewers:

      Thanks for alerting me to this Supplementary Figure 8. Yes, it looks like the authors did apply there Dale's law for both the excitatory and inhibitory synapses. Yet, they also introduced two types of inhibitory pathways converging both to the excitatory and inhibitory neurons. For me, this is a confirmation that applying Dale's law to both excitatory and inhibitory synapses, with identical learning rules as explained in the main part of the paper, does not work.

      Adding two such pathways is a strong change from the original model as introduced before, on which all the figures in the main text are based. Supplementary Figure 8 should come with an analysis of why a single inhibitory pathway does not work. I guess I gave the reason in my Points 1-3. Some form of symmetry breaking between the recurrent excitation and recurrent inhibition is required so that, eventually, the recurrent excitatory connection will dominate.

      Making the inhibitory plasticity less expressive by applying Dale's law to only those inhibitory synapses seems to be the answer chosen in the figures of the main text (but then the criticism of unilaterally applying Dale's law holds).

      Applying Dale's law to both types of synapses, but dividing the labor of inhibition into two strictly separate and asymmetric pathways, and hence asymmetric development of excitatory and inhibitory weights, seems to be another option. However, introducing such two separate inhibitory pathways, just to rescue the fact that Dale's law is applied to both types of synapses, is a bold assumption. Is there some biological evidence of such two pathways in the inhibitory, but not the excitatory connections? And what is the computational reasoning to have such a separation, apart from some form of symmetry breaking between excitation and inhibition? I guess, simpler solutions could be found, for instance by breaking the symmetry between the plasticity rules for the excitatory and inhibitory neurons. All these questions, in my view, need to be addressed to give some insights into why the simulations do work.

      Overall, Supplementary Figure 8 seems to me too important to be deferred to the Supplement. The reasoning behind the two inhibitory pathways should appear more prominently in the main text. Without this, important questions remain. For instance, when thinking in a rate-based framework, the two inhibitory pathways twice try to explain the somatic firing rate away. Doesn't this lead to too strong an inhibition? Can some steady state with a positive firing rate caused by the recurrence, in the absence of an external drive, be proven? The argument must include the separation into Path 1 and Path 2. So far, this reasoning has not been provided.

      In fact, it might be that, in a spiking implementation, some sparse spikes will survive. I wonder whether at least some of these spikes survive because of the other rescuing construction with the dynamic variable h (Equation 10, which is not transparent, and that is not taken up in the reasoning either, see my Point 4).

      Perhaps it is helpful for the authors to add this text in the reply to them.

    4. Reviewer #3 (Public Review):

      Summary:

      The work shows how learned assembly structure and its influence on replay during spontaneous activity can reflect the statistics of stimulus input. In particular, stimuli that are more frequent during training elicit stronger wiring and more frequent activation during replay. Past works (Litwin-Kumar and Doiron, 2014; Zenke et al., 2015) have not addressed this specific question, as classic homeostatic mechanisms forced activity to be similar across all assemblies. Here, the authors use a dynamic gain and threshold mechanism to circumvent this issue and link this mechanism to cellular monitoring of membrane potential history.

      Strengths:

      (1) This is an interesting advance, and the authors link this to experimental work in sensory learning in environments with non-uniform stimulus probabilities.

      (2) The authors consider their mechanism in a variety of models of increasing complexity (simple stimuli, complex stimuli; ignoring Dale's law, incorporating Dale's law).

      (3) Links a cellular mechanism of internal gain control (their variable h) to assembly formation and the non-uniformity of spontaneous replay activity. Offers a promise of relating cellular and synaptic plasticity mechanisms under a common goal of assembly formation.

      Weaknesses:

      (1) However, while the manuscript does show that assembly wiring does follow stimulus likelihood, it is not clear how the assembly-specific statistics of h reflect these likelihoods. I find this to be a key issue.

      (2) The authors' model does take advantage of the sigmoidal transfer function, and after learning an assembly is either fully active or nearly fully silent (Figure 2a). This somewhat artificial saturation may be the reason that classic homeostasis is not required since runaway activity is not as damaging to network activity.

      (3) Classic mechanisms of homeostatic regulation (synaptic scaling, inhibitory plasticity) try to ensure that firing rates match a target rate (on average). If the target rate is the same for all neurons then having elevated firing rates for one assembly compared to others during spontaneous activity would be difficult. If these homeostatic mechanisms were incorporated, how would they permit the elevated firing rates for assemblies that represent more likely stimuli?

    1. eLife assessment

      This study presents a valuable methodological advancement in quantifying thoughts over time. A novel multi-dimensional experience-sampling approach is used to identify data-driven patterns that the authors use to interrogate fMRI data collected during naturalistic movie-watching. The experimentation is inventive and the analyses carried out are convincing, although the conceptualization of thoughts remains too vague to allow for a clear interpretation of results.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors used a novel multi-dimensional experience sampling (mDES) approach to identify data-driven patterns of experience samples that they use to interrogate fMRI data collected during naturalistic movie-watching. They identify a set of multi-sensory features of a set of movies that delineate low-dimensional gradients of BOLD fMRI signal patterns that have previously been linked to fundamental axes of cortical organization.

      Strengths:

      The novel solution to challenges associated with experience sampling offers potential access to aspects of experience that have been challenging to assess. While inventive, I worry that the reliability of the mDES approach is currently under-investigated, making it challenging to interpret the import of the later analyses, which are themselves strong and compelling.

      Weaknesses:

      The lack of direct interrogation of individual differences/reliability of the mDES scores warrants some pause.

    3. Reviewer #2 (Public Review):

      Summary:

      The present study explores how thoughts map onto brain activity, a notoriously challenging question because of the dynamic, subjective, and abstract nature of thoughts. To tackle this question, the authors collected continuous thought ratings from participants watching a movie, and additionally made use of an open-source fMRI dataset recorded during movie watching as well as five established gradients of brain variation as identified in resting state data. Using a voxel-space approach, the results show that episodic knowledge, verbal detail, and sensory engagement of thoughts commonly modulate the activation of the visual and auditory cortex, while intrusive distraction modulates the frontoparietal network. Additionally, sensory engagement is mapped onto a gradient from the primary to the association cortex, while episodic knowledge is mapped onto a gradient from the dorsal attention network to the visual cortex. Building on the association between behavioral performance and neural activation, the authors conclude that sensory coupling to external input and frontoparietal executive control is key to comprehension in naturalistic settings.

      The manuscript stands out for its methodological advancements in quantifying thoughts over time and its aim to study the implementation of thoughts in the brain during naturalistic movie watching. However, the conceptualization of thoughts remains vague, its distinction from other concepts like attention is unclear, and interindividual differences are not sufficiently addressed, limiting the study's insights into brain function.

      Strengths:

      (1) The study raises a question that has been difficult to study in naturalistic settings so far but is key to understanding human cognition, namely how thoughts map onto brain activation.

      (2) The thought ratings introduce a novel method for continuously tracking thoughts, promising utility beyond this study.

      (3) The authors substantiated the effects of thinking from multiple perspectives, using diverse data types, metrics, and analyses.

      (4) The figures are highly informative, accessible, and consistent, aiding comprehension.

      Weaknesses:

      (1) The dimensions of thought seem to distinguish between sensory and executive processing states. However, it is unclear if this effect primarily pertains to thinking. I could imagine highly intrusive distractions in movie segments to correlate with stagnating plot development, little change in scenery, or incomprehensible events. Put differently, it may primarily be the properties of the movies that evoke different processing modes, but these properties are not accounted for. For example, I'm wondering whether a simple measure of engagement with stimulus materials could explain the effects just as much. How can the effects of thinking be distinguished from the perceptual and semantic properties of the movie, as well as attentional effects? Is the measure used here capturing thought processes beyond what other factors could explain?

      (2) I'm skeptical about taking human thought ratings at face value. Intrusive distraction might imply disengagement from stimulus materials, but it could also be an intended effect of the movie to trigger higher-level, abstract thinking. Can a label like intrusive distraction be misleading without considering the actual thought and movie content?

      (3) A jittered sampling approach is used to acquire thought ratings every 15 seconds. Are ratings for the same time point averaged across participants? If so, how consistent are ratings among participants? High consistency would suggest thoughts are mainly stimulus-evoked. Low consistency would question the validity of applying ratings from one (group of) participant(s) to brain-related analyses of another participant.

      (4) Using three different movies to conclude that different genres evoke different thought patterns (e.g., line 277) seems like an overinterpretation with only one instance per genre.

      (5) I see no indication that results were cross-validated, and no effect sizes are reported, leaving the robustness and strength of effects unknown.

    4. Reviewer #3 (Public Review):

      This study attempted to investigate the relationship between processing in the human brain during movie watching and corresponding thought processes. This is a highly interesting question, as movie watching presents a semi-constrained task, combining naturally occurring thoughts and common processing of sensory inputs across participants. This task is inherently difficult because in order to know what participants are thinking at any given moment, one has to interrupt the same thought process which is the object of study.

      This study attempts to deal with this issue by aggregating staggered experience sampling data across participants in one behavioral study and using the population-level thought patterns to model brain activity in different participants in an open-access fMRI dataset.

      The behavioral data consist of 120 participants who watched 3 11-minute movie clips. Participants responded to the mDES questionnaire: 16 visual scales characterizing ongoing thought 5 times, two minutes apart, in each clip. The 16 items are first reduced to 4 factors using PCA, and their levels are compared across the different movies. The factors are "episodic knowledge", "intrusive distraction", "verbal detail", and "sensory engagement". The factors differ between the clips, and distraction is negatively correlated with movie comprehension, and sensory engagement is positively correlated with comprehension.

      The components are aggregated across participants (transforming single-subject mDES answers into PCA space and concatenating responses of different participants), and are used as regressors in a GLM analysis. This analysis identifies brain regions corresponding to the components. The resulting brain maps reveal activations that are consistent with the proposed mental processes (e.g. negative loading for intrusion in the frontoparietal network, and positive loadings for visual and auditory cortices for sensory engagement).

      Then, the coordinates for brain regions that were significant for more than one component are entered into a paper search in neurosynth. It is not clear what this analysis demonstrates beyond the fact that sensory engagement contains both visual and auditory components.

      The next analysis projected group-averaged brain activation onto gradients (based on previous work) and used gradient timecourses to predict the behavioral report timecourses. This revealed that high activations in gradient 1 (sensory→association) predicted high sensory engagement, and that "episodic knowledge" thought patterns were predicted by increased visual cortex activations. Then, permutation tests were performed to see whether these thought pattern-related activations corresponded to well-defined regions on a given cluster.

      This paper is framed as presenting a new paradigm but it does little to discuss what this paradigm serves, what its limitations are, and how it should have been tested. I assume that the novelty is in using experience sampling from 1 sample to model the responses of a second sample.

      What are the considerations for treating high-order thought patterns that occur during film viewing as stable enough to be used across participants? What would be the limitations of this method? (Do all people reading this paper think comparable thoughts reading through the sections?)

      How does this approach differ from collaborative filtering, (for example as presented in Chang et al., 2021)?

      In conclusion, this study tackles a highly interesting subject and does it creatively and expertly. It fails to discuss and establish the utility and appropriateness of its proposed method.

      Luke J. Chang et al., Endogenous variation in ventromedial prefrontal cortex state dynamics during naturalistic viewing reflects affective experience. Sci. Adv. 7, eabf7129 (2021). DOI: 10.1126/sciadv.abf7129

    1. eLife assessment

      This important study provides new insights into the mechanisms that underlie perceptual and attentional impairments of conscious access. The paper presents convincing evidence of a dissociation between the early stages of low-level perception, which are impermeable to perceptual or attentional impairments, and subsequent stages of visual integration which are susceptible to perceptual impairment but resilient to attentional manipulations. This study will be of interest to scientists working on visual perception and consciousness.

    2. Reviewer #1 (Public Review):

      Summary:

      In this work, Noorman and colleagues test the predictions of the "four-stage model" of consciousness by combining psychophysics and scalp EEG in humans. The study relies on an elegant experimental design to investigate the respective impact of attentional and perceptual blindness on visual processing.

      The study is very well summarised, the text is clear and the methods seem sound. Overall, a very solid piece of work. I haven't identified any major weaknesses. Below I raise a few questions of interpretation that may possibly be the subject of a revision of the text.

      (1) The perceptual performance on Fig1D appears to show huge variation across participants, with some participants at chance levels and others with performance > 90% in the attentional blink and/or masked conditions. This seems to reveal that the procedure to match performance across participants was not very successful. Could this impact the results? The authors highlight the fact that they did not resort to post-selection or exclusion of participants, but at the same time do not discuss this equally important point.

      (2) In the analysis on collinearity and illusion-specific processing, the authors conclude that the absence of a significant effect of training set demonstrates collinearity-only processing. I don't think that this conclusion is warranted: as the illusory and non-illusory stimuli share the same shape, more elaborate object processing could also be occurring. Please discuss.

      (3) Discussion, lines 426-429: It is stated that the results align with the notion that processes of perceptual segmentation and organization represent the mechanism of conscious experience. My interpretation of the results is that they show the contrary: for the same visibility level in the attentional blink or masking conditions, these processes can be implicated or not, which suggests a role during unconscious processing instead.

      (4) The two paradigms developed here could be used jointly to highlight non-idiosyncratic NCCs, i.e. EEG markers of visibility or confidence that generalise regardless of the method used. Have the authors attempted to train the classifier on one method and apply it to another (e.g. AB to masking and vice versa)? What perceptual level is assumed to transfer?

      (5) How can the results be integrated with the attentional literature showing that attentional filters can be applied early in the processing hierarchy?

    3. Reviewer #2 (Public Review):

      Summary:

      This is a very elegant and important EEG study that unifies, within a single set of behaviorally equated experimental conditions, conscious access (and therefore also conscious access failures) during visual masking and attentional blink (AB) paradigms in humans. By a systematic and clever use of multivariate pattern classifiers across conditions, the authors could dissect, confirm, and extend a key distinction (initially framed within the GNWT framework) between 'subliminal' and 'pre-conscious' unconscious levels of processing. In particular, the authors could provide strong evidence to distinguish here within the same paradigm these two levels of unconscious processing that precede conscious access: (i) an early (< 80 ms) bottom-up and local (in brain) stage of perceptual processing ('local contrast processing') that was preserved in both unconscious conditions, (ii) a later stage and more integrated processing (200-250 ms) that was impaired by masking but preserved during AB. On the basis of preexisting studies and theoretical arguments, they suggest that this later stage could correspond to lateral and local recurrent feedback processes. Then, the late conscious access stage appeared as a P3b-like event.

      Strengths:

      The methodology and analyses are strong and valid. This work adds an important piece in the current scientific debate about levels of unconscious processing and specificities of conscious access in relation to feed-forward, lateral, and late brain-scale top-down recurrent processing.

      Weaknesses:

      - The authors could improve the clarity of the rich set of decoding analyses across conditions.
      - They could also enrich their Introduction and Discussion sections by taking into account the importance of conscious influences on some unconscious cognitive processes (revision of the traditional concept of 'automaticity'), which may introduce some complexity in the interpretation of the results.
      - They should discuss the rich literature reporting high-level unconscious processing in masking paradigms (culminating in semantic processing of digits, words or even small groups of words, and pictures) in the light of their proposal (deeper unconscious processing during AB than during masking).

    4. Reviewer #3 (Public Review):

      Summary:

      This work aims to investigate how perceptual and attentional processes affect conscious access in humans. By using multivariate decoding analysis of electroencephalography (EEG) data, the authors explored the neural temporal dynamics of visual processing across different levels of complexity (local contrast, collinearity, and illusory perception). This is achieved by comparing the decodability of an illusory percept in matched conditions of perceptual (i.e., degrading the strength of sensory input using visual masking) and attentional impairment (i.e., impairing top-down attention using attentional blink, AB). The decoding results reveal three distinct temporal responses associated with the three levels of visual processing. Interestingly, the early stage of local contrast processing remains unaffected by both masking and AB. However, the later stages of collinearity and illusory percept processing are impaired by the perceptual manipulation but remain unaffected by the attentional manipulation. These findings contribute to the understanding of the unique neural dynamics of perceptual and attentional functions and how they interact with the different stages of conscious access.

      Strengths:

      The study investigates perceptual and attentional impairments across multiple levels of visual processing in a single experiment. Local contrast, collinearity, and illusory perception were manipulated using different configurations of the same visual stimuli. This clever design allows for the investigation of different levels of visual processing under similar low-level conditions.

      Moreover, behavioural performance was matched between perceptual and attentional manipulations. One of the main problems when comparing perceptual and attentional manipulations on conscious access is that they tend to impact performance at different levels, with perceptual manipulations like masking producing larger effects. The study utilizes a staircasing procedure to find the optimal contrast of the mask stimuli to produce a performance impairment to the illusory perception comparable to the attentional condition, both in terms of perceptual performance (i.e., indicating whether the target contained the Kanizsa illusion) and metacognition (i.e., confidence in the response).

      The results show a clear dissociation between the three levels of visual processing in terms of temporal dynamics. Local contrast was represented at an early stage (~80 ms), while collinearity and illusory perception were associated with later stages (~200-250 ms). Furthermore, the results provide clear evidence in support of a dissociation between the effects of perceptual and attentional processes on conscious access: while the former affected both neuronal correlates of collinearity and illusory perception, the latter did not have any effect on the processing of the more complex visual features involved in the illusion perception.

      Weaknesses:

      The design of the study and the results presented are very similar to those in Fahrenfort et al. (2017), reducing its novelty. Similar to the current study, Fahrenfort et al. (2017) tested the idea that if both masking and AB impact perceptual integration, they should affect the neural markers of perceptual integration in a similar way. They found that behavioural performance (hit/false alarm rate) was affected by both masking and AB, even though only the latter was significant in the unmasked condition. An early classification peak was instead only affected by masking. However, a late classification peak showed a pattern similar to the behavioural results, with classification affected by both masking and AB.

      The interpretation of the results mainly centres on the theoretical framework of the recurrent processing theory of consciousness (Lamme, 2020), which leads to the assumption that local contrast, collinearity, and the illusory perception reflect feedforward, local recurrent, and global recurrent connections, respectively. It should be mentioned, however, that this theoretical prediction is not directly tested in the study. Moreover, the evidence for the dissociation between illusion and collinearity in terms of lateral and feedback connections seems at least limited. For instance, Kok et al. (2016) found that, whereas bottom-up stimulation activated all cortical layers, feedback activity induced by illusory figures led to a selective activation of the deep layers. Lee & Nguyen (2001), instead, found that V1 neurons respond to illusory contours of the Kanizsa figures, particularly in the superficial layers. They all mention feedback connections, but none seem to point to lateral connections.

      Moreover, the evidence in favour of primarily lateral connections driving collinearity seems mixed as well. On one hand, Liang et al. (2017) showed that feedback and lateral connections closely interact to mediate image grouping and segmentation. On the other hand, Stettler et al. (2002) showed that, whereas the intrinsic connections link similarly oriented domains in V1, V2 to V1 feedback displays no such specificity. Furthermore, the other studies mentioned in the manuscript did not investigate feedback connections but only lateral ones, making it difficult to draw any clear conclusions.

    1. eLife assessment

      This fundamental state-of-the-art modeling study explores neural mechanisms underlying walking control in cats, demonstrating the probability of three different states of operation of the spinal cord circuits generating locomotion at different speeds. The biophysical modeling sufficiently reproduces and provides explanations for experimental data on how the locomotor cycle and phase durations depend on treadmill walking speed. It also points to new principles of functional architecture and operating regimes underlying how spinal circuits interact with supraspinal signals and limb sensory feedback signals to produce different locomotor behaviors at different speeds, which are major unresolved problems in the field. The modeling evidence is compelling, especially in advancing our understanding of locomotion control mechanisms, and will interest neuroscientists studying the neural control of movement.

    2. Reviewer #1 (Public Review):

      Summary:

      It is suggested that for each limb the RG (rhythm generator) can operate in three different regimes: a non-oscillating state-machine regime, a flexor-driven regime, and a classical half-center oscillatory regime. This means that the field can move away from the old concept that there is only room for the classic half-center organization.

      Strengths:

      A major benefit of the present paper is that a bridge was made between various CPG concepts ("a potential contradiction between the classical half-center and flexor-driven concepts of spinal RG operation"). Another important step forward is the proposal about the neural control of slow gait ("at slow speeds (≤ 0.35 m/s), the spinal network operates in a state regime and requires external inputs for phase transitions, which can come from limb sensory feedback and/or volitional inputs (e.g. from the motor cortex)").

      Weaknesses:

      Some references are missing.

    3. Reviewer #2 (Public Review):

      Summary:

      The biologically realistic model of the locomotor circuits developed by this group continues to define the state of the art for understanding spinal genesis of locomotion. Here the authors have achieved a new level of analysis of this model to generate surprising and potentially transformative new insights. They show that these circuits can operate in three very distinct states and that, in the intact cord, these states come into successive operation as the speed of locomotion increases. Equally important, they show that in spinal injury the model is "stuck" in the low speed "state machine" behavior.

      Strengths:

      There are many strengths to the simulation results presented here. The model itself has been closely tuned to match a huge range of experimental data and thus has a high degree of plausibility. The novel insight presented here, with the three different states, constitutes a truly major advance in the understanding of neural genesis of locomotion in spinal circuits. The authors systematically consider how the states of the model relate to presently available data from animal studies. Equally important, they provide a number of intriguing and testable predictions. It is likely that these insights are the most important achieved in the past 10 years. It is highly likely that the proposed multi-state behavior will have a transformative effect on this field.

      Weaknesses:

      I have no major weaknesses. A moderate concern is that the authors should consider some basic sensitivity analyses to determine if the 3 state behavior is especially sensitive to any of the major circuit parameters - e.g. connection strengths in the oscillators or?

    4. Reviewer #3 (Public Review):

      Summary:

      This work probes the control of walking in cats at different speeds and different states (split-belt and regular treadmill walking). Since the time of Sherrington there has been ongoing debate on this issue. The authors provide modeling data showing that they could reproduce data from cats walking on a specialized treadmill allowing for regular and split-belt walking. The data suggest that a non-oscillating state-machine regime best explains slow walking - where phase transitions are handled by external inputs into the spinal network. They then show at higher speeds a flexor-driven and then a classical half-center regime dominates. In spinal animals, it appears that a non-oscillating state-machine regime best explains the experimental data. The model is adapted from their previous work, and raises interesting questions regarding the operation of spinal networks, that, at low speeds, challenge assumptions regarding central pattern generator function. This is an interesting study. I have a few issues with the general validity of the treadmill data at low speeds, which I suspect can be clarified by the authors.

      Strengths:

      The study has several strengths. Firstly the detailed model has been well established by the authors and provides details that relate to experimental data such as commissural interneurons (V0c and V0d), along with V3 and V2a interneuron data. Sensory input along with descending drive is also modelled and moreover the model reproduces many experimental data findings. Moreover, the idea that sensory feedback is more crucial at lower speeds, also is confirmed by presynaptic inhibition increasing with descending drive. The inclusion of experimental data from split-belt treadmills, and the ability of the model to reproduce findings here is a definite plus.

      Weaknesses:

      Conceptually, this is a very useful study which provides interesting modeling data regarding the idea that the network can operate in different regimes, especially at lower speeds. The modelling data speaks for itself, but on the other hand, sensory feedback also provides generalized excitation of neurons which in turn project to the CPG. That is, they are not considered part of the CPG proper. In these scenarios, it is possible that an appropriate excitatory drive could be provided to the network itself to move it beyond the state-machine state and into an oscillatory state. Did the authors consider that possibility? This is important since work using L-DOPA, for example, in cats or pharmacological activation of isolated spinal cord circuits, shows the CPG capable of producing locomotion without sensory or descending input.

    1. eLife assessment

      This study is an important advancement towards the understanding of animal nervous system organization and evolution by providing a compelling description of the entire connectome of the 3-day larva of the marine annelid Platynereis dumerilii. It provides a wealth of data on cell type diversity and the modules that interconnect them. Its strength in the massive amount of high-quality data is also partly a weakness as it can make it difficult to read and scientifically digest. This work lays the foundations for studies on cell type diversity, segmental vs. intersegmental connectivity, and mushroom bodies, but will certainly also be of use to scientists interested in other nervous system parts, their functions, and evolution.

    2. Reviewer #1 (Public Review):

      Summary:

      This paper provides a resource for researchers studying the marine annelid Platynereis dumerilii. It is only the third whole-body connectome to be assembled and thus provides a comparison with those less complex animals: the nematode Caenorhabditis elegans and the tunicate Ciona intestinalis. The paper catalogs all cells in the body, not just neurons, and details how sensory neurons, interneurons, motor neurons, and effector organs are connected. From this, the authors are able to extract information about the organization of different aspects of the nervous system. These include the extent of recurrent connectivity, unimodal and multimodal sensory processing, and long-range and short-range connectivity.

      Several interesting conclusions are drawn, including the concept that circuit evolution might have proceeded by duplication and diversion of cell types, much as it has been posited that gene evolution has occurred. It also informs the understanding of the evolution of segmental body plans in annelids by mapping and comparing cells in each segment.

      Strengths:

      This paper contains a wealth of data. The raw dataset is available. The codes and scripts are provided to allow interested readers to utilize this dataset.

      The analysis is painstakingly meticulous. The diagrams are organized to orient the reader to the complexities of this overwhelming analysis.

      Weaknesses:

      The strength of the paper is also its weakness. It contains so much data and analysis that it is burdensome to read and understand. There are 16 multi-panel data figures in the main text, another 38 supplemental figures, and 5 videos.

      The impact of the paper is diminished by its size and depth. The paper could be broken up into smaller thematic papers that would be more accessible to researchers interested in particular topics. For example, there could be a single paper on the mushroom body and another paper on the segmental organization.

    3. Reviewer #2 (Public Review):

      Summary:

      The stated ambition of the authors in this manuscript is to thoroughly analyze the complete neural connectome of the three-day larva of the marine annelid Platynereis. This manuscript follows several previous publications by the same group on the same volume of serial EM data, addressing several specialized functional circuits, and supersedes a previous preprint published in 2020. To this end, the authors have annotated the whole cell complement of the larva, including non-neural cells, with the collaborative tool CATMAID, traced the whole neurite extensions of neural cells, and annotated all synapses. The connectome has been algorithmically analyzed to extract the principal modules, adding several new, so far unexplored neural circuits to the list.

      Strengths:

      This remarkable study adds a third species to the list of animals in which the full connectome and functional modules have been analyzed, alongside C. elegans and Ciona intestinalis. It represents a leap in phylogeny, with Platynereis being a representative of the lophotrochozoans. Also, Platynereis has considerably more neurons than the latter species. The study provides a complete picture of the set of neural modules that are necessary for the survival of an autonomous marine larva with an active lifestyle.

      The analysis is particularly impressive for revealing the complete innervation of the entire set of effector cells in the Platynereis larva, including muscle fibers, glands, pigment cells, ciliated cells, and helping understand the overall control of the organism's behavior through multiple sensory pathway integrations. It also reveals layers of neuronal intercalation in sensory-effector pathways that allow further integration even in a larva with limited behavioral complexity. The structure of the developing mushroom bodies, proposed ancestral bilaterian brain sensory integrative units, is detailed, as well as a complex mechanosensory module specific to a swimming larva.

      A key new aspect of this connectome study is the thorough analysis of segmental cell types and intersegmental connectivity. Metameric organization is widespread in bilaterians and is nowhere clearer than in annelids. This metameric organization is even proposed by some authors to be an ancestral trait of bilaterians. Here, the authors show that homologous cell types and connectivity are shared not only by all segments of the animal but also by its non-segmental terminal parts (anterior prostomium and posterior pygidium). They suggest, in turn, that the entire body of the annelid may be formed of ancestral metameric units, an idea proposed before but here strongly supported by a list of homologous cell types. This is the most thorough evidence obtained so far for this provocative and stimulating evolutionary hypothesis.

    1. eLife assessment

      There is a growing interest in understanding the individuality of animal behaviours. In this article, the authors build and use an impressive array of high throughput phenotyping paradigms to examine the 'stability' (consistency) of behavioural characteristics in a range of contexts and over time. They find that certain behaviours are individualistic and persist robustly across external stimuli while others are less robust to these changing parameters. The data are solid and, with more appropriate statistical methods adopted, the findings have valuable implications for the study of individual variability.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors state the study's goal clearly: "The goal of our study was to understand to what extent animal individuality is influenced by situational changes in the environment, i.e., how much of an animal's individuality remains after one or more environmental features change." They use visually guided behavioral features to examine the extent of correlation over time and in a variety of contexts. They develop new behavioral instrumentation and software to measure behavior in Buridan's paradigm (and variations thereof), the Y-maze, and a flight simulator. Using these assays, they examine the correlations between conditions for a panel of locomotion parameters. They propose that inter-assay correlations will determine the persistence of locomotion individuality.

      Strengths:

      The OED defines individuality as "the sum of the attributes which distinguish a person or thing from others of the same kind," a definition mirrored by other dictionaries and the scientific literature on the topic. The concept of behavioral individuality can be characterized as:
      (1) a large set of behavioral attributes,
      (2) with inter-individual variability, that are
      (3) stable over time.

      A previous study examined walking parameters in Buridan's paradigm, finding that several parameters were variable between individuals, and that these showed stability over separate days and up to 4 weeks (DOI: 10.1126/science.aaw718). The present study replicates some of those findings and extends the experiments from temporal stability to examining the correlation of locomotion features between different contexts.

      The major strength of the study is using a range of different behavioral assays to examine the correlations of several different behavior parameters. It shows clearly that the inter-individual variability of some parameters is at least partially preserved between some contexts, and not preserved between others. The development of high-throughput behavior assays and sharing the information on how to make the assays is a commendable contribution.

      Weaknesses:

      The definition of individuality considers a comprehensive or large set of attributes, but the authors consider only a handful. In Supplemental Fig. S8, the authors show a large correlation matrix of many behavioral parameters, but these are illegible and are only mentioned briefly in Results. Why were five or so parameters selected from the full set? How were these selected? Do the correlation trends hold true across all parameters? For assays in which only a subset of parameters can be directly compared, were all of these included in the analysis, or only a subset?

      The correlation analysis is used to establish stability between assays. For temporal re-testing, "stability" is certainly the appropriate word, but between contexts, it implies that there could be 'instability'. Rather, instead of the 'instability' of a single brain process, a different behavior in a different context could arise from engaging largely (or entirely?) distinct context-dependent internal processes, and have nothing to do with process stability per se. For inter-context similarities, perhaps a better word would be "consistency".

      The parameters are considered one by one, not in aggregate. This focuses on the stability/consistency of the variability of a single parameter at a time, rather than holistic individuality. It would appear that an appropriate measure of individuality stability (or individuality consistency) that accounts for the high-dimensional nature of individuality would somehow summarize correlations across all parameters. Why was a multivariate approach (e.g. multiple regression/correlation) not used? Treating the data with a multivariate or averaged approach would allow the authors to directly address 'individuality stability', along with the analyses of single-parameter variability stability.

      The correlation coefficients are sometimes quite low, though highly significant, and are deemed to indicate stability. For example, in Figure 4C top left, the % of time walked at 23°C and 32°C are correlated by 0.263, which corresponds to an R² of 0.069, i.e. just 7% of the 32°C variance is predictable by the 23°C variance. Is it fair to say that a 7% determination indicates parameter stability? Another example: "Vector strength was the most correlated attention parameter... correlations ranged... to -0.197," which implies that 96% (1 - R²) of Y-maze variance is not predicted by Buridan variance. At what level does an r value not represent stability?
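
      The arithmetic behind these two figures can be checked with a short sketch. It assumes only the two r values quoted above; the function name is illustrative and is not taken from the manuscript or its software.

```python
def variance_explained(r):
    """Coefficient of determination for a simple correlation: R^2 = r * r."""
    return r * r


# The two correlation coefficients quoted in the review.
for r in (0.263, -0.197):
    r2 = variance_explained(r)
    print(f"r = {r:+.3f}  R^2 = {r2:.3f}  unexplained = {1 - r2:.1%}")
```

      Squaring removes the sign, so a correlation of -0.197 leaves as much variance unexplained as one of +0.197.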

      The authors describe a dissociation between inter-group differences and inter-individual variation stability, i.e. sometimes large mean differences between contexts, but significant correlation between individual test and retest data. Given that correlation is sensitive to slope, this might be expected to underestimate the variability stability (or consistency). Is there a way to adjust for the group differences before examining the correlation? For example, would it be possible to transform the values to in-group ranks prior to correlation analysis?
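
      The rank transformation asked about here might look like the following minimal sketch. It ranks each condition separately and then correlates the ranks, which is equivalent to Spearman's rank correlation. The function names and the example values are hypothetical and are not from the manuscript.

```python
def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def rank_correlation(cond_a, cond_b):
    """Correlate within-condition ranks, discarding group mean and scale."""
    return pearson(ranks(cond_a), ranks(cond_b))


# Hypothetical per-fly values in two conditions (same fly order in both lists).
cool = [1.0, 2.0, 3.0, 4.0]
warm = [10.0, 20.0, 35.0, 90.0]
print(rank_correlation(cool, warm))
```

      Because only the ordering of individuals within each condition enters the calculation, a shift or rescaling of the group as a whole between contexts does not change the result.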

      What is gained by classifying the five parameters into exploration, attention, and anxiety? To what extent have these classifications been validated, both in general and with regard to these specific parameters? Is the increased walking speed at higher temperatures necessarily due to an increased 'explorative' nature, or could it be attributed to increased metabolism, dehydration stress, or a heat-pain response? To what extent are these categories subjective?

      The legends are quite brief and do not link to descriptions of specific experiments. For example, Figure 4a depicts a graphical overview of the procedure, but I could not find a detailed description of this experiment's protocol.

      Using the current single-correlation analysis approach, the aims would benefit from re-wording to appropriately address single-parameter variability stability/consistency (as distinct from holistic individuality). Alternatively, the analysis could be adjusted to address the multivariate nature of individuality, so that the claims and the analysis are in concordance with each other.

      The study presents a bounty of new technology to study visually guided behaviors. The GitHub link to the software was not available. To verify the successful transfer of open hardware and open-software, a report would demonstrate transfer by collaboration with one or more other laboratories, which the present manuscript does not appear to do. Nevertheless, making the technology available to readers is commendable.

      The study discusses a number of interesting, stimulating ideas about inter-individual variability, and presents intriguing data that speaks to those ideas, albeit with the issues outlined above.

      While the current work does not present any mechanistic analysis of inter-individual variability, the implementation of high-throughput assays sets up the field to more systematically investigate fly visual behaviors, their variability, and their underlying mechanisms.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors repeatedly measured the behavior of individual flies across several environmental situations in custom-made behavioral phenotyping rigs.

      Strengths:

      The study uses several different behavioral phenotyping devices to quantify individual behavior in a number of different situations and over time. It seems to be a very impressive amount of data. The authors also make all their behavioral phenotyping rig design and tracking software available, which I think is great and I'm sure other folks will be interested in using and adapting it to their own needs.

      Weaknesses/Limitations:

      I think an important limitation is that while the authors measured the flies under different environmental scenarios (i.e. with different lighting and temperature) they didn't really alter the "context" of the environment. At least within behavioral ecology, context would refer to the potential functionality of the expressed behaviors so for example, an anti-predator context, a mating context, or foraging. Here, the authors seem to really just be measuring aspects of locomotion under benign (relatively low-risk perception) contexts. This is not a flaw of the study, but rather a limitation to how strongly the authors can really say that this demonstrates that individuality is generalized across many different contexts. It's quite possible that rank order of locomotor (or other) behaviors may shift when the flies are in a mating or risky context.

      The analytical framework in terms of statistical methods is lacking. It appears as though the authors used correlations across time/situations to estimate individual variation; however, far more sophisticated and elegant methods exist. The paper would be a lot stronger, and my guess is, much more streamlined if the authors employed hierarchical mixed models to analyse these data; these models could capture and estimate differences in individual behavior across time and situations simultaneously. Along with this, it's currently unclear whether and how any statistical inference was performed. Right now, it appears as though any results describing how individuality changes across situations are largely descriptive (i.e. a visual comparison of the strengths of the correlation coefficients?).
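
      As a rough illustration of what a variance-partitioning approach estimates, the sketch below computes repeatability (the intraclass correlation, ICC(1)) from a balanced one-way ANOVA. This is a much simpler stand-in and not the hierarchical mixed model suggested above: it assumes every individual has the same number of repeated measurements, and the function name and data are hypothetical.

```python
def repeatability(measurements):
    """ICC(1) from a balanced one-way ANOVA.

    measurements[i] holds the k repeated measurements of individual i.
    """
    n = len(measurements)
    k = len(measurements[0])
    means = [sum(row) / k for row in measurements]
    grand = sum(means) / n
    # Mean square between individuals and mean square within individuals.
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(measurements, means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical data: three flies, each measured twice.
print(repeatability([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]))
```

      A full mixed model would additionally estimate how individual intercepts and slopes covary across situations, which this single summary value cannot do.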

      Another pretty major weakness is that right now, I can't find any explicit mention of how many flies were used and whether they were re-used across situations. Some sort of overall schematic showing exactly how many measurements were made in which rigs and with which flies would be very beneficial.

      I don't necessarily doubt the robustness of the results and my guess is that the authors' interpretations would remain the same, but a more appropriate modeling framework could certainly improve their statistical inference and likely highlight some other cool patterns as these methods could better estimate stability and covariance in individual intercepts (and potentially slopes) across time and situation.

    4. Reviewer #3 (Public Review):

      This manuscript is a continuation of past work by the last author where they looked at stochasticity in developmental processes leading to inter-individual behavioural differences. In that work, the focus was on a specific behaviour under specific conditions while probing the neural basis of the variability. In this work, the authors set out to describe in detail how stable the individuality of animal behaviours is in the context of various external and internal influences. They identify a few behaviours to monitor (read outs of attention, exploration, and 'anxiety'); some external stimuli (temperature, contrast, nature of visual cues, and spatial environment); and two internal states (walking and flying).

      They then use high-throughput behavioural arenas - most of which they have built and made plans available for others to replicate - to quantify and compare combinations of these behaviours, stimuli, and internal states. This detailed analysis reveals that:

      (1) Many individualistic behaviours remain stable over the course of many days.
      (2) That some of these (walking speed) remain stable over changing visual cues. Others (walking speed and centrophobicity) remain stable at different temperatures.
      (3) All the behaviours they tested failed to remain stable over the spatially varying environment (arena shape).
      (4) Only angular velocity (a readout of attention) remains stable across varying internal states (walking and flying).

      Thus, the authors conclude that there is a hierarchy in the influence of external stimuli and internal states on the stability of individual behaviours.

      The manuscript is a technical feat with the authors having built many new high-throughput assays. The number of animals is large and many variables have been tested - different types of behavioural paradigms, flying vs walking, varying visual stimuli, and different temperatures among others.

    1. eLife assessment

      Yonk and colleagues provide a valuable and timely study showcasing the role of thalamostriatal inputs on learning and action selection. In particular, they provide solid evidence that posterior medial thalamic nucleus (POm) neurons are activated during reward expectation and arousal. A clearer conceptual assessment of the overall function of this circuit, together with sharper analyses of calcium responses and thalamic specificity, in terms of viral spread and striatal target, may further increase the impact of the study.

    2. Reviewer #1 (Public Review):

      Summary:

      This work aims to understand the role of thalamus POm in dorsal lateral striatum (DLS) projection in learning a sensorimotor associative task. The authors first confirm that POm forms "en passant" synapses with some of the DLS neuronal subtypes. They then perform a go/no-go associative task that consists of the mouse learning to discriminate between two different textures and to associate one of them with an action. During this task, they either record the activity of the POm to DLS axons using endoscopy or silence their activity. They report that POm axons in the DLS are activated around the sensory stimulus but that the activity is not modulated by the reward. Last, they showed that silencing the POm axons at the level of DLS slows down learning the task.

      The authors show convincing evidence of projections from POm to DLS and that POm inputs to DLS code for whisking whatever the outcome of the task is. However, their results do not allow us to conclude if more neurons are recruited during the learning process or if the already activated fibres get activated more strongly. Last, because POm fibres in the DLS are also projecting to S1, silencing the POm fibres in the DLS could have affected inputs in S1 as well and therefore, the slowdown in acquiring the task is not necessarily specific to the POm to DLS pathway.

      Strengths:

      One of the main strengths of the paper is to go from slice electrophysiology to behaviour to get an in-depth characterization of one pathway. The authors did a comprehensive description of the POm projections to the DLS using transgenic mice to unambiguously identify the DLS neuronal population. They also used a carefully designed sensorimotor association task, and they exploited the results in depth.

      It is a very nice effort to have measured the activity of the axons in the DLS not only after the mice have learned the task but throughout the learning process. It shows the progressive increase of activity of POm axons in the DLS, which could imply that there is a progressive strengthening of the pathway. The results show convincingly that POm axons in the DLS are not activated by the outcome of the task but by the whisker activity, and that this activity on average increases with learning.

      Weaknesses:

      One of the main striatal targets of thalamic input is the cholinergic neurons, which weren't investigated here. Is there information that could be provided?

      It is interesting to know that the POm projects to all neuronal types in the DLS, but this information is not used further down the manuscript so the only take-home message of Figure 1 is that the axons that they image or silence in the DLS are indeed connected to DLS neurons and not just passing fibres. In this line, are these axons the same as the ones projecting to S1? If this is the case, why would we expect a different behaviour of the axon activity at the DLS level compared to S1?

      The authors used endoscopy to measure the POm axons in the DLS activity, which makes it impossible to know if the progressive increase of POm response is due to an increase of activity from each individual neuron or if new neurons are progressively recruited in the process.

      The picture presented in Figure 4 of the stimulation site is slightly concerning as there are hardly any fibres in neocortical layer 1 while there seems to be quite a lot of them in layer 4, suggesting that the animal here was injected in the VB. This is especially striking as the implantation and projection sites presented in Figures 1 and 2 are very clean and consistent with POm injection.

    3. Reviewer #2 (Public Review):

      Summary:

      Yonk and colleagues show that the posterior medial thalamus (POm), which is interconnected with sensory and motor systems, projects directly to major categories of neurons in the striatum, including direct and indirect pathway MSNs, and PV interneurons. Activity in POm-striatal neurons during a sensory-based learning task indicates a relationship between reward expectation and arousal. Inhibition of these neurons slows reaction to stimuli and overall learning. This circuit is positioned to feed salient event activation to the striatum to set the stage for effective learning and action selection.

      Strengths:

      The results are well presented and offer interesting insight into an understudied thalamostriatal circuit. In general, this work is important as part of a general need for an increased understanding of thalamostriatal circuits in complex learning and action selection processes, which have generally received less attention than corticostriatal systems.

      Weaknesses:

      There could be a stronger connection between the connectivity part of the data - showing that POm neurons contact D1, D2, and PV neurons in the striatum but with some different properties - and the functional side of the project. One wonders whether the POm neurons projecting to these subtypes of striatal neurons have unique signaling properties related to learning, or if there is a uniform, bulk signal sent to the striatum. This is not a weakness per se, as it's reasonable for these questions to be answered in future papers.

      All the in vivo activity-related conclusions stem from data from just 5 mice, which is a relatively small sample set. Optogenetic groups are also on the small side.

    4. Reviewer #3 (Public Review):

      Yonk and colleagues investigate the role of the thalamostriatal pathway. Specifically, they studied the interaction of the posterior thalamic nucleus (PO) and the dorsolateral striatum in the mouse. First, they characterize connectivity by recording DLS neurons in in-vitro slices and optogenetically activating PO terminals. PO is observed to establish depressing synapses onto D1 and D2 spiny neurons as well as PV neurons. Second, PO axons are imaged by fiber photometry in mice trained to discriminate textures. Initially, no trial-locked activity is observed, but as the mice learn PO develops responses timed to the audio cue that marks the start of the trial and precedes touch. PO does not appear to encode the tactile stimulus type or outcome. Optogenetic suppression of PO terminals in striatum slows task acquisition. The authors conclude that PO provides a "behaviorally relevant arousal-related signal" and that this signal "primes" striatal circuitry for sensory processing.

      A great strength of this paper is its timeliness. Thalamostriatal processing has received almost no attention in the past, and the field has become very interested in the possible functions of PO. Additionally, the experiments exploit multiple cutting-edge techniques.

      There seem to be some technical/analytical weaknesses. The in vitro experiments appear to have some contamination of nearby thalamic nuclei by the virus delivering the opsin, which could change the interpretation. Some of the statistical analyses of these data also appear inappropriate. The correlative analysis of POm activity in vivo, licking, and pupil could be more convincingly done.

      The bigger weakness is conceptual - why should striatal circuitry need "priming" by the thalamus in order to process sensory stimuli? Why would such circuitry even be necessary? Why is a sensory signal from the cortex insufficient? Why should the animal more slowly learn the task? How does this fit with existing ideas of striatal plasticity? It is unclear from the experiments that the thalamostriatal pathway exists for priming sensory processing. In fact, the optogenetic suppression of the thalamostriatal pathway seems to speak against that idea.

    1. eLife assessment

      The valuable findings in this study reveal an intricate pattern of memory expression following retrieval extinction at different intervals from retrieval-extinction to test. The novel advance is in the demonstration that, relative to a standard extinction procedure, the retrieval-extinction procedure more effectively suppresses responses to a conditioned threat stimulus when testing occurs just minutes after extinction. The manuscript provides incomplete evidence in support of the attenuation of fear recovery and solid evidence for the engagement of the dorsolateral prefrontal cortex in this "short-term" suppression of responding.

    2. Reviewer #1 (Public Review):

      Summary:

      The novel advance by Wang et al is in the demonstration that, relative to a standard extinction procedure, the retrieval-extinction procedure more effectively suppresses responses to a conditioned threat stimulus when testing occurs just minutes after extinction. The authors provide some solid evidence to show that this "short-term" suppression of responding involves engagement of the dorsolateral prefrontal cortex.

      Strengths:

      Overall, the study is well-designed and the results are potentially interesting. There are, however, a few issues in the way that it is introduced and discussed. Some of the issues concern clarity of expression/communication. However, others relate to a theory that could be used to help the reader understand why the results should have come out the way that they did. More specific comments and questions are presented below.

      Weaknesses:

      INTRODUCTION & THEORY

      (1) Can the authors please clarify why the first trial of extinction in a standard protocol does NOT produce the retrieval-extinction effect? Particularly as the results section states: "Importantly, such a short-term effect is also retrieval dependent, suggesting the labile state of memory is necessary for the short-term memory update to take effect (Fig. 1e)." The importance of this point comes through at several places in the paper:

      1A. "In the current study, fear recovery was tested 30 minutes after extinction training, whereas the effect of memory reconsolidation was generally evident only several hours later and possibly with the help of sleep, leaving open the possibility of a different cognitive mechanism for the short-term fear dementia related to the retrieval-extinction procedure." ***What does this mean? The two groups in study 1 experienced a different interval between the first and second CS extinction trials; and the results varied with this interval: a longer interval (10 min) ultimately resulted in less reinstatement of fear than a shorter interval. Even if the different pattern of results in these two groups was shown/known to imply two different processes, there is absolutely no reason to reference any sort of cognitive mechanism or dementia - that is quite far removed from the details of the present study.

      1B. "Importantly, such a short-term effect is also retrieval dependent, suggesting the labile state of memory is necessary for the short-term memory update to take effect (Fig. 1e)." ***As above, what is "the short-term memory update"? At this point in the text, it would be appropriate for the authors to discuss why the retrieval-extinction procedure produces less recovery than a standard extinction procedure as the two protocols only differ in the interval between the first and second extinction trials. References to a "short-term memory update" process do not help the reader to understand what is happening in the protocol.

      (2) "Indeed, through a series of experiments, we identified a short-term fear amnesia effect following memory retrieval, in addition to the fear reconsolidation effect that appeared much later."
      ***The only reason for supposing two effects is because of the differences in responding to the CS2, which was subjected to STANDARD extinction, in the short- and long-term tests. More needs to be said about how and why the performance of CS2 is affected in the short-term test and recovers in the long-term test. That is, if the loss of performance to CS1 and CS2 is going to be attributed to some type of memory updating process across the retrieval-extinction procedure, one needs to explain the selective recovery of performance to CS2 when the extinction-to-testing interval extends to 24 hours. Instead of explaining this recovery, the authors note that performance to CS1 remains low when the extinction-to-testing interval is 24 hours and invoke something to do with memory reconsolidation as an explanation for their results: that is, they imply (I think) that reconsolidation of the CS1-US memory is disrupted across the 24-hour interval between extinction and testing even though CS1 evokes negligible responding just minutes after extinction.

      (3) The discussion of memory suppression is potentially interesting but, in its present form, raises more questions than it answers. That is, memory suppression is invoked to explain a particular pattern of results but I, as the reader, have no sense of why a fear memory would be better suppressed shortly after the retrieval-extinction protocol compared to the standard extinction protocol; and why this suppression is NOT specific to the cue that had been subjected to the retrieval-extinction protocol.

      3A. Relatedly, how does the retrieval-induced forgetting (which is referred to at various points throughout the paper) relate to the retrieval-extinction effect? The appeal to retrieval-induced forgetting as an apparent justification for aspects of the present study reinforces points 2 and 3 above. It is not uninteresting but needs some clarification/elaboration.

      (4) Given the reports by Chalkia, van Oudenhove & Beckers (2020) and Chalkia et al (2020), some qualification needs to be inserted in relation to reference 6. That is, reference 6 is used to support the statement that "during the reconsolidation window, old fear memory can be updated via extinction training following fear memory retrieval". This needs a qualifying statement like "[but see Chalkia et al (2020a and 2020b) for failures to reproduce the results of 6]."

      https://pubmed.ncbi.nlm.nih.gov/32580869/
      https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7115860/

      CLARIFICATIONS, ELABORATIONS, EDITS

      (5) The Abstract was not easy to follow:

      5A. What does it mean to ask: "whether memory retrieval facilitates update mechanisms other than memory reconsolidation"? That is, in what sense could or would memory retrieval be thought to facilitate a memory update mechanism?

      5B. "First, we demonstrate that memory reactivation prevents the return of fear shortly after extinction training in contrast to the memory reconsolidation effect which takes several hours to emerge and such a short-term amnesia effect is cue independent (Study 1, N = 57 adults)."
      ***The phrasing here could be improved for clarity: "First, we demonstrate that the retrieval-extinction protocol prevents the return of fear shortly after extinction training (i.e., when testing occurs just minutes after the end of extinction)." Also, cue-dependence of the retrieval-extinction effect was assessed in study 2.

      5C. "Furthermore, memory reactivation also triggers fear memory reconsolidation and produces cue-specific amnesia at a longer and separable timescale (Study 2, N = 79 adults)." ***In study 2, the retrieval-extinction protocol produced a cue-specific disruption in responding when testing occurred 24 hours after the end of extinction. This result is interesting but cannot be easily inferred from the statement that begins "Furthermore..." That is, the results should be described in terms of the combined effects of retrieval and extinction, not in terms of memory reactivation alone; and the statement about memory reconsolidation is unnecessary. One can simply state that the retrieval-extinction protocol produced a cue-specific disruption in responding when testing occurred 24 hours after the end of extinction.

      5D. "...we directly manipulated brain activities in the dorsolateral prefrontal cortex and found that both memory retrieval and intact prefrontal cortex functions were necessary for the short-term fear amnesia."
      ***This could be edited to better describe what was shown: E.g., "...we directly manipulated brain activities in the dorsolateral prefrontal cortex and found that intact prefrontal cortex functions were necessary for the short-term fear amnesia after the retrieval-extinction protocol."

      5E. "The temporal scale and cue-specificity results of the short-term fear amnesia are clearly dissociable from the amnesia related to memory reconsolidation, and suggest that memory retrieval and extinction training trigger distinct underlying memory update mechanisms."
      ***The pattern of results when testing occurred just minutes after the retrieval-extinction protocol was different from that obtained when testing occurred 24 hours after the protocol. Describing this in terms of temporal scale is unnecessary, and suggesting that memory retrieval and extinction trigger different memory update mechanisms is not obviously warranted. The results of interest are due to the combined effects of retrieval+extinction and there is no sense in which different memory update mechanisms should be identified with retrieval (mechanism 1) and extinction (mechanism 2).

      5F. "These findings raise the possibility of concerted memory modulation processes related to memory retrieval..."
      ***What does this mean?

      (6) "...suggesting that the fear memory might be amenable to a more immediate effect, in addition to what the memory reconsolidation theory prescribes..."
      ***What does it mean to say that the fear memory might be amenable to a more immediate effect?

      (7) "Parallel to the behavioral manifestation of long- and short-term memory deficits, concurrent neural evidence supporting memory reconsolidation theory emphasizes the long-term effect of memory retrieval by hypothesizing that synapse degradation and de novo protein synthesis are required for reconsolidation."
      ***This sentence needs to be edited for clarity.

      (8) "previous behavioral manipulations engendering the short-term declarative memory effect..."
      ***What is the declarative memory effect? It should be defined.

      (9) "The declarative amnesia effect emerges much earlier due to the online functional activity modulation..."
      ***Even if the declarative memory amnesia effect had been defined, the reference to online functional activity modulation is not clear.

      (10) "However, it remains unclear whether memory retrieval might also precipitate a short-term amnesia effect for the fear memory, in addition to the long-term prevention orchestrated by memory consolidation."
      ***I found this sentence difficult to understand on my first pass through the paper. I think it is because of the phrasing of memory retrieval. That is, memory retrieval does NOT precipitate any type of short-term amnesia for the fear memory: it is the retrieval-extinction protocol that produces something like short-term amnesia. Perhaps this sentence should also be edited for clarity.

      I will also note that the usage of "short-term" at this point in the paper is quite confusing: Does the retrieval-extinction protocol produce a short-term amnesia effect, which would be evidenced by some recovery of responding to the CS when tested after a sufficiently long delay? I don't believe that this is the intended meaning of "short-term" as used throughout the majority of the paper, right?

      (11) "To fully comprehend the temporal dynamics of the memory retrieval effect..."
      ***What memory retrieval effect? This needs some elaboration.

      (12) "We hypothesize that the labile state triggered by the memory retrieval may facilitate different memory update mechanisms following extinction training, and these mechanisms can be further disentangled through the lens of temporal dynamics and cue-specificities."
      ***What does this mean? The first part of the sentence is confusing around the usage of the term "facilitate"; and the second part of the sentence that references a "lens of temporal dynamics and cue-specificities" is mysterious. Indeed, as all participants received the same retrieval-extinction exposures in Study 2, it is not clear how or why any differences between the groups are attributed to "different memory update mechanisms following extinction".

      (13) "In the first study, we aimed to test whether there is a short-term amnesia effect of fear memory retrieval following the fear retrieval-extinction paradigm."
      ***Again, the language is confusing. The phrase, "a short-term amnesia effect" implies that the amnesia itself is temporary; but I don't think that this implication is intended. The problem is specifically in the use of the phrase "a short-term amnesia effect of fear memory retrieval." To the extent that short-term amnesia is evident in the data, it is not due to retrieval per se but, rather, the retrieval-extinction protocol.

      (14) The authors repeatedly describe the case where there was a 24-hour interval between extinction and testing as consistent with previous research on fear memory reconsolidation. Which research exactly? That is, in studies where a CS re-exposure was combined with a drug injection, responding to the CS was disrupted in a final test of retrieval from long-term memory which typically occurred 24 hours after the treatment. Is that what the authors are referring to as consistent? If so, which aspect of the results are consistent with those previous findings? Perhaps the authors mean to say that, in the case where there was a 24-hour interval between extinction and testing, the results obtained here are consistent with previous research that has used the retrieval-extinction protocol. This would clarify the intended meaning greatly.

      DATA

      (15) Points about data:

      15A. The eight participants who were discontinued after Day 1 in study 1 were all from the no-reminder group. Can the authors please comment on how participants were allocated to the two groups in this experiment so that the reader can better understand why the distribution of non-responders was non-random (as it appears to be)?

      15B. Similarly, in study 2, of the 37 participants that were discontinued after Day 2, 19 were from Group 30 min, and 5 were from Group 6 hours. Can the authors comment on how likely these numbers are to have been by chance alone? I presume that they reflect something about the way that participants were allocated to groups, but I could be wrong.

      15C. "Post hoc t-tests showed that fear memories were resilient after regular extinction training, as demonstrated by the significant difference between fear recovery indexes of the CS+ and CS- for the no-reminder group (t26 = 7.441, P < 0.001; Fig. 1e), while subjects in the reminder group showed no difference of fear recovery between CS+ and CS- (t29 = 0.797, P = 0.432, Fig. 1e)."
      ***Is the fear recovery index shown in Figure 1E based on the results of the 1st test trial only? How can there have been a "significant difference between fear recovery indexes of the CS+ and CS- for the no-reminder group" when the difference in responding to the CS+ and CS- is used to calculate the fear recovery index shown in 1E? What are the t-tests comparing exactly, and what correction is used to account for the fact that they are applied post-hoc?

      15D. "Finally, there is no statistical difference between the differential fear recovery indexes between CS+ in the reminder and no reminder groups (t55 = -2.022, P = 0.048; Fig. 1c, also see Supplemental Material for direct test for the test phase)."
      ***Is this statement correct - i.e., that there is no statistically significant difference in fear recovery to the CS+ in the reminder and no reminder groups? I'm sure that the authors would like to claim that there IS such a difference; but if such a difference is claimed, one would be concerned by the fact that it is coming through in an uncorrected t-test, which is the third one of its kind in this paragraph. What correction (for the Type 1 error rate) is used to account for the fact that the t-tests are applied post-hoc? And if no correction, why not?

      15E. In study 2, why is responding to the CS- so high on the first test trial in Group 30 min? Is the change in responding to the CS- from the last extinction trial to the first test trial different across the three groups in this study? Inspection of the figure suggests that it is higher in Group 30 min relative to Groups 6 hours and 24 hours. If this is confirmed by the analysis, it has implications for the fear recovery index which is partly based on responses to the CS-. If not for differences in the CS- responses, Groups 30 minutes and 6 hours are otherwise identical.

      15F. Was the 6-hour group tested at a different time of day compared to the 30-minute and 24-hour groups; and could this have influenced the SCRs in this group?

      15G. Why is the range of scores in "thought control ability" different in the 30-minute group compared to the 6-hour and 24-hour groups? I am not just asking about the scale on the x-axis: I am asking why the actual distribution of the scores in thought control ability is wider for the 30-minute group?
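
      To make points 15B and 15D concrete, the sketch below shows one way such checks could be run. It is not the authors' analysis. It assumes that the three post-hoc t-tests quoted in 15C and 15D form a single family, that "P < 0.001" can be entered at its upper bound of 0.001, that a Holm-Bonferroni correction is an acceptable choice, that the remaining 13 exclusions in Study 2 (37 - 19 - 5) came from the 24-hour group, and that the three groups enrolled equal numbers of participants. The function names are illustrative only.

```python
import math


def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def chi_square_equal_split(counts):
    """Goodness-of-fit statistic against an equal split across groups."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# p-values quoted in 15C and 15D ("P < 0.001" entered as 0.001: an assumption).
p_raw = [0.001, 0.432, 0.048]
p_holm = holm_adjust(p_raw)

# Study 2 exclusions (15B): 19 and 5 are quoted; 13 is inferred as 37 - 19 - 5.
exclusions = [19, 5, 13]
chi2 = chi_square_equal_split(exclusions)
# Three groups give 2 degrees of freedom, for which the chi-square
# survival function has the closed form exp(-x / 2).
p_exclusions = math.exp(-chi2 / 2)
```

      Under these assumptions, the P = 0.048 comparison adjusts to 0.096, and the exclusion counts give a chi-square statistic of 8.0 with p of about 0.018. Both numbers depend entirely on the stated assumptions, in particular on equal enrollment across groups, which the manuscript would need to confirm.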

      (16) During testing in each experiment, how were the various stimuli presented? That is, was the presentation order for the CS+ and CS- pseudorandom according to some constraint, as it had been in extinction? This information should be added to the method section.

      (17) "These results are consistent with previous research which suggested that people with better capability to resist intrusive thoughts also performed better in motivated dementia in both declarative and associative memories."
      ***Which parts of the present results are consistent with such prior results? It is not clear from the descriptions provided here why thought control ability should be related to the present findings or, indeed, past ones in other domains. This should be elaborated to make the connections clear.

    3. Reviewer #2 (Public Review):

      Summary

      The study investigated whether memory retrieval followed soon by extinction training results in a short-term memory deficit when tested - with a reinstatement test that results in recovery from extinction - soon after extinction training. Experiment 1 documents this phenomenon using a between-subjects design. Experiment 2 used a within-subject control and saw that the effect was also observed in a control condition. In addition, it also revealed that if testing is conducted 6 hours after extinction, there is no effect of retrieval prior to extinction as there is recovery from extinction independently of retrieval prior to extinction. A third group also revealed that retrieval followed by extinction attenuates reinstatement when the test is conducted 24 hours later, consistent with previous literature. Finally, Experiment 3 used continuous theta-burst stimulation of the dorsolateral prefrontal cortex and assessed whether inhibition of that region (vs a control region) reversed the short-term effect revealed in Experiments 1 and 2. The results of the control groups in Experiment 3 replicated the previous findings (short-term effect), and the experimental group revealed that these can be reversed by inhibition of the dorsolateral prefrontal cortex.

      Strengths

      The work is performed using standard procedures (fear conditioning and continuous theta-burst stimulation) and there is some justification for the sample sizes. The results replicate previous findings - some of which have been difficult to replicate and this needs to be acknowledged - and suggest that the effect can also be observed in a short-term reinstatement test.

      The study establishes links between the memory reconsolidation and retrieval-induced forgetting (or memory suppression) literatures. The explanations that have been developed for these are distinct, and the current results integrate them by revealing that DLPFC activity is involved in the short-term retrieval-extinction effect. There is thus some novelty in the present results, but numerous questions remain unaddressed.

      Weakness

      The fear acquisition data is converted to a differential fear SCR and this is what is analysed (early vs late). However, the figure shows the raw SCR values for CS+ and CS- and therefore it is unclear whether the acquisition was successful (despite there being an "early" vs "late" effect - no descriptives are provided).

      In Experiment 1 (Test results) it is unclear whether the main conclusion stems from a comparison of the test data relative to the last extinction trial ("we defined the fear recovery index as the SCR difference between the first test trial and the last extinction trial for a specific CS") or the difference relative to the CS- ("differential fear recovery index between CS+ and CS-"). It would help the reader assess the data if Figure 1e presents all the indexes (both CS+ and CS-). In addition, there is one sentence that I could not understand "there is no statistical difference between the differential fear recovery indexes between CS+ in the reminder and no reminder groups (P=0.048)". The p-value suggests that there is a difference, yet it is not clear what is being compared here. Critically, any index taken as a difference relative to the CS- can indicate recovery of fear to the CS+ or absence of discrimination relative to the CS-, so ideally the authors would want to directly compare responses to the CS+ in the reminder and no-reminder groups. The latter issue is particularly relevant in Experiment 2, in which the CS- seems to vary between groups during the test and this can obscure the interpretation of the result.
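
      The ambiguity between the two indexes can be illustrated with a small sketch. It follows the definition quoted above (fear recovery index = SCR on the first test trial minus SCR on the last extinction trial, per CS) and takes the differential index to be the CS+ index minus the CS- index, which is an assumption about how the authors computed it. The SCR values are invented for illustration and are not data from the study.

```python
def fear_recovery_index(last_extinction_scr, first_test_scr):
    """SCR on the first test trial minus SCR on the last extinction trial."""
    return first_test_scr - last_extinction_scr


def differential_recovery_index(cs_plus, cs_minus):
    """CS+ recovery index minus CS- recovery index.

    Each argument is a (last_extinction_scr, first_test_scr) pair.
    """
    return fear_recovery_index(*cs_plus) - fear_recovery_index(*cs_minus)


# Hypothetical SCR values (arbitrary units), invented for illustration.
# In both scenarios the CS+ recovers by the same amount (0.2 -> 0.6).
scenario_a = differential_recovery_index(cs_plus=(0.2, 0.6), cs_minus=(0.1, 0.1))
scenario_b = differential_recovery_index(cs_plus=(0.2, 0.6), cs_minus=(0.1, 0.5))
```

      In scenario A the CS- response is unchanged and the differential index equals the CS+ recovery (0.4); in scenario B the CS- response rises by the same amount as the CS+ and the differential index drops to 0, even though fear recovery to the CS+ is identical. This is why a direct comparison of CS+ responses between groups would be more informative than the differential index alone.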

      In Experiment 1, the findings suggest that there is a benefit of retrieval followed by extinction in a short-term reinstatement test. In Experiment 2, the same effect is observed on a cue that did not undergo retrieval before extinction (CS2+), a result that is interpreted as resulting from cue-independence, rather than a failure to replicate in a within-subjects design the observations of Experiment 1 (between-subjects). Although retrieval-induced forgetting is cue-independent (the effect on items that are suppressed [Rp-] can be observed with an independent probe), it is not clear that the current findings are similar. Here, both cues have been extinguished and therefore been equally exposed during the critical stage.

      The findings in Experiment 2 suggest that the amnesia reported in Experiment 1 is transient, in that no effect is observed when the test is delayed by 6 hours. The phenomenon whereby reactivated memories transition to extinguished memories as a function of the amount of exposure (or number of trials) is completely different from the phenomenon observed here. In the former, the manipulation has to do with the number of trials (or the total amount of time) that the cues are exposed to. In the current study, the authors did not manipulate the number of trials but instead the retention interval between extinction and test. The finding reported here is closer to a "Kamin effect", that is, the forgetting of learned information which is observed with intervals of intermediate length (Baum, 1968). Because the Kamin effect has been inferred to result from retrieval failure, it is unclear how this can be explained here. There needs to be much more clarity on the explanations to substantiate the conclusions.

      There are many results (Ryan et al., 2015) that challenge the framework that the authors base their predictions on (consolidation and reconsolidation theory), therefore these need to be acknowledged. Similarly, there are reports that failed to observe the retrieval-extinction phenomenon (Chalkia et al., 2020), and the work presented here is written as if the phenomenon under consideration is robust and replicable. This needs to be acknowledged.

      The parallels between the current findings and the memory suppression literature are speculated on in the general discussion, and there is the conclusion that "the retrieval-extinction procedure might facilitate a spontaneous memory suppression process". Because one of the basic tenets of the memory suppression literature is that it reflects an "active suppression" process, there is no reason to believe that the same phenomenon is in place in the current paradigm, where it is instead described as "automatic". In other words, the conclusions make strong parallels with the memory suppression (and cognitive control) literature, yet the phenomena that they observed are thought to be passive (or spontaneous/automatic).
      Ultimately, it is unclear why 10 mins between the reminder and extinction learning will "automatically" suppress fear memories. Further down in the discussion, it is argued that "For example, in the well-known retrieval-induced forgetting (RIF) phenomenon, the recall of a stored memory can impair the retention of related long-term memory and this forgetting effect emerges as early as 20 minutes after the retrieval procedure, suggesting memory suppression or inhibition can occur in a more spontaneous and automatic manner". I did not follow why the time delay between manipulation and test (20 mins) would speak to whether the process is controlled or automatic.

      Among the many conclusions, one is that the current study uncovers the "mechanism" underlying the short-term effects of retrieval extinction. There is little in the current report that uncovers the mechanism, even in the most psychological sense of the mechanism, so this needs to be clarified. The same applies to the use of "adaptive".

      Whilst I could access the data on the OSF site, I could not make sense of the MATLAB files as there is no signposting indicating what data is being shown in the files. Thus, as it stands, there is no way of independently replicating the analyses reported.

      The supplemental material shows figures with all participants, but only some statistical analyses are provided, and sometimes these are different from those reported in the main manuscript. For example, the test data in Experiment 1 is analysed with a two-way ANOVA with the main effects of group (reminder vs no-reminder) and time (last trial of extinction vs first trial of the test) in the main report. The analyses with all participants in the sup mat used a mixed two-way ANOVA with a group (reminder vs no reminder) and CS (CS+ vs CS-). This makes it difficult to assess the robustness of the results when including all participants. In addition, in the supplementary materials, there are no figures and analyses for Experiment 3.

      One of the overarching conclusions is that the "mechanisms" underlying reconsolidation (long term) and memory suppression (short term) phenomena are distinct, but memory suppression phenomena can also be observed after a 7-day retention interval (Storm et al., 2012), which then questions the conclusions achieved by the current study.

      References:

      Baum, M. (1968). Reversal learning of an avoidance response and the Kamin effect. Journal of Comparative and Physiological Psychology, 66(2), 495.
      Chalkia, A., Schroyens, N., Leng, L., Vanhasbroeck, N., Zenses, A. K., Van Oudenhove, L., & Beckers, T. (2020). No persistent attenuation of fear memories in humans: A registered replication of the reactivation-extinction effect. Cortex, 129, 496-509.
      Ryan, T. J., Roy, D. S., Pignatelli, M., Arons, A., & Tonegawa, S. (2015). Engram cells retain memory under retrograde amnesia. Science, 348(6238), 1007-1013.
      Storm, B. C., Bjork, E. L., & Bjork, R. A. (2012). On the durability of retrieval-induced forgetting. Journal of Cognitive Psychology, 24(5), 617-629.

    4. Reviewer #3 (Public Review):

      SUMMARY

      Wang et al. have addressed how acquired fear and extinction memories evolve over time. Using a retrieval-extinction procedure in healthy humans, they have investigated the recovery of fear memories 30-60 minutes, 6 hours, and 24 hours after the retrieval-extinction phase. They have addressed this research question through 3 different experiments which included manipulations of the reminder cue, the time interval, and brain activity. Together, the studies suggest that early on after retrieval-extinction (30-60 min. later), retrieval-extinction may lead to an attenuation of fear recovery (after reinstatement) for all fear cues, including the non-reminded ones. Study 3 moreover suggests that this effect may depend on normal dlPFC function. In addition, the paper also contains data in line with prior findings suggesting that a 6-hour interval does not benefit from the reminder cue, and that a 24-hour interval does, and specifically for the reminded fear cue. The latter findings are seen as evidence of fear memory reconsolidation.

      STRENGTHS

      (1) The paper combines three related human fear conditioning studies, each with decent sample sizes. The authors are transparent about the fact that they excluded many participants and about which conditions they belonged to.

      (2) The effect that this paper investigates (short-term fear memory after a retrieval-extinction procedure) has not been studied extensively, thus making it a relevant topic.

      (3) The application of brain stimulation as a means to study causal relationships is interesting and goes beyond the purely behavioral or pharmacological interventions that are often used in human fear conditioning research. Also, the use of an active control stimulation is a strength of the study.

      WEAKNESSES

      (1) The entire study hinges on the idea that there is memory 'suppression' if (1) the CS+ was reminded before extinction and (2) the reinstatement and memory test takes place 30 minutes later (in Studies 1 & 2). However, the evidence supporting this suppression idea is not very strong. In brief, in Study 1, the effect seems to only just reach significance, with a medium effect size at best, and, moreover, it is unclear if this is the correct analysis (which is a bit doubtful, when looking at Figure 1D and E). In Study 2, there was no optimal control condition without reminder and with the same 30-min interval (which is problematic, because we can assume generalization between CS1+ and CS2+, as pointed out by the authors, and because generalization effects are known to be time-dependent). Study 3 is more convincing, but entails additional changes in comparison with Studies 1 and 2, i.e., applications of cTBS and an interval of 1 hour instead of 30 minutes (the reason for this change was not explained). So, although the findings of the 3 studies do not contradict each other and are coherent, they do not all provide strong evidence for the effect of interest on their own.

      Related to the comment above, I encourage the authors to double-check if this statement is correct: "Also, our results remain robust even with the "non-learners" included in the analysis (Fig. S1 in the Supplemental Material)". The critical analysis for Study 1 is a between-group comparison of the CS+ and CS- during the last extinction trial versus the first test trial. This result only just reached significance with the selected sample (p = .048), and Figures 1D and E even seem to suggest otherwise. I doubt that the analysis would reach significance when including the "non-learners" - assuming that this is what is shown in Supplemental Figure 1 (which shows the data from "all responded participants").

      Also related to the comment above, I think that the statement "suggesting a cue-independent short-term amnesia effect" in Study 2 is not correct and should read: "suggesting extinction of fear to the CS1+ and CS2+", given that the response to the CS+'s is similar to the response to the CS-, as was the case at the end of extinction. Also the next statement "This result indicates that the short-term amnesia effect observed in Study 2 is not reminder-cue specific and can generalize to the non-reminded cues" is not fully supported by the data, given the lack of an appropriate control group in this study (a group without reinstatement). The comparison with the effect found in Study 1 is difficult because the effect found there was relatively small (and may have to be double-checked, see remarks above), and it was obtained with a different procedure using a single CS+. The comparison with the 6-h and 24-h groups of Study 2 is not helpful as a control condition for this specific question (i.e., is there reinstatement of fear for any of the CS+'s) because of the large procedural difference with regard to the intervals between extinction and reinstatement (test).

      (2) It is unclear which analysis is presented in Figure 3. According to the main text, it either shows the "differential fear recovery index between CS+ and CS-" or "the fear recovery index of both CS1+ and CS2+". The authors should clarify what they are analyzing and showing, and clarify to which analyses the ** and NS refer in the graphs. I would also prefer the X-axes and particularly the Y-axes of Fig. 3a-b-c to be the same. The image is a bit misleading now. The same remarks apply to Figure 5.

      (3) In general, I think the paper would benefit from being more careful and nuanced in how the literature and findings are represented. First of all, the authors may be more careful when using the term 'reconsolidation'. In the current version, it is put forward as an established and clearly delineated concept, but that is not the case. It would be useful if the authors could change the text in order to make it clear that the reconsolidation framework is a theory, rather than something that is set in stone (see e.g., Elsey et al., 2018 (https://doi.org/10.1037/bul0000152), Schroyens et al., 2022 (https://doi.org/10.3758/s13423-022-02173-2)).

      In addition, the authors may want to reconsider whether they want to cite Schiller et al., 2010 (https://doi.org/10.1038/nature08637), given that neither the main findings of this paper nor the analyses could be replicated (see Chalkia et al., 2020 (https://doi.org/10.1016/j.cortex.2020.04.017; https://doi.org/10.1016/j.cortex.2020.03.031)).

      Relatedly, it should be clarified that Figure 6 is largely speculative, rather than a proven model as it is currently presented. This is true for all panels, but particularly for panel c, given that the current study does not provide any evidence regarding the proposed reconsolidation mechanism.

      Lastly, throughout the paper, the authors equate skin conductance responses (SCR) with fear memory. It should at least be acknowledged that SCR is just one aspect of a fear response, and that it is unclear whether any of this would translate to verbal or behavioral effects. Such effects would be particularly important for any clinical application, which the authors put forward as the ultimate goal of the research.

      (4) The Discussion quite narrowly focuses on a specific 'mechanism' that the authors have in mind. Although it is good that the Discussion is to the point, it may be worthwhile to entertain other options or (partial) explanations for the findings. For example, have the authors considered that there may be an important role for attention? When testing very soon after the extinction procedure (and thus after the reminder), attentional processes may play an important role (more so than with longer intervals). The retrieval procedure could perhaps induce heightened attention to the reminded CS+ (which could be further enhanced by dlPFC stimulation)?

      (5) There is room for improvement in terms of language, clarity of the writing, and (presentation of the) statistical analyses, for all of which I have provided detailed feedback in the 'Recommendations for the authors' section. The same applies to data availability: the data are currently not publicly available, in contrast with what is stated in the paper. In addition, it would be helpful if the authors would provide additional explanation or justification for some of the methodological choices (e.g., the 18-s interval, why stimulation was applied 8 minutes after the reminder cue, the choice of stimulation parameters), and comment on reasons for (and implications of) the large number of excluded participants (>25%).

      Finally, I think several statements made in the paper are overly strong in light of the existing literature (or the evidence obtained here) or imply causal relationships that were not directly tested.

    1. Reviewer #1 (Public Review):

      Summary:

      This is a large cohort of ischemic stroke patients from a single centre. The author successfully set up predictive models for PTS.

      Strengths:

      The design and implementation of the trial are acceptable, and the results are credible. It may provide evidence of seizure prevention in the field of stroke treatment.

      Weaknesses:

      The methodology needs further consideration. The Discussion needs extensive rewriting.

    2. Reviewer #2 (Public Review):

      Summary

      The authors present multiple machine-learning methodologies to predict post-stroke epilepsy (PSE) from admission clinical data.

      Strengths

      The Statistical Approach section is very well written. The approaches used in this section are very sensible for the data in question.

      Weaknesses

      There are many typos and unclear statements throughout the paper.

      There are some issues with SHAP interpretation. SHAP, in its default form, does not provide robust statistical guarantees of effect size. There is a claim that "SHAP analysis showed that white blood cell count had the greatest impact among the routine blood test parameters". This is a difficult claim to make.

      The Data Collection section is very poorly written, and the methodology is not clear.

      There is no information about hyperparameter selection for models or whether a hyperparameter search was performed. Given this, it is difficult to conclude whether one machine learning model performs better than others on this task.

      The inclusion and exclusion criteria are unclear - how many patients were excluded and for what reasons?

      There is no sensitivity analysis of the SMOTE methodology: How many synthetic data points were created, and how does the number of synthetic data points affect classification accuracy?
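      The kind of sensitivity analysis asked for here can be sketched roughly as follows. This is a minimal, standard-library stand-in for SMOTE (interpolating between a minority sample and one of its nearest minority neighbours), not the authors' pipeline; the function names `n_synthetic_for_ratio` and `smote_like`, the toy data, and the swept ratios are all assumptions made for illustration.

```python
import math
import random


def n_synthetic_for_ratio(n_minority, n_majority, ratio):
    """Synthetic minority samples needed so that minority/majority >= ratio."""
    target = math.ceil(ratio * n_majority)
    return max(0, target - n_minority)


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + t * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical toy data: 4 minority cases, 20 majority cases, two features.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
n_majority = 20

# Sweep the oversampling ratio and record how many synthetic points result.
sweep = {}
synthetic_points = []
for ratio in (0.25, 0.5, 1.0):
    n_new = n_synthetic_for_ratio(len(minority), n_majority, ratio)
    synthetic_points = smote_like(minority, n_new)
    sweep[ratio] = len(synthetic_points)
```

      In a full analysis, each ratio in the sweep would be followed by refitting the classifier and scoring it on validation data that contain no synthetic points, so that the reported accuracy can be plotted against the number of synthetic samples.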

      Did the authors achieve their aims? Do the results support their conclusions?

      The paper does not clarify the features' temporal origins. If some features were not recorded on admission to the hospital but were recorded after PSE occurred, there would be temporal leakage.
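      One simple way to document the temporal origin of features is to keep a recording time for each one and admit only those available at the intended prediction time. The sketch below assumes such timestamps exist; the feature names, dates, the 24-hour cutoff, and the helper `features_available_at` are hypothetical and are not taken from the manuscript.

```python
from datetime import datetime, timedelta


def features_available_at(recorded_at, cutoff):
    """Return the feature names whose recording time is at or before cutoff."""
    return sorted(name for name, when in recorded_at.items() if when <= cutoff)


# Hypothetical timeline for one patient.
admission = datetime(2020, 1, 1, 8, 0)
seizure_onset = datetime(2020, 2, 15, 14, 0)

recorded_at = {
    "nihss_score": admission + timedelta(minutes=30),
    "white_cell_count": admission + timedelta(hours=1),
    "eeg_abnormality": seizure_onset + timedelta(days=1),  # after the outcome
}

# Assumed prediction time: 24 hours after admission.
cutoff = admission + timedelta(hours=24)
usable = features_available_at(recorded_at, cutoff)
leaky = sorted(set(recorded_at) - set(usable))
```

      Any feature that ends up in `leaky` was recorded after the prediction time and would need to be excluded or justified explicitly.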

      The authors claim that their models can predict PSE. To believe this claim, seeing more information on out-of-distribution generalisation performance would be helpful. There is limited reporting on the external validation cohort relative to the reporting on train and test data.

      For greater certainty on all reported results, it would be most appropriate to perform n-fold cross-validation, and report mean scores and confidence intervals across the cross-validation splits.
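      As an illustration of this reporting style, the sketch below runs n-fold cross-validation and summarises the fold scores as a mean with a confidence interval. It uses only the Python standard library; the threshold classifier, the toy data, and all function names are assumptions for illustration. The interval uses a normal approximation, which is too narrow for a handful of folds, so a t-based or bootstrap interval would be preferable in practice.

```python
import random
from statistics import NormalDist, mean, stdev


def kfold_indices(n_samples, n_folds, seed=0):
    """Shuffle indices once and split them into n_folds disjoint test folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def cross_validate(xs, ys, fit, score, n_folds=5, seed=0):
    """Return one held-out score per fold; fit and score come from the caller."""
    scores = []
    for test_idx in kfold_indices(len(xs), n_folds, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(
            score(model, [xs[i] for i in test_idx], [ys[i] for i in test_idx])
        )
    return scores


def mean_and_ci(scores, level=0.95):
    """Mean with a normal-approximation confidence interval across folds."""
    m = mean(scores)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half = z * stdev(scores) / len(scores) ** 0.5
    return m, (m - half, m + half)


# Toy classifier: threshold at the midpoint of the two class means.
def fit(xs, ys):
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (mean(pos) + mean(neg)) / 2


def score(threshold, xs, ys):
    """Accuracy of the threshold rule on held-out data."""
    return mean(1.0 if (x > threshold) == (y == 1) else 0.0 for x, y in zip(xs, ys))


# Hypothetical one-feature data with overlapping classes.
xs = [0.1 * i for i in range(20)] + [1.5 + 0.1 * i for i in range(20)]
ys = [0] * 20 + [1] * 20

fold_scores = cross_validate(xs, ys, fit, score, n_folds=5)
avg, (ci_low, ci_high) = mean_and_ci(fold_scores)
```

      If oversampling such as SMOTE is used, it would need to be applied inside each training fold only, so that the held-out fold contains real patients exclusively.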

      The likely impact of the work on the field

      If this model works as claimed, it will be useful for predicting PSE. This has some direct clinical utility.

      Analysis of features contributing to PSE may provide clinical researchers with ideas for further research on the underlying aetiology of PSE.

      Additional context that might help readers

      The authors show force plots and decision plots from SHAP values. These plots are non-trivial to interpret, and the authors should include an explanation of how to interpret them.

    3. Reviewer #3 (Public Review):

      Summary:

      The authors report the performance of a series of machine learning models inferred from a large-scale dataset and externally validated with an independent cohort of patients, to predict the risk of post-stroke epilepsy. Some of the reported models have very good explicative and predictive performances.

      Strengths:

      The models have been derived from real-world large-scale data.

      Performances of the best-performing models are very good according to the external validation results.

      Early prediction of the risk of post-stroke epilepsy would be of high interest to implement early therapeutic interventions that could improve prognosis.

      Weaknesses:

      There are issues with the readability of the paper. Many abbreviations are not introduced properly and sometimes are written inconsistently. A lot of relevant references are omitted. The methodological descriptions are extremely brief and, sometimes, incomplete.

      The dataset is not disclosed, and neither is the code (although the code is made available upon request). For the sake of reproducibility, unless any bioethical concerns impede it, it would be good to have these data disclosed.

      Although the external validation is appreciated, cross-validation to check the robustness of the models would also be welcome.

    1. eLife assessment

      This important study reveals a neural signature of a common behavioural phenomenon: serial dependence, whereby estimates of a visual feature (here motion direction) are attracted towards the recent history of encoded and reported stimuli. The study provides solid evidence that this phenomenon arises primarily during working memory maintenance. The pervasiveness of serial dependencies across modalities and species makes these findings important for researchers interested in perceptual decision-making across subfields.

    2. Reviewer #1 (Public Review):

      This study uses MEG to test for a neural signature of the trial history effect known as 'serial dependence.' This is a behavioral phenomenon whereby stimuli are judged to be more similar than they really are, in feature space, to stimuli that were relevant in the recent past (i.e., the preceding trials). This attractive bias is prevalent across stimulus classes and modalities, but a neural source has been elusive. This topic has generated great interest in recent years, and I believe this study makes a unique contribution to the field. The paper is overall clear and compelling, and makes effective use of data visualizations to illustrate the findings. Below, I list several points where I believe further detail would be important to interpreting the results. I also make suggestions for additional analyses that I believe would enrich understanding but are inessential to the main conclusions.

      (1) In the introduction, I think the study motivation could be strengthened, to clarify the importance of identifying a neural signature here. It is clear that previous studies have focused mainly on behavior, and that the handful of neuroscience investigations have found only indirect signatures. But what would the type of signature being sought here tell us? How would it advance understanding of the underlying processes, the function of serial dependence, or the theoretical debates around the phenomenon?

      (1a) As one specific point of clarification, on p. 5, lines 91-92, a previous study (St. John-Saaltink et al.) is described as part of the current study motivation, stating that "as the current and previous orientations were either identical or orthogonal to each other, it remained unclear whether this neural bias reflected an attraction or repulsion in relation to the past." I think this statement could be more explicit as to why/how these previous findings are ambiguous. The St. John-Saaltink study stands as one of very few that may be considered to show evidence of an early attractive effect in neural activity, so it would help to clarify what sort of advance the current study represents beyond that.

      (1b) The study motivation might also consider the findings of Ranieri et al. (2022, J. Neurosci), Fornaciai, Togoli, & Bueti (2023, J. Neurosci), and Luo & Collins (2023, J. Neurosci), who all test various neural signatures of serial dependence.

      (2) Regarding the methods and results, it would help if the initial description of the reconstruction approach, in the main text, gave more context about what data is going into reconstruction (e.g., which sensors), a more conceptual overview of what the 'reconstruction' entails, and what the fidelity metric indexes. To me, all of that is important to interpreting the figures and results. For instance, when I first read, it was unclear to me what it meant to "reconstruct the direction of S1 during the S2 epoch" (p. 10, line 199)? As in, I couldn't tell how the data/model knows which item it is reconstructing, as opposed to just reporting whatever directional information is present in the signal.

      (2a) Relatedly, what does "reconstruction strength" reflect in Figure 2a? Is this different than the fidelity metric? Does fidelity reflect the strength of the particular relevant direction, or does it just mean that there is a high level of any direction information in the signal?

      (3) Then in the Methods, it would help to provide further detail still about the IEM training/testing procedure. For instance, it's not entirely clear to me whether all the analyses use the same model (i.e., all trained on stimulus encoding) or whether each epoch and timepoint is trained on the corresponding epoch and timepoint from the other session. This speaks to whether the reconstructions reflect a shared stimulus code across different conditions vs. that stimulus information about various previous and current trial items can be extracted if the model is tailored accordingly. Specifically, when you say "aim of the reconstruction" (p. 31, line 699), does that simply mean the reconstruction was centered in that direction (that the same data would go into reconstructing S1 or S2 in a given epoch, and what would differentiate between them is whether the reconstruction was centered to the S1 or S2 direction value)? Or were S1 and S2 trained and tested separately for the same epoch? And was training and testing all within the same time point (i.e., train on delay, test on delay), or train on the encoding of a given item, then test the fidelity of that stimulus code under various conditions?

      (3a) I think training and testing were done separately for each epoch and timepoint, but this could have important implications for interpreting the results. Namely if the models are trained and tested on different time points, and reference directions, then some will be inherently noisier than others (e.g., delay period more so than encoding), and potentially more (or differently) susceptible to bias. For instance, the S1 and S2 epochs show no attractive bias, but they may also be based on more high-fidelity training sets (i.e., encoding), and therefore less susceptible to the bias that is evident in the retrocue epoch.

      (4) I believe the work would benefit from a further effort to reconcile these results with previous findings (i.e., those that showed repulsion, like Sheehan & Serences), potentially through additional analyses. The discussion attributes the difference in findings to the "combination of a retro-cue paradigm with the high temporal resolution of MEG," but it's unclear how that explains why various others observed repulsion (thought to happen quite early) that is not seen at any stage here. In my view, the temporal (as well as spatial) resolution of MEG could be further exploited here to better capture the early vs. late stages of processing. For instance, by separately examining earlier vs. later time points (instead of averaging across all of them), or by identifying and analyzing data in the sensors that might capture early vs. late stages of processing. Indeed, the S1 and S2 reconstructions show subtle repulsion, which might be magnified at earlier time points but then shift (toward attraction) at later time points, thereby counteracting any effect. Likewise, the S1 reconstruction becomes biased during the S2 epoch, consistent with previous observations that the SD effects grow across a WM delay. Maybe both S1 and S2 would show an attractive bias emerging during the later (delay) portion of their corresponding epoch? As is, the data nicely show that an attractive bias can be detected in the retrocue period activity, but they could still yield further specificity about when and where that bias emerges.

      (5) A few other potentially interesting (but inessential) considerations: A benchmark property of serial dependence is its feature-specificity, in that the attractive bias occurs only between current and previous stimuli that are within a certain range of similarity to each other in feature space. I would be very curious to see if the neural reconstructions manifest this principle - for instance, if one were to plot the trialwise reconstruction deviation from 0, across the full space of current-previous trial distances, as in the behavioral data. Likewise, something that is not captured by the DoG fitting approach, but which this dataset may be in a position to inform, is the commonly observed (but little understood) repulsive effect that appears when current and previous stimuli are quite distinct from each other. As in, Figure 1b shows an attractive bias for direction differences around 30 degrees, but a repulsive one for differences around 170 degrees - is there a corresponding neural signature for this component of the behavior?

    3. Reviewer #2 (Public Review):

      Summary:

      The study aims to probe the neural correlates of visual serial dependence - the phenomenon that estimates of a visual feature (here motion direction) are attracted towards the recent history of encoded and reported stimuli. The authors utilize an established retro-cue working memory task together with magnetoencephalography, which allows them to probe neural representations of motion direction during the encoding and retrieval (retro-cue) periods of each trial. The main finding is that neural representations of motion direction are not systematically biased during the encoding of motion stimuli, but are attracted towards the motion direction of the previous trial's target during the retrieval (retro-cue) period, just prior to the behavioral response. By demonstrating a neural signature of attractive biases in working memory representations, which align with attractive behavioral biases, this study highlights the importance of post-encoding memory processes in visual serial dependence.

      Strengths:

      The main strength of the study is its elegant use of a retro-cue working memory task together with high temporal resolution MEG, enabling the authors to probe neural representations related to stimulus encoding and working memory. The behavioral task elicits robust behavioral serial dependence and replicates previous behavioral findings by the same research group. The careful neural decoding analysis benefits from a large number of trials per participant, considering the slow-paced nature of the working memory paradigm. This is crucial in a paradigm with considerable trial-by-trial behavioral variability (serial dependence biases are typically small, relative to the overall variability in response errors). While the current study is broadly consistent with previous studies showing that attractive biases in neural responses are absent during stimulus encoding (previous studies reported repulsive biases), to my knowledge it is the first study showing attractive biases in current stimulus representations during working memory. The study also connects to previous literature showing reactivations of previous stimulus representations, although the link between reactivations and biases remains somewhat vague in the current manuscript. Together, the study reveals an interesting avenue for future studies investigating the neural basis of visual serial dependence.

      Weaknesses:

      The main weakness of the current manuscript is that the authors could have done more analyses to address the concern that their neural decoding results are driven by signals related to eye movements. The authors show that participants' gaze position systematically depended on the current stimuli's motion directions, which, together with previous studies on eye movement-related confounds in neural decoding, justifies such a concern. The authors seek to rule out this confound by showing that the consistency of stimulus-dependent gaze position does not correlate with (a) the neural reconstruction fidelity and (b) the repulsive shift in reconstructed motion direction. However, neither of these controls directly addresses the concern. If I understand correctly, the metric quantifying the consistency of stimulus-dependent gaze position (Figure S3a) only considers gaze angle and not gaze amplitude. Furthermore, it does not consider gaze position as a function of continuous motion direction, but instead treats motion directions as categorical variables. Therefore, assuming an eye movement confound, it is unclear whether the gaze consistency metric should strongly correlate with neural reconstruction fidelity, or whether there are other features of eye movements (e.g., amplitude differences across participants, and tuning of gaze in the continuous space of motion directions) which would impact the relationship with neural decoding. Moreover, it is unclear whether the consistency metric, which does not consider history dependencies in eye movements, should correlate with attractive history biases in neural decoding.
      It would be more straightforward if the authors attempted to (a) directly decode stimulus motion direction from x-y gaze coordinates and relate this decoding performance to neural reconstruction fidelity, and (b) investigate whether gaze coordinates themselves are history-dependent and are attracted to the average gaze position associated with the previous trials' target stimulus. If the authors could show that (b) is not the case, I would be much more convinced that their main finding is not driven by eye movement confounds.
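      The two control analyses suggested above could be sketched as follows. This is a deliberately naive illustration, not the authors' method: it assumes that the gaze offset from fixation points along the decoded motion direction, ignores gaze amplitude, and uses simulated data. All function names and the 5-degree pull are hypothetical. A real analysis would use a cross-validated decoder on the raw x-y coordinates.

```python
import math


def circ_diff(a, b):
    """Signed angular difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def decode_direction(gaze_xy):
    """Naive decoder: the angle of the gaze offset from fixation, in degrees."""
    x, y = gaze_xy
    return math.degrees(math.atan2(y, x)) % 360.0


def mean_abs_error(directions, gaze):
    """(a) Mean absolute circular decoding error; chance level is 90 degrees."""
    errs = [abs(circ_diff(decode_direction(g), d)) for d, g in zip(directions, gaze)]
    return sum(errs) / len(errs)


def history_bias(directions, gaze):
    """(b) Mean decoding error, signed towards the previous trial's direction.

    Positive values mean the gaze-decoded direction is pulled towards the
    previous direction; values near zero mean no history dependence.
    """
    total, n = 0.0, 0
    for t in range(1, len(directions)):
        prev = circ_diff(directions[t - 1], directions[t])
        if prev == 0.0 or abs(prev) == 180.0:
            continue  # no defined "towards previous" side
        err = circ_diff(decode_direction(gaze[t]), directions[t])
        total += err * math.copysign(1.0, prev)
        n += 1
    return total / n


def simulate_gaze(directions, pull_deg):
    """Simulated gaze whose angle is shifted pull_deg towards the previous trial."""
    gaze = []
    for t, d in enumerate(directions):
        shift = 0.0
        if t > 0:
            prev = circ_diff(directions[t - 1], d)
            shift = pull_deg * math.copysign(1.0, prev)
        angle = math.radians(d + shift)
        gaze.append((math.cos(angle), math.sin(angle)))
    return gaze


directions = [0.0, 60.0, 20.0, 190.0, 150.0, 300.0]
unbiased = simulate_gaze(directions, pull_deg=0.0)
attracted = simulate_gaze(directions, pull_deg=5.0)

error_unbiased = mean_abs_error(directions, unbiased)
bias_unbiased = history_bias(directions, unbiased)
bias_attracted = history_bias(directions, attracted)
```

      In this framing, control (a) relates each participant's gaze decoding error to neural reconstruction fidelity, and control (b) asks whether the signed history bias of the gaze-decoded direction differs from zero.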

      I am not convinced by the across-participant correlation between attractive biases in neural representations and attractive behavioral biases in estimation reports. One would expect a correlation with the behavioral bias amplitude, which is not borne out. Instead, there is a correlation with behavioral bias width, but no explanation of how bias width should relate to the bias in neural representations. The authors could be more explicit in their arguments about how these metrics would be functionally related, and why there is no correlation with behavioral bias amplitude.

      The sample size (n = 10) is definitely at the lower end of sample sizes in this field. The authors collected two sessions per participant, which partly alleviates the concern. However, given that serial dependencies can be very variable across participants, I believe that future studies should aim for larger sample sizes.

      It would have been great to see an analysis in source space. As the authors mention in their introduction, different brain areas, such as PPC, mPFC, and dlPFC, have been implicated in serial biases. This raises the question of which brain areas contribute to the serial dependencies observed in the current study. For instance, it would be interesting to see whether attractive shifts in current representations and pre-stimulus reactivations of previous stimuli are evident in the same or different brain areas.

    4. Reviewer #3 (Public Review):

      Summary:

      This study identifies the neural source of serial dependence in visual working memory, i.e., the phenomenon that recall from visual working memory is biased towards recently remembered but currently irrelevant stimuli. Whether this bias has a perceptual or post-perceptual origin has been debated for years - the distinction is important because of its implications for the neural mechanism and ecological purpose of serial dependence. However, this is the first study to provide solid evidence based on human neuroimaging that identifies a post-perceptual memory maintenance stage as the source of the bias. The authors used multivariate pattern analysis of magnetoencephalography (MEG) data while observers remembered the direction of two moving dot stimuli. After one of the two stimuli was cued for recall, decoding of the cued motion direction re-emerged, but with a bias towards the motion direction cued on the previous trial. By contrast, decoding of the stimuli during the perceptual stage was not biased.

      Strengths:

      The strengths of the paper are its design, which uses a retrospective cue to clearly distinguish the perceptual/encoding stage from the post-perceptual/maintenance stage, and the rigour of the careful and well-powered analysis. The study benefits from high within-participant power through the use of sensitive MEG recordings (compared to the more common EEG), and the decoding and neural bias analysis are done with care and sophistication, with appropriate controls to rule out confounds.

      Weaknesses:

      A minor weakness of the study is the remaining (but slight) possibility of an eye movement confound. A control analysis shows that participants make systematic eye movements that are aligned with the remembered motion direction during both the encoding and maintenance phases of the task. The authors go some way to show that this eye gaze bias seems unrelated to the decoding of MEG data, but in my opinion do not rule it out conclusively. They merely show that the strength of the gaze bias and the strength of MEG-based decoding/neural bias are uncorrelated across the 10 participants. Therefore, this argument seems to rest on a null result from an underpowered analysis.

      Impact:

      This important study contributes to the debate on serial dependence with solid evidence that biased neural representations emerge only at a relatively late post-perceptual stage, in contrast to previous behavioural studies. This finding is of broad relevance to the study of working memory, perception, and decision-making by providing key experimental evidence favouring one class of computational models of how stimulus history affects the processing of the current environment.

    1. eLife assessment

      In this important study, the authors provide evidence that Treacle, a disease-relevant intrinsically disordered protein, undergoes biomolecular condensation to support the structure and function of the fibrillar center of the nucleolus. The findings, arising from complementary approaches, provide solid evidence for the role of Treacle condensation in supporting rDNA transcription, rRNA processing, and genome integrity. These findings may be of interest to the communities studying biomolecular condensates, nucleolar organization, and ribosome biogenesis.

    1. Reviewer #1 (Public Review):

      Summary:

      In this study, the authors developed an organoid system that contains smooth muscle cells (SMCs) and interstitial cells of Cajal (ICCs; pacemaker) but few enteric neurons, and generates rhythmic contractions as seen in the developing gut. The stereotypical arrangements of SMCs and ICCs in the organoid allowed the authors to identify these cell types in the organoid without antibody staining. The authors took advantage of this and used calcium imaging and pharmacology to study how calcium transients develop in this system through the interaction between the two types of cells. The authors first show that calcium transients are synchronized between ICC-ICC, SMC-SMC, and SMC-ICC. They then used gap junction inhibitors to suggest that gap junctions are specifically involved in ICC-to-SMC signaling. Finally, the authors used an inhibitor of myosin II to suggest that feedback from SMC contraction is crucial for the generation of rhythmic activities in ICCs. The authors also show that two organoids become synchronized as they fuse and SMCs mediate this synchronization.

      Strengths:

      The organoid system offers a useful model in which one can study the specific roles of SMCs and ICCs in live samples.

      Weaknesses:

      Since only one blocker each for gap junction and myosin II was used, the specificities of the effects were unclear.

    2. eLife assessment

      This valuable study reports the development of a novel organoid system for studying the emergence of autorhythmic gut peristaltic contractions through the interaction between interstitial cells of Cajal and smooth muscle cells. While the utility of the organoids for studying hindgut development is well illustrated by showing, for example, a previously unappreciated potential role for smooth muscle cells in regulating the firing rate of interstitial cells of Cajal, some of the functional analyses are incomplete. There are some concerns about the specificity and penetrance of perturbations and the reproducibility of the phenotypes. With these concerns properly addressed, this paper will be of interest to those studying the development and physiology of the gut.

    3. Reviewer #2 (Public Review):

      Summary:

      In this study, Yagasaki et al. describe an organoid system to study the interactions between smooth muscle cells (SMCs) and interstitial cells of Cajal (ICCs). While these interactions are essential for the control of rhythmic intestinal contractility (i.e., peristalsis), they are poorly understood, largely due to the complexity of and access to the in vivo environment and the inability to co-culture these cell types in vitro long-term under physiological conditions. The "gut contractile organoids" described herein are reconstituted from stromal cells of the fetal chicken hindgut that rapidly reorganize into multilayered spheroids containing an outer layer of smooth muscle cells and an inner core of interstitial cells. The authors demonstrate that they contract cyclically and additionally use calcium imaging to show that these contractions occur concomitantly with calcium transients that initiate in the interstitial cell core and are synchronized within the organoid and between ICCs and SMCs. Furthermore, they use several pharmacological inhibitors to show that these contractions are dependent upon non-muscle myosin activity and, surprisingly, independent of gap junction activity. Finally, they develop a 3D hydrogel for the culturing of multiple organoids and find that they synchronize their contractile activities through interconnecting smooth muscle cells, suggesting that this model can be used to study the emergence of pacemaking activities. Overall, this study provides a relatively easy-to-establish organoid system that will be of use in studies examining the emergence of rhythmic peristaltic smooth muscle contractions and how these are regulated by interstitial cell interactions. However, further validation and quantification will be necessary to conclusively determine the cellular composition of the organoids and how reproducible their behaviors are.

      Strengths:

      This work establishes a new self-organizing organoid system that can easily be generated from the muscle layers of the chick fetal hindgut to study the emergence of spontaneous smooth muscle cell contractility. A key strength of this approach is that the organoids seem to contain few cell types (though more validation is needed), namely smooth muscle cells (SMCs) and interstitial cells of Cajal (ICCs). These organoids are amenable to live imaging of calcium dynamics as well as pharmacological perturbations for functional assays, and since they are derived from developing tissues, the emergence of the interactions between cell types can be functionally studied. Thus, the gut contractile organoids represent a reductionist system to study the interactions between SMCs and ICCs in comparison to the more complex in vivo environment, which has made studying these interactions challenging.

      Weaknesses:

      The study falls short in that it does not provide sufficiently rigorous evidence that the gut organoids are made of bona fide smooth muscle cells and ICCs. For example, only two "marker" proteins are used to support the claims of cell identity of SMCs and ICCs. At the same time, certain aspects of the data are not quantified sufficiently to appreciate the variance of organoid rhythmic contractility. For example, most contractility plots show the trace for a single organoid. This leads to a concern for how reproducible certain aspects of the organoid system (e.g. wavelength between contractions/rhythm) might be, or how these evolve uniquely over time in culture. Furthermore, while this study might be able to capture the emergence of ICC-SMC interactions as they relate to muscle contraction and pacemaking, it is unclear how these interactions relate to adult gastrointestinal physiology given that the organoids are derived from fetal cells that might not be fully differentiated or might have distinct functions from the adult. Finally, despite the strength of this system, discoveries made in it will need to be validated in vivo.

    4. Reviewer #3 (Public Review):

      Summary:

      The paper presents a novel contractile gut organoid system that allows for in vitro study of rudimentary peristaltic motions in embryonic tissues by facilitating GCaMP-live imaging of Ca2+ dynamics, while highlighting the importance and sufficiency of ICC and SMC interactions in generating consistent contractions reminiscent of peristalsis. It also argues that the ENS at later embryonic stages might not be necessary for coordination of peristalsis.

      Strengths:

      The manuscript by Yagasaki, Takahashi, and colleagues represents an exciting new addition to the toolkit available for studying fundamental questions in the development and physiology of the hindgut. The authors carefully lay out the protocol for generating contractile gut organoids from chick embryonic hindgut, and perform a series of experiments that illustrate the broader utility of these organoids for studying the gut. This reviewer is highly supportive of the manuscript, with only minor requests to improve confidence in the findings and broader impact of the work. These are detailed below.

      Weaknesses:

      (1) Given that the literature is conflicting on the role of GAP junctions in potentiating communication between interstitial cells of Cajal (ICCs) and smooth muscle cells (SMCs), the experiments involving CBX and 18Beta-GA are well-justified. However, because neither treatment altered contractile frequency or synchronization of Ca++ transients, it would be important to demonstrate that the treatments did indeed inhibit GAP junction function as administered. This would strengthen the conclusion that GAP junctions are not required, and eliminate the alternative explanation that the treatments themselves failed to block GAP junction activity.

      (2) Given that 5uM blebbistatin increases the frequency of contractions but 10uM completely abolishes contractions, confirming that cell viability is not compromised at the higher concentration would build confidence that the phenotype results from inhibition of myosin activity. One could either assay for cell death, or perform washout experiments to test for recovery of cyclic contractions upon removal of blebbistatin. The latter may provide access to other interesting questions as well. For example, do organoids retain memory of their prior setpoint or arrive at a new firing frequency after washout?

      (3) Regulation of contractile activity was attributed to ICCs, with authors reasoning that Tuj1+ enteric neurons were only present in organoids in very small numbers (~1%). However, neuronal function is not strictly dependent on abundance, and some experimental support for the relative importance of ICCs over Tuj1+ cells would strengthen a central assumption of the work that ICCs are the predominant cell type regulating organoid contraction. For example, one could envision forming organoids from embryos in which neural crest cells have been ablated via microdissection or targeted electroporation. Another approach would be ablation of Tuj1+ cells from the formed organoids via tetrodotoxin treatment. The ability of organoids to maintain rhythmic contractile activity in the total absence of Tuj1+ cells would add confidence that the ICCs are indeed the driver of contractility in these organoids.

      (4) Given the implications of a time lag between Ca++ peaks in ICCs and SMCs, it would be important to quantify this, including standard deviations, rather than showing representative plots from a single sample.

      (5) To validate the organoid as a faithful recreation of in vivo conditions, it would be helpful for authors to test some of the more exciting findings on explanted hindgut tissue. One could explant hindguts and test whether blebbistatin treatment silences peristaltic contractions as it does in organoids, or following RCAS-GCAMP infection at earlier stages, one could test the effects of GAP junction inhibitors on Ca++ transients in explanted hindguts. These would potentially serve as useful validation for the gut contractile organoid, and further emphasize the utility of studying these simplified systems for understanding more complex phenomena in vivo.

      (6) Organoid fusion experiments are very interesting. It appears that immediately after fusion, the contraction frequency is markedly reduced. Authors should comment on this, and how it changes over time following fusion. Further, is there a relationship between aggregate size and contractile frequency? There are many interesting points that could be discussed here, even if experimental investigation of these points is left to future work.

      (7) Minor: As seen in Movie 6 and Figure 6A, 5uM blebbistatin causes a remarkable increase in the frequency of contractions. Given the regular periodicity of these contractions, it is a surprising and potentially interesting finding, but authors do not comment on it. It would be helpful to note this disparity between 5 and 10 uM treatments, if not to speculate on what it means, even if it is beyond the scope of the present study to understand this further.

      (8) Minor: While ENS cells are limited in the organoid, it would be helpful to quantify the number of SMCs for comparison in Supplemental Figure S2. In several images, the number of SMCs appears quite limited as well, and the comparison would lend context and a point of reference for the data presented in Figure S2B.

      (9) Minor: additional details in the Figure 8 legend would improve interpretation of these results. For example, what is indicated in orange signal present in panels C, G and H? Is this GCAMP?

    5. Author response:

      Generals:

      We deeply appreciate the efforts by the Senior and Reviewing Editors, and also thank the three reviewers for their careful reading of the MS and their constructive comments, which are very helpful to improve our MS. We agree that we should extend our efforts to elaborate the pharmacological analyses, including clarification of the penetrance of GAP junction inhibitor(s), and the effectiveness and specificity of the drugs. We plan to test at least the L-type calcium channel blocker nifedipine. Concerning the reproducibility of the phenotypes, we indeed repeated experiments multiple times for each of the analyses. While in the current version we showed a series of representative data for simplicity, along with an explanation in the text that we conducted the experiments multiple times, in a revised version we will improve the presentation so that readers/reviewers can be convinced of the reproducibility of the data. We will also try to test other markers to look into the cell types constituting the gut contractile organoid.

      Specifics:

      Our provisional responses to “The weakness” raised by the reviewers are as follows:

      Reviewer #1:

      Please see the responses shown above (“Generals”).

      Reviewer #2:

      In addition to the responses in “Generals”, our response also includes the following: We will look into wavelength between contractions/rhythm of the organoid. We agree that our organoids derived from embryonic hindgut (E15) might not necessarily recapitulate the cell function in the adult. However, it is well accepted in the field of developmental biology that studies with embryonic tissue/cells make a huge contribution to unveiling how complicated physiological cell function is underpinned. Nevertheless, we will take care in the revised version that the MS does not send misleading messages. Recent advances have also shown that 3D organoids can somehow “replace/substitute for” a complicated in vivo specimen when a particular cellular function is a focus of study.

      Reviewer #3:

      We appreciate a strong support of our findings.

      (1) We plan to perform positive control experiments, for example, to test if the drugs we use would interfere with cardiac muscle functions.

      (2) We plan to perform a washout experiment to confirm that 10uM blebbistatin does not kill the cells. Thank you for this suggestion.

      (3) We plan to conduct tetrodotoxin treatment. Since experiments with such toxic reagents are not encouraged by our institute, we will perform experiments with a necessary-minimum amount.

      (4) We plan to address this point properly.

      (5) It is readily predictable that blebbistatin would stop the gut movement in an explanted hindgut, and it is also well established that gut contractions (movements) are concomitant with Ca2+ transients. It would indeed be interesting to see how GJ inhibitors affect such in vivo gut movement. However, as all the reviewers and the Reviewing Editor pointed out, sensitivity (concentration) and penetrance of the drug is an important point of concern, so we think that the in vivo analyses will be the next step in the near future.

      (6) We have indeed noticed that contraction frequency is reduced after organoidal fusion. It seems as if cells communicate with each other to decide which rhythm they need to be adjusted to. Furthermore, contraction frequency tends to slow down when the organoid becomes larger in size. This might be attributed to a delay in conductance between cells over growing distance. We plan to either quantify these potentially interesting phenomena or make a concise speculation in the revised version.

      (7)-(10) Thank you for these comments. We will fix them.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      The goal of Knudsen-Palmer et al. was to define a biological set of rules that dictate the differential RNAi-mediated silencing of distinct target genes, motivated by facilitating the long-term development of effective RNAi-based drugs/therapeutics. To achieve this, the authors use a combination of computational modeling and RNAi function assays to reveal several criteria for effective RNAi-mediated silencing. This work provides insights into how (1) cis-regulatory elements influence the RNAi-mediated regulation of genes; (2) genes can "recover" from RNAi-silencing signals in an animal; and (3) pUGylation occurs exclusively downstream of the dsRNA trigger sequence, suggesting 3º siRNAs are not produced. In addition, the authors show that the speed at which RNAi-silencing is triggered does not correlate with the longevity of the silencing. These insights are significant because they suggest that if we understand the rules by which RNAi pathways effectively silence genes with different transcription/processing levels then we can design more effective synthetic RNAi-based therapeutics targeting endogenous genes. The conclusions of this study are mostly supported by the data, but there are some aspects that need to be clarified.

      We thank the reviewer for their kind words and for appreciating the practical utility of our approach and discoveries. 

      (1) The methods do not describe the "aged RNAi plates feeding assay" in Figure 2E. The figure legend states that "aged RNAi plates" were used to trigger weaker RNAi, but the detail explaining the experiment is insufficient. How aged is aged? If the goal was to effectively reduce the dsRNA load available to the animals, why not quantitatively titrate the dsRNA provided? Were worms previously fed on the plates, or was simply a lawn of bacteria grown until presumably the IPTG on the plate was exhausted?

      We have elaborated our methods section to describe that the plates were left at 4ºC for about 4 months before adding bacteria and performing the assay, with one possible reason for the weaker knockdown being that perhaps the IPTG in the RNAi plates is less effective. However, it is worth noting that the robustness of a feeding RNAi assay can vary from culture to culture and/or batch of plates. We therefore always perform RNAi assays with wild-type animals alongside test strains to gauge the strength of the RNAi assay for a given culture and batch of plates. We called the data in Figure 2E “weak” because the response of wild-type animals was weak, as evidenced by weak twitching in levamisole. Despite this reduced effect, we observed 100% penetrance in wild-type animals, enabling us to sensitively detect the reduced responses of the mutants.

      (2) Is the data presented in Figure 2F completed using the "aged RNAi plates" to achieve the partial silencing of dpy-7 observed? Clarification of this point would be helpful.

      No. The only occasion when plates were older was as in response to comment 1 above.

      (3) Throughout the manuscript the authors refer to "non-dividing cells" when discussing animals' ability to recover from RNA silencing. It is not clear what the authors specifically mean with the phrase "non-dividing cells", but as this is referred to in one of their major findings, it should be clarified. Do they mean the cells are somatic cells in aged animals, thus if they are "non-dividing" the siRNA pools within the cells cannot be diluted by cell division? Based on the methods, the animals of RNAi assays were L4/Young adults that were scored over 8 days after the initial pulse of dsRNA feeding. If this is the case, wouldn't these animals be growing into gravid adults after the feeding, and thus have dividing cells as they grew?

      We thank the reviewer for highlighting the need to explain this point further. Our experiments test the silencing of the unc-22 gene, which is expressed and functions in body-wall muscle cells. Most of the body wall muscles in C. elegans are developed by the L1 stage (reviewed in Krause and Liu, 2012), and they do not divide between the L4 and adult stages. Therefore, during the duration of the experiment where we delivered a pulse of dsRNA and examined responses over days, none of these cells divide. We have added a statement in the main text to explicitly say that the recovery from silencing by dsRNA that we observed cannot be explained by dilution during cell divisions.

      (4) What are the typical expression levels/turnover of unc-22 and bli-1? Based on the results from the altered cis-regulatory regions of bli-1 and unc-22 in Figure 5, it seems like the transcription/turnover rates of each of these genes could also be used as a proof of principle for testing the model proposed in Figure 4. The strength of the model would be further increased if the RNAi sensitivity of unc-22 reflects differences in its transcription/turnover rates compared to bli-1.

      We can get a sense of the relative abundances of unc-22 and bli-1 across development from the RNA-seq experiments that have been performed by others in the field (see below). However, these data cannot be used to infer either the production or the turnover rates. Future experiments that measure production (the combined rate of transcriptional run-on, splicing, export from the nucleus, etc.) will be required to define the production rates. Similarly, assays that detect the rate of degradation of transcripts without confounding presence from continued production will be needed to establish turnover rates. Future efforts to obtain values for these in vivo rates for multiple genes will help further test the model.

      Author response image 1.

      Expression data for unc-22:

      Author response image 2.

      Expression data for bli-1:

      Reviewer #2 (Public Review):

      Summary:

      This manuscript by Knudsen-Palmer et al. describes and models the contribution of MUT-16 and RDE-10 in the silencing through RNAi by the Argonaute protein NRDE-3 or others. The authors show that MUT-16 and RDE-10 constitute an intersecting network that can be redundant or not depending on the gene being targeted by RNAi. In addition, the authors provide evidence that increasing dsRNA processing can compensate for NRDE-3 mutants. Overall, the authors provide convincing evidence to understand the factors involved in RNAi in C. elegans by using a genetic approach.

      Major Strengths:

      The authors' work presents a compelling case for understanding the intricacies of RNA interference (RNAi) within the model organism Caenorhabditis elegans through a meticulous genetic approach. By harnessing genetic manipulation, they delve into the role of MUT-16 and RDE-10 in RNAi, offering a nuanced understanding of the molecular mechanisms at play in two independent case study targets (unc-22 and bli-1).

      We thank the reviewer for their kind words and for appreciating our genetic analysis.

      Major Weaknesses:

      (1) It is unclear how the molecular mechanisms of amplification are different under the MUT-16 and RDE-10 branches of the regulatory pathway, since they are clearly distinct proteins structurally. It would be interesting to do some small-RNA-seq of products generated from unc-22 and bli-1, under wild-type conditions and in some of the mutants studied (e.g. mut-16, rde-10 and mut-16 + rde-10). That would provide some insights into whether the products of the 2 amplifications are the same in all conditions, just changing in abundance, or whether they are distinct in sequence patterns.

      As we highlight in the paper, MUT-16 and RDE-10 are indeed very different proteins. One possible hypothesis suggested by this difference is that different kinds of small RNAs are made when the underlying mechanism relies on MUT-16 versus on RDE-10. However, postulating such a difference is not necessary for explaining the data. Furthermore, since the amounts of 2º siRNAs do not have to be correlated with the strength of silencing (Figure 4E), this work raises caution against the over-reliance on small RNA sequencing for inferring gene silencing. Nevertheless, it is indeed an attractive possibility that the amounts of small RNA, their distributions along mRNA sequence, and/or the sequence biases of the accumulating small RNAs could be different when relying on MUT-16- or RDE-10-dependent mechanisms. Future work that directly examines the small RNAs that accumulate in different mutant strains after initiating RNAi can shed light on these possibilities.

      (2) In the same line, Figure 5 aims to provide insights into the sequence determinants that influence the RNAi of bli-1. It is unclear whether the changes in transcript stability dictated by the 3'UTR are the sole factor governing the preference for the MUT-16 and RDE-10 branches of the regulatory pathway. In line with the mutant jam297, it might be interesting to test whether factors like codon optimality, splicing, ... of the ORF region upstream from bli-1-dsRNA can affect its sensitivity to the MUT-16 and RDE-10 branches of the regulatory pathway.

      In Figure 5, we eliminated the possibility that any gene that is transcribed using the bli-1 promoter would require NRDE-3, and showed using jam297 that modifications to the 3’ cis regulatory regions of a target can alter the dependence on NRDE-3 for knockdown. We agree that future experiments that control individual aspects of bli-1, potentially one feature at a time, can reveal the separate contributions of each characteristic of the gene to the observed dependence on NRDE-3 of the wild-type bli-1 gene. However, given the many ways that the same level of transcript knockdown can be achieved in our modeling (Figure 4 and its supplemental figures) we expect that multiple characteristics could contribute to NRDE-3 dependence. 

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) On page 5, the authors state that "MUT-16 and RDE-10 are redundantly or additively required for silencing unc-22"; however, based on their data in Figure 1D, it seems nearly 100% silencing of unc-22 is achieved in single mut-16 or rde-10 mutants. If this is the case, wouldn't it suggest redundancy of MUT-16 and RDE-10, and not an "additive effect" of MUT-16 and RDE-10 function? Although, as the mutator complex nucleates around MUT-16, the data in Figure 1D suggests it is possible that the presence of MUT-16 or RDE-10 is sufficient for the recruitment of one or more factors that triggers the silencing of unc-22, and thus only one of these factors is necessary.

      Because we are seeing 100% silencing in wild-type, mut-16(-), or rde-10(-) animals in Figure 1D, this assay (where the silencing response is strong) does not allow us to discriminate between differing levels of silencing. The “weak” RNAi assay in Figure 2E provides the opportunity to observe differences in the contributions made by MUT-16 or RDE-10, supporting the idea that the 2º siRNAs and relative contributions to silencing can indeed be additive, explaining the complete loss of silencing only in the double mutant. While MUT-16 has been shown to be required for the recruitment of other Mutators in the germline, Mutator foci are not detectable in the soma. Given that unc-22 and bli-1 are somatic targets, we are hesitant to assume a mechanism for the production of small RNAs that requires a similar MUT-16-dependent nucleation in somatic cells. MUT-16 is clearly required for full silencing, but whether it functions similarly in the soma and the germline remains an open question. Indeed, the mechanism(s) for producing small RNAs in somatic cells could be different from that used for production of small RNAs in the germline because of known differences in the use of RNA-dependent RNA polymerases (e.g. Ravikumar et al., Nucleic Acids Res. 2019). Future studies that determine the subcellular localization(s) and potential biochemical function(s) of RDE-10 and MUT-16 in somatic cells are needed to further delineate mechanisms.

      (2) On page 10, "rather than one that looks a frequency" - the "a" should be "at".

      We thank the reviewer and have fixed this typo. 

      (3) Figure 4 is very crowded, further dividing 4A (right) and 4B into subpanels would help the readability of the figure.

      We thank the reviewer for identifying these figures as being particularly crowded. These panels are presented as single units because the left and right portions of each panel are intimately connected. In Fig. 4A, the outline of mechanism deduced on the left is based on experiments at various scales shown on the right. We have now clarified this in the figure legend. In Fig. 4B, the equations on the right define and use the constants depicted on the left and the definitions below apply to both parts. We have now adjusted both figure parts to make these connections clearer. 

      (4) References to the subpanels of Figure 4 in the text on page 12 are off from the figure and figure legend.

      For example:

      "Overall, τkd and tkd were uncorrelated..." refers to 4C when it should refer to 4D. "However, the maximal amount of 2ºsiRNAs..." refers to 4D when it should refer to 4E. "Additionally, an increase in transcription..." refers to 4E when it should refer to 4F.

      "When a fixed amount of dsRNA was exposed..." refers to 4F when it should refer to 4G.

      We thank the reviewer for catching these errors and we have corrected these figure references.

      Reviewer #2 (Recommendations For The Authors):

      I would encourage the authors to follow up on some of the more mechanistic comments made above, that would strengthen and complement the genetic part of the work presented.

      We agree that additional work is needed to elucidate differences in molecular mechanisms for amplifying small RNAs in a MUT-16-dependent vs. RDE-10-dependent manner. We hope to address these extensions of our work in future manuscripts that focus on the biochemistry of these proteins and the populations of small RNAs generated using them.

      I appreciate the efforts to computationally model the dynamics of the system, but I am not sure that it helps that the mathematical modelling treats both branches of the pathway as functionally equal, since they could have some mechanistic specialisation that is not yet elucidated by the current work.

      Our assumption that both branches are equivalent is the most parsimonious. If we allowed for differences, even more values for the parameters of the model will agree with experimental data. The strength of the model is that despite such conservative assumptions, it agrees with experimental data. Biochemical elaborations that make the MUT-16 and RDE-10 branches qualitatively different could exist in vivo as suggested by the reviewer. Even with such qualitative differences in detail, the overall impact on gene silencing is a quantitative and additive one as demonstrated by our experiments. Future experimental work focused on biochemistry could elucidate how a Maelstrom domain-containing protein (RDE-10) and an intrinsically disordered protein (MUT-16) act differently to ultimately promote small RNA production.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      TMC7 knockout mice were generated by the authors and the phenotype was analyzed. They found that Tmc7 is localized to Golgi and is needed for acrosome biogenesis.

      Strengths:

      The phenotype of infertility is clear, and the results of TMC7 localization and the failed acrosome formation are highly reliable. In this respect, they made a significant discovery regarding spermatogenesis.

      In the original version, I pointed out the gap between their pH/calcium imaging data and the hypothesis of ion channel function of TMC7 in the Golgi. Now the author agrees and has changed the description to be reasonable. Additional experiments were also performed, and I can say that they have answered my concern adequately.

      I would say it is good to add any presumed mechanism for the observed changes in pH and calcium concentration in the cytoplasm this time.

      We appreciate your positive comments on our revised manuscript.

      Reviewer #2 (Public Review):

      Summary:

      This study presents a significant finding that enhances our understanding of spermatogenesis. TMC7 belongs to a family of transmembrane channel-like proteins (TMC1-8), primarily known for their role in the ear. Mutations to TMC1/2 are linked to deafness in humans and mice and were originally characterized as auditory mechanosensitive ion channels. However, the function of the other TMC family members remains poorly characterized. In this study, the authors begin to elucidate the function of TMC7 in acrosome biogenesis during spermatogenesis. Through analysis of transcriptomics datasets, they identify TMC7 as a transmembrane channel-like protein with elevated transcript levels in round spermatids in both mouse and human testis. They then generate Tmc7-/- mice and find that male mice exhibit smaller testes and complete infertility. Examination of different developmental stages reveals spermatogenesis defects, including reduced sperm count, elongated spermatids, and large vacuoles. Additionally, abnormal acrosome morphology is observed beginning at the early-stage Golgi phase, indicating TMC7's involvement in proacrosomal vesicle trafficking and fusion. They observed localization of TMC7 in the cis-Golgi and suggest that its presence is required for maintaining Golgi integrity, with Tmc7-/- leading to reduced intracellular Ca2+, elevated pH, and increased ROS levels, likely resulting in spermatid apoptosis. Overall, the work delineates a new function of TMC7 in spermatogenesis and the authors suggest that its ion channel activity is likely important for Golgi homeostasis. This work is of significant interest to the community and is of high quality.

      Strengths:

      The biggest strength of the paper is the phenotypic characterization of the TMC7-/- mouse model, which has clear acrosome biogenesis/spermatogenesis defects. This is the main claim of the paper and it is supported by the data that are presented.

      Weaknesses:

      The claim is that TMC7 functions as an ion channel. It is reasonable to assume this given what has been previously published on the more well-characterized TMCs (TMC1/2), but the data supporting this is preliminary here, and more needs to be done to solidify this hypothesis. The authors are careful in their interpretation and present this merely as a hypothesis supporting this idea.

      We appreciate this constructive suggestion.

      Reviewer #3 (Public Review):

      Summary:

      In this study, Wang et al. have demonstrated that TMC7, a testis-enriched multipass transmembrane protein, is essential for male reproduction in mice. Tmc7 KO male mice are sterile due to reduced sperm count and abnormal sperm morphology. TMC7 co-localizes with GM130, a cis-Golgi marker, in round spermatids. The absence of TMC7 results in reduced levels of Golgi proteins, elevated abundance of ER stress markers, as well as changes of Ca2+ and pH levels in the KO testis. However, further confirmation is required because the analyses were performed with whole testis samples in spite of the differences in the germ cell composition in WT and KO testis. In addition, the causal relationships between the reported anomalies await thorough interrogation.

      Strengths:

      By using PD21 testes, the revised assays have consolidated that depletion of TMC7 leads to a reduced level of Ca2+ and an elevated level of ROS in the male germ cells. The immunohistochemistry analyses have clearly indicated the reduced abundance of GM130, P115, and GRASP65 in the knockout testis.

      Weaknesses:

      The Discussion section contains sentences reiterating the Introduction and Results of this manuscript (e.g., Lines 79-85 and 231-236; Lines 175-179 and 259-263). These read as repetitive and can be removed.

      We thank the reviewer for this important comment. We have modified the text according to your suggestion.

      Future studies are required to decipher how TMC7 stabilizes Golgi structure, coordinates vesicle transport, and maintains the germ cell homeostasis.

      Thanks. We appreciate this constructive suggestion. We fully agree with the reviewer that future studies are required to decipher how TMC7 stabilizes Golgi structure, coordinates vesicle transport, and maintains the germ cell homeostasis.

      Recommendations for the authors

      Reviewer #1 (Recommendations For The Authors):

      1. In Fig S6d, the bar of Tmc7-/- is broken in the middle for P-EIF2.

      Thanks. We have remade Fig S6d according to your suggestion in the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      None. The reviewers have adequately answered my points. Many thanks!

      We thank the reviewer for accepting our revisions as sufficient.

      Reviewer #3 (Recommendations For The Authors):

      In the revised manuscript, the authors have addressed most of my concerns.

      We are pleased that we were able to adequately address the reviewer’s concerns. We appreciate your suggestions to further improve our study.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1

      Summary:

      In this paper, the authors performed molecular dynamics (MD) simulations to investigate the molecular basis of the association of alpha-synuclein chains under molecular crowding and salt conditions. Aggregation of alpha-synuclein is linked to the pathogenesis of Parkinson's disease, and the liquid-liquid phase separation (LLPS) is considered to play an important role in the nucleation step of the alpha-synuclein aggregation. This paper re-tuned the Martini3 coarse-grained force field parameters, which allows long-timescale MD simulations of intrinsically disordered proteins with explicit solvent under diverse environmental perturbation. Their MD simulations showed that alpha-synuclein does not have a high LLPS-forming propensity, but the molecular crowding and salt addition tend to enhance the tendency of droplet formation and therefore modulate the alpha-synuclein aggregation. The MD simulation results also revealed important intra- and inter-molecule conformational features of the alpha-synuclein chains in the formed droplets and the key interactions responsible for the stability of the droplets. These MD simulation data add biophysical insights into the molecular mechanism underlying the association of alpha-synuclein chains, which is important for understanding the pathogenesis of Parkinson's disease.

      Strengths:

      (1) The re-parameterized Martini 3 coarse-grained force field enables the large-scale MD simulations of the intrinsically disordered proteins with explicit solvent, which will be useful for a more realistic description of the molecular basis of LLPS.

      (2) This paper showed that molecular crowding and salt contribute to the modulation of the LLPS through different means. The molecular crowding minimally affects surface tension, but adding salt increases surface tension. It is also interesting to show that the aggregation pathway involves the disruption of the intra-chain interactions arising from C-terminal regions, which potentially facilitates the formation of inter-chain interactions.

      We thank the reviewer for pointing out the strengths of our study.

      Weaknesses:

      (1) Although the authors emphasized the advantage of the Martini3 force field for its explicit description of solvent, the whole paper did not discuss the water's role in the aggregation and LLPS.

      We thank the reviewer for pointing this out. We agree that we have not explored or discussed the role of water in aS aggregation or LLPS. We would like to convey that we would like to explore that in detail in a separate study altogether. However we have updated the “Discussion” section with the following lines to convey to the readers the importance water plays in aggregation and LLPS of aS.

      Page 24: “The significance of the solvent in alpha-synuclein (αS) aggregation remains underexplored. Recent studies [26, 55] underscore the pivotal role of water as a solvent in LLPS. It suggests that comprehending the solvent’s role, particularly water, is essential for attaining a deeper grasp of the thermodynamic and physical aspects of αS LLPS and aggregation. By delving into the solvent’s contribution, researchers can uncover additional factors influencing αS aggregation. Such insights hold the potential to advance our comprehension of protein aggregation phenomena, crucial for devising strategies to address diseases linked to protein misfolding and aggregation, notably Parkinson’s disease. Future investigations focusing on elucidating the interplay between αS, solvent (especially water), and other environmental elements could yield valuable insights into the mechanisms underlying LLPS and aggregation. Ultimately, this could aid in the development of therapeutic interventions or preventive measures for Parkinson’s and related diseases.”

      (2) This paper discussed the effects of crowders and salt on the surface tension of the droplets.

      The calculation of the surface tension relies on the droplet shape. However, for the formed clusters in the MD simulations, the typical size is <10, which may be too small to rigorously define the droplet shape. As shown in previous work cited by this paper [Benayad et al., J. Chem. Theory Comput. 2021, 17, 525−537], the calculated surface tension becomes stable when the chain number is larger than 100.

      We appreciate the insightful feedback from the reviewer. However, we would like to emphasize that the αS droplets exhibit a highly liquid-like behavior, characterized by frequent exchanges of chains between the dense and dilute phases, alongside a slow aggregation process. In the study by Benayad et al. (2020, JCTC) [ref. 30], FUS-LCD was the protein of choice at concentrations in the mM range. FUS-LCD is known to undergo very rapid LLPS at concentrations lower than 100 μM, whereas for αS the critical concentration for LLPS is 500 μM and αS undergoes slower aggregation than FUS. Moreover, the diffusion constant of αS inside newly formed droplets (where no liquid-to-solid phase transition has occurred) has been estimated to be 0.23-0.58 μm²/s (Ray et al., 2020, Nat. Comm.). The diffusion constant for FUS-LCD inside LLPS droplets has been estimated to be 0.17 μm²/s (Murthy et al., 2023, Nat. Struct. and Mol. Biol.). These values indicate that αS forms droplets that are less viscous than those formed by FUS-LCD. This dynamic nature impedes the formation of large droplets in the simulations, making it challenging to rigorously calculate surface tension from interfacial width, which, in turn, necessitates the computation of g(r) between water and the droplet.

      Furthermore, it's essential to note that our primary aim in calculating surface tension was not to determine its absolute value. Rather, we aimed to compare surface tensions obtained for the three distinct environments explored in this study. Hence, our primary objective is to compare the distributions of surface tensions rather than focusing solely on the mean values obtained. The distributions shown in Figure 4a clearly show a trend which we have stated in the article.

      (3) In this work, the Martini 3 force field was modified by rescaling the LJ parameters \epsilon and \sigma with a common factor \lambda. It has not been very clearly described in the manuscript why these two different parameters can be rescaled by a common factor and why it is necessary to separately tune these two parameters, instead of just tuning the coefficient \epsilon as did in a previous work [Larsen et al., PLoS Comput Biol 16: e1007870].

      We thank the reviewer for the comment. We think that the distance of the first hydration layer should also have an impact on aggregation/LLPS. Here we are scaling both epsilon and sigma. A higher epsilon for water-protein interactions means that more energy is required to remove water molecules (dehydration) when a chain goes from the dilute to the dense phase. A higher sigma, on the other hand, means that the hydration shell will be at a larger distance, making dehydration easier. Moreover, tuning both (whether by the same or by different scaling parameters) required a change of the overall protein-water interaction by only 1%, and therefore only a minimal change in force field parameters (compared to the case where only epsilon is tuned, which required a 6-10% change in epsilon from its original values). Thus we think that one way of tuning water-protein interactions that requires minimal retuning of Martini 3 is to optimize both epsilon and sigma. However, whether a single scaling parameter is good enough requires further exploration and is outside the scope of the current study. More importantly, using separate scaling parameters would introduce another free parameter into the system, and the fewer the free parameters, the better. For this study, a single parameter sufficed, as depicted in Figure 9. To inform the readers of why we chose to scale both sigma and epsilon, we have added the following in the main text:

      Page 25-26: “Increasing the ϵ value of water-protein interactions results in a higher energy demand for removing water molecules (dehydration) as a chain transitions from the dilute to the dense phase. Conversely, a higher σ value implies that the hydration shell will be at a greater distance, facilitating dehydration if a chain moves into the dilute phase. Therefore, adjusting water-protein interactions based on the protein’s single-chain behavior may not significantly influence the protein’s phase behavior. Furthermore, fine-tuning both ϵ and σ parameters only requires a minimal change in the overall protein-water interaction (1%). As a result, this adjustment minimally alters the force field parameters.”
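
      The effect of a common scaling factor on a 12-6 Lennard-Jones pair can be sketched as follows. This is only an illustration under stated assumptions: the way λ enters (multiplying both ϵ and σ), the value λ = 1.01, and the starting parameters are hypothetical and are not taken from the manuscript or from the Martini 3 parameter files.

```python
import math


def lj_energy(r, epsilon, sigma):
    """12-6 Lennard-Jones energy at separation r (same length unit as sigma)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def rescale(epsilon, sigma, lam):
    """Scale both parameters by one common factor (assumed form of the rescaling)."""
    return lam * epsilon, lam * sigma


def r_min(sigma):
    """Separation at which the 12-6 potential reaches its minimum of -epsilon."""
    return 2.0 ** (1.0 / 6.0) * sigma


# Illustrative (hypothetical) starting values and scaling factor.
eps0, sig0 = 1.0, 0.47
eps1, sig1 = rescale(eps0, sig0, 1.01)

well_depth_before = -lj_energy(r_min(sig0), eps0, sig0)
well_depth_after = -lj_energy(r_min(sig1), eps1, sig1)
```

      In this sketch the well depth and the position of the minimum both grow by the same factor λ, which is the sense in which one parameter controls both the dehydration energy and the distance of the hydration shell.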

      (4) Both the sizes and volume fractions of the crowders can affect the protein association. It will be interesting to perform MD simulations by adding crowders with various sizes and volume fractions. In addition, in this work, the crowders were modelled by fullerenes, which contribute to protein aggregation mainly by entropic means as discussed in the manuscript. It is not very clear how the crowder effect is sensitive to the chemical nature of the crowders (e.g., inert crowders with excluded volume effect or crowders with non-specific attractive interactions with proteins, etc) and therefore the force field parameters.

      We thank the reviewer for suggesting a potential future direction. In this investigation our main focus was to simulate inert crowders only, to ensure that only the entropic effect of the crowders is explored. Although this study focuses on the factors that enable αS to form aggregates/LLPS under different environmental conditions, it would be interesting to explore in a systematic way the mechanism of action of crowders of varying shapes, sizes and interactions. Therefore we added the following lines in the “Discussion” section to let the readers know that this is also a future prospect of investigation.

      Page 22: “Under physiological conditions, crowding effects emerge prominently. While crowders are commonly perceived to be inert, as has been considered in this investigation, the morphology, dimensions, and chemical interactions of crowding agents with αS in both dilute and dense phases may potentially exert considerable influence on its LLPS. Hence, a comprehensive understanding through systematic exploration is another avenue that warrants extensive investigation.”

      Reviewer #1 (Recommendations For The Authors):

      (1) Figure S1. The title of the figure and the description in the figure caption are inconsistent?

      We thank the reviewer for the comment and we have updated the article with the correct caption.

      (2) Page 14, line 3, the authors may want to provide more descriptions of the "ms1", "ms2", and "ms3" for better understanding.

      We are grateful to the reviewer for pointing this out. We have added a line describing in brief what “ms1”, “ms2” and “ms3” represent. It reads “Subsequent to the investigation, we utilize three representative conformations, each corresponding to one of the macrostates. We designate these macrostates as 1 (ms1), 2 (ms2), and 3 (ms3) (Figure S7)” (Page 28)

      (3) Page 20, the authors may want to briefly explain how the normalized Shannon entropy was calculated.

      We thank the reviewer for pointing this out. This is plain Shannon Entropy and the word “normalized” should not have been there. To avoid confusion we have provided the equation we have used to calculate the Shannon entropy (Eq 8) (Page 21).
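
      For readers unfamiliar with the quantity, plain Shannon entropy over a discrete distribution can be sketched as below. This is a generic illustration, not a reproduction of Eq 8: the use of the natural logarithm and of raw occupancy counts as input are assumptions.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy H = -sum(p_i * ln p_i) from non-negative state counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    entropy = 0.0
    for count in counts:
        if count > 0:  # empty states contribute nothing (p * ln p -> 0)
            p = count / total
            entropy -= p * math.log(p)
    return entropy
```

      A uniform distribution over N states gives ln N, the maximum possible value, while a distribution concentrated in a single state gives zero.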

      Reviewer #2 (Public Review):

      In the manuscript "Modulation of α-Synuclein Aggregation Amid Diverse Environmental Perturbation", Wasim et al describe coarse-grained molecular dynamics (cgMD) simulations of α-Synuclein (αS) at several concentrations and in the presence of molecular crowding agents or high salt. They begin by bench-marking their cgMD against all-atom simulations by Shaw. They then carry 2.4-4.3 µs cgMD simulations under the above-noted conditions and analyze the data in terms of protein structure, interaction network analysis, and extrapolated fluid mechanics properties. This is an interesting study because a molecular scale understanding of protein droplets is currently lacking, but I have a number of concerns about how it is currently executed and presented.

      We thank the reviewer for finding our study interesting.

      (1) It is not clear whether the simulations have reached a steady state. If they have not, it invalidates many of their analysis methods and conclusions.

      We have used the last 1 μs (1.5-2.5 μs) from each simulation for further analysis in this study. To understand whether the simulations have reached steady state or not, we plot the time profile of the concentration of the protein in the dilute phase for all three cases.

      Author response image 1.

      Except for the scenario of only αS (Figures a and b), the rest show very steady concentrations across various sections of the trajectory (Figures c-f). The larger sudden fluctuations observed in Figures a and b arise because αS alone undergoes very slow spontaneous aggregation and because the dense phase itself is very fluxional: the transfer of a few chains between the dense and dilute phases registers as a large fluctuation in the protein concentration in the dilute phase. For the other two scenarios (Figures c-f) aggregation has been accelerated due to the presence of crowders/salt. This causes larger aggregates to be formed. Therefore addition/removal of one or two chains does not significantly affect the concentration and we do not see such sudden large jumps. In summary, the large jumps seen in Figures a and b are due to slow, fluxional aggregation of pure αS and finite size effects. However, as these are still only fluctuations, we posit that the systems have reached steady states. This claim is further supported by the following figure, where the time profiles of a few useful system-wide macroscopic properties show no change between 1.5-2.5 µs.

      We also have added a brief discussion in the Methods section (Page 29-30) with these figures in the Supplementary Information.

      Author response image 2.

      “In this study, we utilized the final 1 µs from each simulation for further analysis. To ascertain whether the simulations have achieved a steady state, we plotted the time profile of protein concentration in the dilute phase for all three cases. Except for minor intermittent fluctuation involving only αS in neat water (Figures S8a and S8b), the remaining cases exhibit notably stable concentrations throughout various segments of the trajectory (Figures S8 c-f). The relatively higher fluctuations observed in Figures S8a and b stem from the slow, spontaneous aggregation of αS alone, compounded by the inherently ambiguous nature of the dense phase.

      Consequently, the addition or removal of a few chains from the dense to the dilute phase results in significant fluctuations in protein concentration within the dilute phase. Conversely, in the other two scenarios (Figures S8c-f), aggregation is expedited by the presence of crowders/salt, leading to the formation of larger aggregates. Consequently, the addition or removal of one or two chains has negligible impact on concentration, thereby mitigating sudden large jumps. In summary, the conspicuous jumps depicted in Figures S8a and b arise from the gradual, fluctuating aggregation of pure αS and finite size effects. However, since these remain within the realm of fluctuations, we assert that the systems have indeed reached steady states. This assertion is bolstered by the subsequent figure, where the time profile of several pertinent system-wide macroscopic properties reveals no discernible change between 1.5-2.5 µs (Figures S9).”

      (2) The benchmarking used to validate their cgMD methods is very minimal and fails to utilize a large amount of available all-atom simulation and experimental data.

      We disagree with the reviewer on this point. We have cited multiple previous studies [26, 27] that have chosen Rg as the metric for benchmarking coarse-grained models and have used a reference (experimental or otherwise) to tune Martini force fields. Notable works in which Rg was used as a benchmark during the generation of new coarse-grained force fields include those by Dignon et al. (PLoS Comp. Biol.) [ref. 25], Regy et al. (Protein Science, 2021) [ref. 26], Joseph et al. (Nature Computational Science, 2021) [ref. 27] and Tesei et al. (Open Research Europe, 2022) [ref. 28]. From a polymer physics perspective, tuning water-protein interactions is simply changing the solvent characteristics for the biopolymer, and Rg has generally been considered a suitable metric for coarse-grained models. Moreover, we try to match the distribution of Rg rather than only its mean value. This suggests that, at the single-molecule level, the cgMD simulations at the optimum strength of water-protein interactions would allow the protein to sample the conformations present in the reference ensemble. We use the extensively sampled 70 μs all-atom data from DE Shaw Research to obtain the reference Rg distribution. We also perform a cross-validation by comparing the fraction of bound states in all-atom and cgMD dimer simulations, which also agree well with each other at optimum water-protein interactions. To let the readers understand the rationale behind choosing Rg, we have added a section in the Methods (Page 25) that explains why Rg is plausibly a good metric for tuning water-protein interactions in Martini 3, at least when dealing with IDPs.

      Our optimized model is further supported by the FRET experiments by Ray et al. [6]. They found that interchain NAC-NAC interactions drive LLPS. Residue level contact maps obtained from our simulations also show decreased intrachain NAC-NAC interactions with an increased interchain NAC-NAC interactions inside the droplet. This corroborates well with the experimental observations and furthermore validates the metrics we have used for optimization of the water-protein interactions. However the comparison with the FRET data by Ray et al. was not present earlier and we have added the following lines in the updated draft.

      Page 17: “Thus we observed that increased inter-chain NAC-NAC regions facilitate the formation of αS droplets which also have previously been seen from FRET experiments on αS LLPS droplets[6].”

      (3) They also miss opportunities to compare their simulations to experimental data on aSyn protein droplets.

      We thank the reviewer for pointing this out. We have tried to compare the results from our simulations to existing experimental FRET data on αS. Please see the previous response where we have described our comparison with FRET observations.

      (4) Aspects such as network analysis are not contextualized by comparison to other protein condensed phases.

      For a proper comparison between other protein condensed phases, we would require the position phase space of such condensates which is not readily available. Therefore we tried to explain it in a simpler manner to paint a picture of how αS forms an interconnecting network inside the droplet phase.

      (5) Data are not made available, which is an emerging standard in the field.

      We thank the reviewer for mentioning this. We have provided the trajectories between 1.5-2.5 μs, which we used for the analysis presented in the article, via a zenodo repository along with other relevant files related to the simulations (https://zenodo.org/records/10926368).

      Firstly, it is not clear that these systems are equilibrated or at a steady state (since protein droplets are not really equilibrium systems). The authors do not present any data showing time courses that indicate the system to be reaching a steady state. This is problematic for several of their data analysis procedures, but particularly in determining free energy of transfer between the condensed and dilute phases based on partitioning.

      We have addressed this concern as stated previously in the response. We have updated the article accordingly.
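
      The dependence of the transfer free energy on steady-state partitioning can be made explicit with a short sketch. It uses the standard relation ΔG = −RT ln(c_dense / c_dilute); whether the manuscript uses exactly this expression, and the default temperature of 300 K, are assumptions made here for illustration.

```python
import math

R_KJ_PER_MOL_K = 0.008314462618  # molar gas constant in kJ/(mol K)


def transfer_free_energy(c_dilute, c_dense, temperature=300.0):
    """Free energy (kJ/mol) of moving a chain from the dilute to the dense phase.

    Both concentrations must be in the same unit; only their ratio matters.
    """
    if c_dilute <= 0 or c_dense <= 0:
        raise ValueError("concentrations must be positive")
    return -R_KJ_PER_MOL_K * temperature * math.log(c_dense / c_dilute)
```

      Because only the ratio of the two concentrations enters, any drift in the dilute-phase concentration over the analysis window shifts the estimate directly, which is why the steady-state question matters for this quantity.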

      Secondly, the benchmarking that they perform against the 73 µs all-atom simulation of aSyn monomer by Shaw and coworkers provides only very crude validation of their cgMD models based on reproducing Rg for the monomer. The authors should make more extensive comparisons to the specific conformations observed in the DE Shaw work. Shaw makes the entire trajectory publicly available. There are also a wealth of experimental data that could be used for validation with more molecular detail. See for example, NMR and FRET data used to benchmark Monte Carlo simulations of aSyn monomer (as well as extensive comparisons to the Shaw MD trajectory) in Ferrie et al.: A Unified De Novo Approach for Predicting the Structures of Ordered and Disordered Proteins, J. Phys. Chem. B 124 5538-5548 (2020)

      DOI:10.1021/acs.jpcb.0c02924

      I note that NMR measurements of aSyn in liquid droplets are available from Vendruscolo: Observation of an α-synuclein liquid droplet state and its maturation into Lewy body-like assemblies, Journal of Molecular Cell Biology, Volume 13, Issue 4, April 2021, Pages 282-294, https://doi.org/10.1093/jmcb/mjaa075.

      In addition, there are FRET studies by Maji: Spectrally Resolved FRET Microscopy of α-Synuclein Phase-Separated Liquid Droplets, Methods Mol Biol 2023:2551:425-447. doi: 10.1007/978-1-0716-2597-2_27.

      So the authors are missing opportunities to better validate the simulations and place their structural understanding in greater context. This is just based on my own quick search, so I am sure that additional and possibly better experimental comparisons can be found.

      We have performed a comparison with existing FRET measurements by Ray et al. (2020), as discussed in a previous response, and have updated the article accordingly. The DOI provided by the reviewer (10.1007/978-1-0716-2597-2_27) is, however, for a book on methods to characterize protein aggregates and does not contain any information regarding observations from FRET experiments. The other DOI (https://doi.org/10.1093/jmcb/mjaa075), for the article from the Vendruscolo group, does not contain information directly relevant to this study. Moreover, NMR measurements cannot be predicted from cgMD since full atomic resolution is lost upon coarse-graining of the protein. A past literature survey by the authors found very little scientific literature on molecular-level characterization of αS LLPS droplets.

      Thirdly, the small-world network analysis is interesting, but hard to contextualize. For instance, the 8 Å cutoff used seems arbitrary. How does changing the cutoff affect the value of S determined? Also, how does the value of S compare to other condensed phases like crystal packing or amyloid forms of aSyn?

      The 8 Å cutoff is actually arbitrary since a distance based clustering always requires a cutoff which is empirically decided. However 8 Å is quite large compared to other cutoffs used for distance based clustering. For example in ref 26, 5 Å was used as a cutoff for calculation of protein clusters. Larger cutoffs will lead to sparser network structures. However we used the same cutoff for all distance based clustering which makes the networks obtained comparable. We wanted to perform a comparison among the networks formed by αS under different environmental conditions.
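
      A minimal sketch of distance-based clustering with a fixed cutoff is given below, to make concrete what the cutoff controls. It is not the authors' analysis code: the choice of points (for example bead positions or chain centres), the toy coordinates, and the function names are hypothetical, and periodic boundary conditions are ignored.

```python
import math


def contact_graph(points, cutoff):
    """Link every pair of points whose separation is at most `cutoff`."""
    adjacency = {i: set() for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= cutoff:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def clusters(adjacency):
    """Connected components of the contact graph, found by depth-first search."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Toy coordinates in Å (hypothetical), clustered with the 8 Å cutoff.
toy_points = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (12.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
toy_clusters = clusters(contact_graph(toy_points, cutoff=8.0))
```

      Applying the same cutoff to every system, as described above, keeps the resulting graphs comparable even though the absolute edge counts depend on the value chosen.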

      Fourthly, I see no statement on data availability. The emerging standard in the computational field is to make all data publicly available through Github or some similar mechanism.

      We thank the reviewer for pointing this out and we have provided the raw data between 1.5-2.5 μs for each scenario along with other relevant files via a zenodo repository (https://zenodo.org/records/10926368).

      Finally, on page 16, they discuss the interactions of aSyn(95-110), but the sequence that they give is too long (seeming to contain repeated characters, but also not accurate). aSyn(95-110) = VKKDQLGKNEEGAPQE. Presumably this is just a typo, but potentially raises concerns about the simulations (since without available data, one cannot check that the sequence is accurate) and data analysis elsewhere.

      This indeed is a typographical error. We have updated the article with the correct sequence. The validity of the simulations can be verified from the data we have shared via the zenodo repository (https://zenodo.org/records/10926368).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary:

      In this manuscript, Fister et al. investigate how amputation and burn wounds affect sensory axonal damage and regeneration in a zebrafish model system. The authors discovered that burn injury results in increased peripheral axon damage and impaired regeneration. Convincing experiments show altered axonal morphology and increased Ca2+ fluxes as a result of burn damage. Further experimental proof supports that early removal of the burnt tissue by amputation rescues axonal damage. Burn damage was also shown to markedly increase keratinocyte migration and increase localized ROS production as measured by the dye Pfbsf. These responses could be inhibited by Arp 2/3 inhibition and isotonic treatment.

      Strengths: 

      The authors use state-of-the-art methods to study and compare transection and burn-induced tissue damage. Multiple experimental approaches (morphology, Ca2+ fluxing, cell membrane labeling) confirm axonal damage and impaired regeneration time. Furthermore, the results are also accompanied by functional response tests of touch sensitivity. This is the first study to extend the role of tissue-damage-related osmotic exposure beyond wound closure and leukocyte migration to a novel layer of pathology: axonal damage and regeneration. 

      Weaknesses: 

      The conclusions of the paper claiming a link between burn-induced epithelial cell migration, spatial redox signaling, and sensory axon regeneration are mainly based on correlative observations. Arp 2/3 inhibition impairs cell migration but has no significant effect on axon regeneration and restoration of touch sensitivity. 

      We agree with the reviewer. We have tried many experiments to address this question. The data show that Arp 2/3 inhibition with CK666 is an effective way to inhibit initial keratinocyte migration. However, later migration still proceeds. What is interesting is that inhibiting only the early migration is sufficient to restore localized ROS production in the wound area in the first hour post-burn, even if this is not sufficient to prevent ROS accumulation over time. There is also a trend toward improved sensory neuron function late after this early treatment. However, this is not statistically significant. We think it is likely that both migration and tissue-scale ROS influence the regeneration defect of sensory neurons after burn. The data using isotonic solution support this conclusion. We have tried many other ways to limit keratinocyte migration, including depletion of talin and expression of a dominant negative Rac in basal epithelial cells, but these treatments were not compatible with survival of the fish after burn.

      Pharmacological or genetic approaches should be used to prove the role of ROS production by directly targeting the known H2O2 source in the system: DUOX. 

      We agree that pharmacologic or genetic approaches to directly manipulate ROS production would provide substantial support to the hypothesis that ROS, along with keratinocyte migration, is a main factor contributing to poor burn outcomes. To address this, we first tried using a morpholino to deplete DUOX. However, the combination of DUOX morpholino and burn injury was lethal to larvae. We also used pharmacologic inhibition of ROS production using DPI (Diphenyleneiodonium). With this treatment, ROS is inhibited for only the first hour post-burn as treatment is lethal for longer periods of time. Burned larvae have marginally improved axon density and touch sensitivity, suggesting the importance of ROS in burn outcomes, however it was not statistically significant. It is likely that an increased effect would be observed with longer treatment, but treatment for more than 1 hour was toxic. We have added a supplemental figure with this new DPI data.

      While the authors provide clear and compelling proof that osmotic responses lie at the heart of the burn-induced axonal damage responses, they did not consider the option of further exploring any biology related to osmotic cell swelling. Could osmotic ATP release maybe play a role through excitotoxicity? Could cPLA2 activation-dependent eicosanoid production relate to the process? Pharmacological tests using purinergic receptor inhibition or blockage of eicosanoid production could answer these questions. 

      We agree that the role of osmotic cell swelling in the burn response is an interesting avenue for future study. However, we make use of isotonic treatment in this study specifically for its effect on keratinocyte migration and broad-scale wound healing. As a result, we feel that pursuing the biology of this swelling phenomenon is outside the scope of this paper.

      The authors provide elegant experiments showing that early removal of the burnt tissue can rescue damage-induced axonal damage, which could also be interpreted in an osmotic manner: tail fin transections could close faster than burn wounds, allowing for lower hypotonic exposure time. Axonal damage and slow regeneration in tail fin burn wounds could be a direct consequence of extended exposure time to hypotonic water. 

      We have done experiments using FM dye to test how long it takes burn and transection wounds to close (shown below). In these experiments, dye entry into wounded tissue is used as a readout of wound closure. Dye is only able to enter wounded tissue when the epithelial barrier is disrupted. Our data reveal that transections take approximately 10 minutes to fully close, while burns take approximately 20 minutes to close.

      Author response image 1.

      To test if this difference in wound closure time would have an effect on axon outcomes, we repeated, but slightly modified, the dual-wound experiment. We increased the amount of time the burn condition was exposed to hypotonic conditions by 10 additional minutes (by transecting burned tissue at 15 minutes post burn, shortly before closure) and compared axon outcomes to the 5 mpw control transection. These results show there was no difference in axon regeneration or function when secondary transection was performed at 5 or 15 minutes post burn, suggesting that increased exposure to hypotonic solution is not the reason for defects in axon outcomes after burn injury.

      Author response image 2.

      Reviewer #2 (Public Review): 

      This is an interesting study in which the authors show that a thermal injury leads to extensive sensory axon damage and impaired regrowth compared to a mechanical transection injury. This correlates with increased keratinocyte migration. That migration is inhibited by CK666 drug treatment and isotonic medium. Both restrict ROS signalling to the wound edge. In addition, the isotonic medium also rescues the regrowth of sensory axons and recovery of sensory function. The findings may have implications for understanding non-optimal re-innervation of burn wounds in mammals. 

      The interpretation of results is generally cautious and controls are robust. 

      Here are some suggestions for additional discussion: 

      The study compares burn injury which produces a diffuse injury to a mechanical cut injury which produces focal damage. It would help the reader to give a definition of wound edge in the burn situation. Is the thermally injured tissue completely dead and is resorbed or do axons have to grow into damaged tissue? The two-cut model suggests the latter. Also giving timescales would help, e.g. when do axons grow in relation to keratinocyte movement? An introductory cartoon might help. 

      We thank the reviewer for these insightful comments and questions. The burn wound is defined as the area that is directly damaged as a result of increased heat (labeled by FM dye entry), and the burn wound edge as the first line of healthy cells adjacent to the burned cells. These definitions have been added to the text to clarify the areas referenced. Recent experiments lead us to believe the wound area is composed almost completely of dead cells, but we are currently working to discover the fate of these dead cells as well as the wound adjacent cells that migrate to the wound edge after burn. As a result, we do not know whether axons grow into damaged tissue or if the damaged tissue is extruded, but we do see growth cone formation within a few hours after wounding suggesting the axons are actively trying to regenerate after a burn.

      Could treatment with CK666 or isotonic solution influence sensory axons directly, or through other non-keratinocyte cell types, such as immune cells? 

      We have done experiments looking at the density of caudal fin innervation in CK666, isotonic, or DPI treated fins. The axon density is unchanged in all these treatments compared to control treated larvae, so we do not believe these treatments affect axon health homeostatically. These data have been added to supplemental figure 3. Additionally, one of the benefits of the larval zebrafish burn model is the simplicity of the system – the epidermis is primarily composed of sensory axons, mesenchymal cells and keratinocytes. The burn environment is proinflammatory so it does promote immune cell recruitment, but we do not believe the immune cells are interacting directly with sensory axons besides clearing axonal debris. Previous papers by our lab have shown that peak immune cell recruitment occurs at 6 hpw, but they localize to the damaged tissue in the burn area and not the wound edge.

      Reviewer #3 (Public Review): 

      Fister and colleagues use regeneration of the larval zebrafish caudal fin to compare the effects of two modes of tissue damage-transection and burn-on cutaneous sensory axon regeneration. The authors found that restoration of sensory axon density and function is delayed following burn injury compared to transection. 

      The authors hypothesized that thermal injury triggers signals within the wound microenvironment that impair sensory neuron regeneration. The authors identify differences in the responses of epithelial keratinocytes to the two modes of injury: keratinocytes migrate in response to burn but not transection. Inhibiting keratinocyte migration with the small-molecule inhibitor of Arp2/3 (CK666) resulted in decreased production of reactive oxygen species (ROS) at early, but not late, time points. Preventing keratinocyte migration by wounding in isotonic media resulted in increased sensory function 24 hours after burn. 

      Strengths of the study include the beautiful imaging and rigorous statistical approaches used by the authors. The ability to assess both axon density and axon function during regeneration is quite powerful. The touch assay adds a unique component to the paper and strengthens the argument that burns are more damaging to sensory structures and that different treatments help to ameliorate this. 

      A weakness of the study is the lack of genetic and cell-autonomous manipulations. Additional comparisons between transection and burns, in particular with manipulations that specifically modulate ROS generation or cell migration without potentially confounding effects on other cell types or processes would help to strengthen the manuscript.

      We agree that genetic and cell-autonomous approaches would strengthen our study; however, we were unable to use them because of their lethality. Basal epithelial migration is necessary for embryonic development. We attempted to circumvent this by generating larvae transiently expressing a dominant-negative form of Rac, a protein crucial to the migratory process. The chimeric expression of the dominant-negative Rac was either damaging to the larvae or the mosaicism was too low to observe any effects on the migration phenotype.

      We also attempted a genetic approach to manipulate ROS production, as discussed above. We found that the DUOX morpholino was lethal to burned larvae. Finally, we attempted pharmacological inhibition of ROS production using the inhibitor DPI (Diphenyleneiodonium). With this treatment, burned larvae have marginally improved axon density and touch sensitivity, suggesting that dampening ROS may improve outcome. The DPI data have been added to the manuscript.

      In terms of framing their results, the authors refer to "sensory neurons" and "sensory axons" throughout the text - it should be made clear what type of neuron(s)/axon(s) are being visualized/assayed. Along these lines, a broader discussion of how burn injuries affect sensory function in other systems - and how the authors' results might inform our understanding of these injury responses - would be beneficial to the reader. 

      In summary, the authors have established a tractable vertebrate system to investigate different sensory axon wound healing outcomes in vivo that may ultimately allow for the identification of improved treatment strategies for human burn patients. Although the study implicates differences in keratinocyte migration and associated ROS production in sensory axon wound healing outcomes, the links between these processes could be more rigorously established. 

      The inconsistency between “neuron” and “axon” has been noted and the text has been corrected accordingly. “Neuron” is used when referring to the cell as a whole, while “axon” is used when referring to the sensory processes in the caudal fin. We added information about burn in the introduction as suggested: “While epithelial tissue is well adapted to repair from mechanical damage, burn wounds heal poorly. Thermal injury results in chronic pain and lack of sensation in the affected tissue, suggesting that an abnormal sensory neuron response contributes to burn wound pathophysiology.”

      We thank the reviewers for their comments.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors): 

      Suggested experiments: 

      (1) ROS measurements with the dye Pfbsf should be validated with more established ROS probes such as HyPer. 

      Pfbsf has been used previously as a readout of ROS production, and its use is documented in zebrafish (Maeda et al., Angew Chem Int Ed Engl, 2004, and Niethammer et al, Nature, 2009). These sources have been added as references when introducing Pfbsf to provide context for its use. The probe was validated and compared to HyPer in Niethammer’s 2009 paper. In our hands, we have used both probes and have similar results with tail transection.

      (2) To better support claims on ROS and H2O2 playing a central role in mediating axonal damage, the authors should consider pharmacological approaches such as rescue experiments with H2O2 and experiments using inhibitors such as DPI or apocynin. 

      While the above reagents and drugs have limitations and non-specific side effects, more convincing proof could result from genetic approaches including experiments on DUOX knockdown or knockout lines. 

      To further dissect the role of ROS in the burn response, we conducted experiments using DPI, a potent ROS inhibitor that is well-documented in the literature. We found that treatment with 20 µM DPI (1 hour pretreatment, 1 hour post-burn) marginally improved axon density when quantified at 24 hpw. Any higher dose, when in combination with a burn, proved to be lethal. Longer treatment with DPI was also not tolerated.

      In addition to experiments with DPI, we attempted to burn larvae that were injected with DUOX morpholino. The combined use of burn and DUOX MO was lethal. We have tempered our conclusions and included the new DPI data in the revised manuscript.

      Minor corrections: 

      (1) A phrase/expression in the abstract is confusing: isotonic treatment does not "induce osmotic regulation". Cells exposed to hypo- or hypertonicity will respond by regulatory volume decrease or increase, respectively. Isotonic treatment maintains homeostasis. 

      We appreciate this point and agree with the distinction. Revisions have been made in the text accordingly.

      (2) Figures 4E and 5E would be better to show as an average of multiple experiments with statistical significance. 

      The purpose of figures 4E and 5E is to demonstrate changes in fluorescence intensity and localization of ROS using the representative time series shown in 4D and 5D. The figure legend has been updated accordingly.

      Reviewer #2 (Recommendations For The Authors): 

      Figure 3D How can one distinguish between the two cellular elements that randomly meet or that there is actual coordination? Can the interactions be quantified? It is also unclear what the authors mean by "sensory neuron movement". The authors show that the neuronal cell bodies stay in their position, so only the axons change position. Do they do this by growth, i.e. the neuronal growth cones follow the keratinocytes or do keratinocytes displace the axon shafts? 

      We have included supplemental movies that address this question in the newly uploaded document. Figure 3D is composed of still images taken from supplemental movie 2, which is a timelapse of keratinocytes and axons moving together after a burn injury. This movie clearly shows keratinocytes and their ensheathed axons moving simultaneously, so keratinocytes are mechanically pulling sensory axon shafts with them. We have revised the text to say axon movement, not sensory neuron movement.

      Over the time course of axonal movement (1 hour post-burn), it is not possible that neuronal growth cones contribute to movement, as this is too slow – previous work by other labs has shown that it takes several hours for axons to fully regenerate into amputated tissue, with movement not even noticeable until about 3 hours post-wound (Rieger and Sagasti, PLOS Biology, 2011).

      Regarding the second point, “neuron” vs. “axon” is an inconsistency in the text that has been corrected. “Neuron” is used when referring to the cell as a whole, “axon” is used when referring to the processes that innervate the caudal fin. The axons are physically pulled along with keratinocytes as they migrate after burn application. From our observations, growth cones appear closer to the wound site after the movement has stopped.

      Figure 4G It is surprising that the visual differences in the distribution of values are not statistically significant. 

      The distribution of values in 4G was wide, which is why there is no statistically significant difference; we were also surprised by this result. We performed all statistical analyses with a statistician, applying rigorous criteria for significance.

      Figure 4H The images seem to show a difference, whereas the quantification does not. I suggest choosing more representative images. 

      Figure 4H has been updated to include a more representative image of axon patterning with CK666 treatment.

      Figure 6A The text states that axon damage in the control and isotonic condition is comparable, yet in the image, it appears that the damage in the isotonic treatment at 0 hpw is more distal. 

      This is a good observation that we consistently see in isotonic-treated fish after burn. Axon damage localizes more proximally in isotonic-treated samples because the keratinocytes distal to the notochord are likely dead, and the axons innervating those cells are likely immediately destroyed upon burn application. As a result, the distal axons are not present to express GCaMP. We believe isotonic treatment allows keratinocytes to live slightly longer, so axon damage is therefore prevented for longer. This is also the focus of continuing work to further understand the burn microenvironment.

      Finally, the materials section could mention bias mitigation measures, e.g. withholding the treatment condition from the experimenter in the touch test. 

      We minimized bias in experiments whenever possible, and the conservative statistical measures that were applied to our data further reduce the likelihood of false significance.

      Reviewer #3 (Recommendations For The Authors): 

      - Line numbers would have facilitated reviewer feedback. 

      - Supplementary movies were missing in the submission. 

      The lack of supplementary movies upon submission was a mistake and the movies have been uploaded along with the revised manuscript.

      Introduction: 

      - Pg. 3: "In response to tissue damage, sensory neurons undergo rapid and localized axonal degeneration 4,5." Not sure reference 4 (Reyes et al) is appropriate here as this study was not in the context of tissue damage. 

      We have revised this section as suggested by the reviewer.

      Results: 

      - The expected expression pattern/localization of several transgenes was unclear. Please clearly state what cell type(s) each should label. For example, pg. 5 - "We next sought to further investigate sensory neuron function in burned tissue. For this, we assessed wound-induced axonal damage using zebrafish larvae that express the calcium probe GCaMP." Where is GCaMP expressed? 

      The manuscript has been updated to include expression patterns for the included transgenes – in this mentioned case, GCaMP is expressed in neurons under the pan-neuronal Elavl3 promoter.

      - Introducing the GCaMP labeling could use some clarification. Pg. 5 - "As shown previously by other groups, GCaMP labels degenerating neurons in real time35." This is confusing. Do the authors mean that GCaMP increases immediately prior to Wallerian degeneration as shown by Vargas et al. (PMID: 26558774)? 

      Sustained elevated calcium levels are associated with axon damage. Previous work from other labs has shown that calcium influx follows axon injury (Ziv and Spira, EJN 1993, Adalbert et al., Neuroscience 2012). In these experiments, GCaMP-positive puncta indicate axon damage. We have revised the manuscript to address this critique.

      The Elavl3-GCaMP5 transgenic line will label when calcium levels increase in neurons. However, given the parameters used for imaging in our study (20x magnification, 100 ms exposure, and collection speed every 30 seconds for timelapses), we believe that only sufficiently large increases in calcium that are indicative of cell damage, and not physiological function, are being visualized.

      - Figure 1E - Are these panels images of the same fish? Please specify in the legend. 

      Figure 1E shows one transected larva and one burned larva, each live-imaged over the course of six hours. The legend has been updated to include this information.

      - Figure 1F - How was the damage area measured? Consider doing this measurement over time to match Figure 1E. 

      Axon damage area measurements were performed similarly to axon density measurements – maximum intensity z-projected confocal images of the caudal fin were generated using FIJI. For all experiments, the caudal fin area posterior to the notochord was outlined using the Polygon tool and measured to obtain a total surface area ROI. Axon fragments inside the outlined area were manually thresholded so that all fragments posterior to the notochord were labeled and no saturated pixels were present, and an area measurement of these thresholded pixels was taken. We have added a section describing these measurements in the Methods section under “Axon damage quantification.”
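
      The measurement described above can be illustrated with a minimal NumPy sketch. This is not the authors' FIJI workflow; it only mirrors the steps named in the response (maximum intensity z-projection, an ROI posterior to the notochord, a manually chosen threshold, and an area measurement of the thresholded pixels). The function name `damage_area`, the boolean `roi_mask` standing in for the Polygon-tool outline, and the `pixel_area` parameter are all assumptions introduced for illustration.

```python
import numpy as np

def damage_area(stack, roi_mask, threshold, pixel_area=1.0):
    """Hypothetical helper: area of above-threshold pixels inside an ROI.

    stack      -- confocal z-stack, shape (Z, Y, X)
    roi_mask   -- boolean array, shape (Y, X); stands in for the hand-drawn outline
    threshold  -- intensity cutoff chosen by the user (manual in the source)
    pixel_area -- area of one pixel in the desired units (assumed calibration)
    """
    # Maximum intensity projection along the z axis.
    projection = stack.max(axis=0)
    # Keep only above-threshold pixels that fall inside the ROI.
    damaged = (projection > threshold) & roi_mask
    damaged_area = float(damaged.sum()) * pixel_area
    roi_area = float(roi_mask.sum()) * pixel_area
    return damaged_area, roi_area
```

      Returning the ROI area alongside the damaged area lets the caller report either an absolute area or a fraction of the outlined fin region; which of the two the authors report is not stated in the response.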

      - Pg. 5 - When introducing the ngn1 MO - please state the expected phenotype and cite the appropriate background literature. 

      The ngn1 morpholino was cited in the Methods section with the appropriate literature (Cornell and Eisen, Development, 2002), from which we got the morpholino sequence. We thank the reviewer for pointing out the need for more introduction and clarification in the main text, so the ngn1 morpholino has been discussed in greater depth and cited in the main text as well using the same citation.

      - The two-wound model is an elegant approach but could be more clearly described in the main text. 

      An improved explanation of the two-wound experiment has been added to the text.

      - For Figure 3, it would be helpful to have a schematic of the anatomy illustrating the relative positions of axons and epidermal cell types. 

      - Figure 3C - should an additional control here be transected? Given that the krt4:lifeact transgene labels both layers of the epidermis, how were the superficial and basal keratinocytes separated? Interpretation of this section should be carefully worded. The authors state that "...suggesting that the superficial keratinocytes are being pulled by the motile basal keratinocytes" (pg.7 ) but isn't another possibility that the superficial cells are stationary? 

      It is correct that the krt4:lifeact transgene labels both layers of keratinocytes, which together span 20-30 microns. These layers were separated from the same z-stack collected by confocal imaging. The first z-slice and last z-slice of the same stack were separated using FIJI and pseudocolored to appear as different colors. This clarification has been added to the Methods.
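
      The layer separation described above can be sketched in NumPy as follows. This is an illustrative approximation, not the authors' FIJI procedure: the function name `overlay_first_last` and the green/magenta color assignment are assumptions, and the response does not say which end of the stack corresponds to the superficial layer.

```python
import numpy as np

def overlay_first_last(stack):
    """Hypothetical helper: pseudocolor the first and last z-slices of one stack.

    stack -- confocal z-stack, shape (Z, Y, X)
    Returns an RGB image of shape (Y, X, 3) with values in [0, 1].
    """
    def normalize(img):
        # Scale to [0, 1]; leave an all-zero slice unchanged.
        img = img.astype(float)
        peak = img.max()
        return img / peak if peak > 0 else img

    first = normalize(stack[0])
    last = normalize(stack[-1])

    rgb = np.zeros(stack.shape[1:] + (3,))
    rgb[..., 1] = first   # first slice shown in green (assumed color choice)
    rgb[..., 0] = last    # last slice shown in magenta (red + blue)
    rgb[..., 2] = last
    return rgb
```

      Because both slices come from the same stack, they share one coordinate frame, so displacement of one layer relative to the other can be read directly from the overlay.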

      Prior observations with the krt4:lifeact and krt4:utrch (figure 3A) transgenic lines reveal that both keratinocyte layers will move distally after burn application.

      - Pg. 7 - "The axons of sensory neurons are ensheathed within actin-rich channels running through basal keratinocytes 50,51." ref 51 is a C. elegans paper which does not have basal keratinocytes.

      This was in error. The correct reference has replaced reference 51 (O’Brien, J Comp. Neurol., 2012), in which electron microscopy is used to document the development of two layers of epithelial cells that also ensheath sensory neurons in a protective manner similar to glial cells in the central nervous system.

      - Figures S1E and F - the authors state that RB and DRG soma don't move. However, it was unclear from the figure panels and legend whether the authors imaged neurons that actually innervate the caudal fin (rather than some other region of the animal). Please clarify. For comparison, Fig S1F needs a pre-injury image to be meaningful. 

      The imaged cell bodies were those in the posterior trunk region, which are responsible for innervating the posterior sections of the fish including the caudal fin. From our observations, there was no movement of neuronal cell bodies after the burn.

      - Figure 5 title - can the authors clarify what aspect of this figure relates to "sustained epidermal damage" 

      The figure 5 title has been updated in response to the reviewer comments.

      - Figure 6 - is touch sensitivity really "restored" as the authors suggest? Alternatively, sensitivity may never be lost in isotonic treatment. Or the loss may be delayed? 

      We have modified the text accordingly by updating our phrasing – “restored” has been replaced with “improved” to indicate benefit over time.

      - Can the authors further disentangle the effects of keratinocyte migration, ROS, and isotonic treatment on axon regeneration? For example, would the addition of CK666 to the Isotonic +1 hpw treatment improve axon regeneration? Can the authors directly manipulate ROS signaling (e.g., through exogenous addition of H2O2 or duox1 MO) to alter regeneration outcomes in their wounding assays? 

      See the comments above.

      - Figure 6 title - consider removing or clarifying the word "excessive" here 

      The title has been revised according to the reviewer suggestion.

      - hpw vs hpb were used inconsistently throughout the text 

      The manuscript has been revised to use “hpw” when referring to the timeframe after injury application.

      Methods: 

      - Zebrafish transgenics are missing allele names 

      References: 

      - Many mistakes were noted in this section e.g., journal names missing, wrong authors, typos, DOIs misformatted 

      The references section has been corrected to use formatting consistent with APA citation and eLife preferred guidelines.

    1. eLife assessment

      This important study reports the discovery of a novel nucleotide ubiquitylation activity by the DTX3L E3 ligase. Solid evidence is presented for ubiquitin attachment to single-stranded oligonucleotides. This very interesting biochemical finding can be used as a starting point for studies to establish relevance in a physiological setting.