  1. Sep 2024
    1. Reviewer #1 (Public Review):

      Greter et al. provide an interesting and creative use of lactulose as a "microbial metabolism" inducer, combined with tracking of H2 and other fermentation end products. The topic is timely and will likely be of broad interest to researchers studying nutrition, circadian rhythm, and gut microbiota. However, a couple of moderate to major concerns were noted that may impact the interpretation of the current data:

      (1) Much of the data relies on housing gnotobiotic mice in metabolic cages, but I couldn't find any details of methods to assess contamination during multiple days of housing outside of gnotobiotic isolators/cages. Given the complexity of the metabolic cage system used, sterility would likely be incredibly challenging to achieve. More details need to be included about how potential contamination of the mice was assessed, ideally with 16S rRNA gene sequencing data of the endpoint samples and/or qPCR for total colonization levels relative to the more targeted data shown.

      (2) The language could be softened to provide a more nuanced discussion of the results. While lactulose does seem to induce microbial metabolism it also could have direct effects on the host due to its osmotic activity or other off-target effects. Thus, it seems more precise to just refer to lactulose specifically in the figure titles and relevant text. Additionally, the degree to which lactulose "disrupts the diurnal rhythm" isn't clear from the data shown, especially given that the markers of circadian rhythm rapidly recover from the perturbation. It is probably more precise to instead state that lactulose transiently induces fermentation during the light phase or something to that effect. The discussion could also be expanded to address what methods are available or could be developed to build upon the concepts here; for example, the use of genetic inducers of metabolism which may avoid the more complex responses to lactulose.

      Despite these concerns, this was still an intriguing and valuable addition to the growing literature on the interface of the microbiome and circadian fields.

    2. Reviewer #2 (Public Review):

      Summary:

      The authors aimed to investigate how microbial metabolites, such as hydrogen and short-chain fatty acids (SCFAs), influence feeding behavior and circadian gene expression in mice. Specifically, they sought to understand these effects in different microbial environments, including a reduced community model (EAM), germ-free mice, and SPF mice. The study was designed to explore the broader relationship between the gut microbiome and host circadian rhythms, an area that is not well understood. Through their experiments, the authors hoped to elucidate how microbial metabolism could impact circadian clock genes and feeding patterns, potentially revealing new mechanisms of gut microbiome-host interactions.

      Strengths:

      The manuscript presents a well-executed investigation into the complex relationship between microbial metabolites and circadian rhythms, with a particular focus on feeding behavior and gene expression in different mouse models. One of the major strengths of the work lies in its innovative use of a reduced community model (EAM) to isolate and examine the effects of specific microbial metabolites, which provides valuable insights into how these metabolites might influence host behavior and circadian regulation. The study also contributes to the broader understanding of the gut microbiome's role in circadian biology, an area that remains poorly understood. The experiments are thoughtfully designed, with a clear rationale that ties together the gut microbiome, metabolic products, and host physiological responses. The authors successfully highlight an intriguing paradox: the significant influence of microbial metabolites in the EAM model versus the lack of effect in germ-free and SPF mice, which adds depth to the ongoing exploration of microbial-host interactions. Despite some methodological concerns, the manuscript offers compelling data and opens up new avenues for research in the field of microbiome and circadian biology.

      Weaknesses:

      The manuscript, while providing valuable insights, has several methodological weaknesses that impact the overall strength of the findings. First, the process for stool collection lacks clarity, raising concerns about potential biases, such as the risk of coprophagia, which could affect the dry-to-wet weight ratio analysis and compromise the validity of these measurements. Additionally, the use of the term "circadian" in some contexts appears inaccurate, as "diurnal" might be more appropriate, especially given the uncertainty regarding whether the observed microbiome fluctuations are truly circadian. Another significant issue is the unexpected absence of an osmotic effect of lactulose in EAM mice, which contradicts the known properties of lactulose as an osmotic laxative. This finding requires further verification, including the use of a positive control, to ensure it is not artifactual. The presentation of qRT-PCR data as log2-fold changes, with a mean denominator, could introduce bias by artificially reducing variability, potentially leading to spurious findings or increased risk of Type I error. This approach may explain the unexpected activation of both the positive and negative limbs of the circadian clock. Moreover, the lack of detailed information on the primers and housekeeping genes used in the experiments is concerning, particularly given the importance of using non-circadian housekeeping genes for accurate normalization. The methods for measuring metabolic hormones, such as GLP-1 and GIP, are also not adequately described. If DPP-IV/protease inhibitor tubes were not used, the data could be unreliable due to the rapid degradation of these hormones by circulating proteases. Finally, the manuscript does not address the collection of hormone levels during both fasting and fed phases, a critical aspect for interpreting the metabolic impact of microbial metabolites. 
      These methodological concerns collectively weaken the robustness of the study's results and warrant careful reconsideration and clarification by the authors.

      Because of these weaknesses, the authors have partially achieved their aims by providing novel insights into the relationship between microbial metabolites and host circadian rhythms. The data do suggest that microbial metabolites can significantly influence feeding behavior and circadian gene expression in specific contexts. However, the unexpected absence of an osmotic effect of lactulose, the potential biases introduced by the log2-fold change normalization in qRT-PCR data, and the lack of clarity in critical methodological details weaken the overall conclusions. While the study provides valuable contributions to understanding the gut microbiome's role in circadian biology, the methodological weaknesses prevent a full endorsement of the authors' conclusions. Addressing these issues would be necessary to strengthen the support for their findings and fully achieve the study's aims.

      Despite the methodological concerns raised, this work has the potential to make a significant impact on the field of circadian biology and microbiome research. The study's exploration of the interaction between microbial metabolites and host circadian rhythms in different microbial environments opens new avenues for understanding the complex interplay between the gut microbiome and host physiology. This research contributes to the growing body of evidence that microbial metabolites play a crucial role in regulating host behaviors and physiological processes, including feeding and circadian gene expression.

    3. Reviewer #3 (Public Review):

      Summary:

      In the manuscript by Greter, et al., entitled "Acute targeted induction of gut-microbial metabolism affects host clock genes and nocturnal feeding" the authors are attempting to demonstrate that an acute exposure to a non-nutritive disaccharide (lactulose) promotes microbial metabolism that feeds back onto the host to impact circadian networks. The premise of the study is interesting and the authors have performed several thoughtful experiments to dissect these relationships, providing valuable insights for the field. However, the work presented does not necessarily support some of the conclusions that are drawn. For instance, lactulose is administered during the fasting period to mimic the impact of a feeding bout on the gut microbiota, but it would be important to perform this treatment during the fed state as well to show that the effects on food intake, etc. do not occur. To truly draw the conclusion that the current outcomes are directly connected to and mediated via an impact on the host circadian clock, it would be ideal to perform these studies in a circadian gene knock-out animal (i.e., Cry1 or Cry2 KO mice, or perhaps Bmal-VilCre tissue-specific KO mice). If the effects are lost in these animals, this would more concretely connect the current findings to the circadian clock gene network. Despite these reservations, the work is promising.

      Strengths:

      Attempting to disentangle nutrient acquisition from microbial fermentation and its impact on diurnal dynamics of gut microbes on host circadian rhythms is an important step for providing insights into these host-microbe interactions.

      The authors utilize a novel approach in leveraging lactulose coupled with germ-free animals and metabolic cages fitted with detectors that can measure microbial byproducts of fermentation, particularly hydrogen, in real-time.

      The authors consider several interesting aspects of lactulose delivery, including how it shifts osmotic balance as well as provides calculations that attempt to explain the caloric contribution of fermentation to the animal in the context of reduced food intake. This provides interesting fundamental insights into the role of microbial outputs on host metabolism.

      Weaknesses:

      While the authors have done a large amount of work to examine the osmotic vs. metabolic influence of lactulose delivery, the authors have not accounted for the enlarged cecum and increased cecal surface area in germ-free mice. The authors could consider an additional control of cecectomy in germ-free mice.

      The authors have examined GI hormones as one possible mechanism for how food intake is altered by microbial fermentation of lactulose. However, the authors measure PYY and GLP-1 only at a single time point, stating that there are no differences between groups. Given the goal of the studies is to tie these findings back into circadian rhythms, it would be important to show if the diurnal patterns of these GI hormones are altered.

      Considerations of other factors, such as conjugated vs. deconjugated bile acids, microbial bile salt hydrolase activity, and bile acid resorption, might be an important consideration for how lactulose elicits more influence on ileal circadian clock genes relative to cecum and colon.

      Measurements of GI transit time (both whole gut and regional) would be an important consideration for how lactulose might be impacting the ileum vs. cecum vs. colon.

    4. Author response:

      Reviewer #1 (Public Review):

      Greter et al. provide an interesting and creative use of lactulose as a "microbial metabolism" inducer, combined with tracking of H2 and other fermentation end products. The topic is timely and will likely be of broad interest to researchers studying nutrition, circadian rhythm, and gut microbiota. However, a couple of moderate to major concerns were noted that may impact the interpretation of the current data:

      (1) Much of the data relies on housing gnotobiotic mice in metabolic cages, but I couldn't find any details of methods to assess contamination during multiple days of housing outside of gnotobiotic isolators/cages. Given the complexity of the metabolic cage system used, sterility would likely be incredibly challenging to achieve. More details need to be included about how potential contamination of the mice was assessed, ideally with 16S rRNA gene sequencing data of the endpoint samples and/or qPCR for total colonization levels relative to the more targeted data shown.

      We thank the reviewer for pointing out that we have not made the experimental setup clear in the text. One of the unique features of our metabolic cage setup is that the mice do not need to be housed outside gnotobiotic isolators, but that the whole system is placed inside an isolator. We have developed and published this system recently (Hoces et al, PLOS Biol 2022), including extensive testing for sterility/gnotobiosis. We will improve clarity in a revised version.

      Given that 16S sequencing of germ-free mice will typically produce false positive reads, we used Blautia pseudococcoides as an indicator strain for contamination. This strain is present in our SPF mouse colony, forms spores that are highly resilient to decontamination measures, and has been the most likely contaminant in our gnotobiotic system. We have checked for the presence of this strain in the cecum content of all our animals at the end of each experiment, and only included experiments which had a B. pseudococcoides signal below threshold level.

      (2)  The language could be softened to provide a more nuanced discussion of the results. While lactulose does seem to induce microbial metabolism it also could have direct effects on the host due to its osmotic activity or other off-target effects. Thus, it seems more precise to just refer to lactulose specifically in the figure titles and relevant text. Additionally, the degree to which lactulose "disrupts the diurnal rhythm" isn't clear from the data shown, especially given that the markers of circadian rhythm rapidly recover from the perturbation. It is probably more precise to instead state that lactulose transiently induces fermentation during the light phase or something to that effect. The discussion could also be expanded to address what methods are available or could be developed to build upon the concepts here; for example, the use of genetic inducers of metabolism which may avoid the more complex responses to lactulose.

      The point about language is well taken. We tried to make the argument that what we call disruption of the diurnal rhythm is acute, meaning that it is not disrupting the rhythm "chronically" (i.e., for longer), but that it recovers rapidly from this transient disruption. Given the confusion this wording is causing, we are rephrasing this in a new version of the manuscript.

      We also appreciate the mention of concepts from our study that can be built on in future studies, and we will add a paragraph on potential further research.

      Despite these concerns, this was still an intriguing and valuable addition to the growing literature on the interface of the microbiome and circadian fields.

      We thank the reviewer for all their encouraging and constructive remarks!

      Reviewer #2 (Public Review):

      Summary:

      The authors aimed to investigate how microbial metabolites, such as hydrogen and short-chain fatty acids (SCFAs), influence feeding behavior and circadian gene expression in mice.

      Specifically, they sought to understand these effects in different microbial environments, including a reduced community model (EAM), germ-free mice, and SPF mice. The study was designed to explore the broader relationship between the gut microbiome and host circadian rhythms, an area that is not well understood. Through their experiments, the authors hoped to elucidate how microbial metabolism could impact circadian clock genes and feeding patterns, potentially revealing new mechanisms of gut microbiome-host interactions.

      Strengths:

      The manuscript presents a well-executed investigation into the complex relationship between microbial metabolites and circadian rhythms, with a particular focus on feeding behavior and gene expression in different mouse models. One of the major strengths of the work lies in its innovative use of a reduced community model (EAM) to isolate and examine the effects of specific microbial metabolites, which provides valuable insights into how these metabolites might influence host behavior and circadian regulation. The study also contributes to the broader understanding of the gut microbiome's role in circadian biology, an area that remains poorly understood. The experiments are thoughtfully designed, with a clear rationale that ties together the gut microbiome, metabolic products, and host physiological responses. The authors successfully highlight an intriguing paradox: the significant influence of microbial metabolites in the EAM model versus the lack of effect in germ-free and SPF mice, which adds depth to the ongoing exploration of microbial-host interactions. Despite some methodological concerns, the manuscript offers compelling data and opens up new avenues for research in the field of microbiome and circadian biology.

      We thank the reviewer for their encouraging remarks, specifically on the surprising findings that microbial metabolism seems to affect circadian clock gene expression and behavior differently in EAM and SPF mice.

      Weaknesses:

      The manuscript, while providing valuable insights, has several methodological weaknesses that impact the overall strength of the findings. First, the process for stool collection lacks clarity, raising concerns about potential biases, such as the risk of coprophagia, which could affect the dry-to-wet weight ratio analysis and compromise the validity of these measurements.

      We thank the reviewer for pointing out that our description of the specific methods used for collecting feces was presented in a somewhat confusing manner. In short, dry and wet fecal weights were determined based on fecal pellets that were freshly produced and directly collected from restrained mice. To determine total fecal output over time, we collected all fecal pellets produced in a 5-hour window in a cage, determined their dry weight, and then used the water content determined for fresh feces to calculate wet weight. Using this method, we cannot account for potential differences in coprophagia between the groups. However, this is not likely to affect the dry-to-wet ratio of fecal output in our results.
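
      The back-calculation described above can be illustrated with a minimal sketch. It is not the authors' code: the function names and the example weights are hypothetical, and it assumes that the water fraction measured on fresh pellets applies unchanged to all pellets collected over the window.

```python
def fresh_water_fraction(fresh_wet_mg: float, fresh_dry_mg: float) -> float:
    """Water fraction of freshly collected pellets (hypothetical helper)."""
    if fresh_wet_mg <= 0 or fresh_dry_mg < 0 or fresh_dry_mg > fresh_wet_mg:
        raise ValueError("expected 0 <= dry weight <= wet weight, wet weight > 0")
    return (fresh_wet_mg - fresh_dry_mg) / fresh_wet_mg


def estimated_wet_weight(total_dry_mg: float, water_fraction: float) -> float:
    """Estimate wet fecal output from the dry weight of all pellets in the window.

    Assumption: pellets collected from the cage had the same water fraction
    as the fresh pellets when they were produced.
    """
    if not 0.0 <= water_fraction < 1.0:
        raise ValueError("water_fraction must be in [0, 1)")
    # dry = wet * (1 - water_fraction)  =>  wet = dry / (1 - water_fraction)
    return total_dry_mg / (1.0 - water_fraction)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    fraction = fresh_water_fraction(fresh_wet_mg=50.0, fresh_dry_mg=20.0)
    print(fraction, estimated_wet_weight(total_dry_mg=200.0, water_fraction=fraction))
```

      Because the same fresh-pellet water fraction is applied to all collected pellets, the dry-to-wet ratio in this sketch is fixed by the fresh samples; coprophagia would change the estimated total output rather than that ratio.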

      Additionally, the use of the term "circadian" in some contexts appears inaccurate, as "diurnal" might be more appropriate, especially given the uncertainty regarding whether the observed microbiome fluctuations are truly circadian.

      Similarly to our answer to reviewer 1 above, we appreciate this remark about imprecise language and have addressed this issue in the text. Indeed, we do not think the microbiota fluctuations are truly circadian, but likely a result of the entrainment through the host's food intake.

      Another significant issue is the unexpected absence of an osmotic effect of lactulose in EAM mice, which contradicts the known properties of lactulose as an osmotic laxative. This finding requires further verification, including the use of a positive control, to ensure it is not artifactual.

      This is a good point. We have used this lactulose dosage specifically to induce microbial metabolism without causing osmotic diarrhea, and went to some lengths to demonstrate this. In response to this comment (and one by reviewer 3 below about transit time), we are planning an experiment that will use a higher lactulose dose as a positive control.

      The presentation of qRT-PCR data as log2-fold changes, with a mean denominator, could introduce bias by artificially reducing variability, potentially leading to spurious findings or increased risk of Type I error. This approach may explain the unexpected activation of both the positive and negative limbs of the circadian clock.

      While we agree that our description of the qPCR method used for measuring circadian clock gene expression was lacking detail, we do not see how log2-fold changes (as opposed to, e.g., fold change) would lead to an increased risk of Type I error. We did not use a mean denominator for analyzing the data but used the housekeeping gene data for the same sample as the denominator for the respective circadian clock genes. This will be described more clearly in a revised methods section.
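
      For readers unfamiliar with per-sample normalization, the following minimal sketch shows one common way to express a target gene relative to a housekeeping gene measured in the same sample. It is an illustration only, not the authors' pipeline: the function names and Ct values are hypothetical, and it assumes that product doubles with every PCR cycle (100% amplification efficiency).

```python
def log2_relative_expression(ct_target: float, ct_housekeeping: float) -> float:
    """log2 of target expression relative to the housekeeping gene.

    Both Ct values come from the same sample, so each sample has its own
    denominator. Assumes perfect doubling per cycle (an idealization).
    """
    # A lower Ct means more template, so the log2 ratio is the negative delta-Ct.
    return -(ct_target - ct_housekeeping)


def relative_expression(ct_target: float, ct_housekeeping: float) -> float:
    """Same quantity on the linear scale: 2 ** (-delta Ct)."""
    return 2.0 ** log2_relative_expression(ct_target, ct_housekeeping)


if __name__ == "__main__":
    # Hypothetical (target Ct, housekeeping Ct) pairs, one pair per sample.
    samples = [(24.0, 20.0), (23.0, 20.5), (25.5, 21.0)]
    for ct_t, ct_h in samples:
        print(log2_relative_expression(ct_t, ct_h), relative_expression(ct_t, ct_h))
```

      The log2 and linear values are monotone transforms of one another, so the choice between them changes the scale on which variability is summarized, not the ordering of samples.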

      Moreover, the lack of detailed information on the primers and housekeeping genes used in the experiments is concerning, particularly given the importance of using non-circadian housekeeping genes for accurate normalization.

      We apologize for this omission; it seems the resource table was lost during submission, leading to missing information. It will be included in the revised manuscript.

      The methods for measuring metabolic hormones, such as GLP-1 and GIP, are also not adequately described. If DPP-IV/protease inhibitor tubes were not used, the data could be unreliable due to the rapid degradation of these hormones by circulating proteases.

      We thank the reviewer for spotting this mistake. We will add details of how GLP-1 and GIP were measured to the methods section. While we did not use DPP-IV/protease inhibitor tubes, we added the inhibitors to the syringes when sampling blood, leading to the same effect.

      Finally, the manuscript does not address the collection of hormone levels during both fasting and fed phases, a critical aspect for interpreting the metabolic impact of microbial metabolites.

      We agree that it will be interesting to measure hormone levels also in the fed phase, and we will include this data in a revised version of the manuscript. Even with that data, a more thorough examination of hormone levels over the diurnal cycle, as suggested by reviewer 3, might be relevant for a full-scale follow-up. Given our data, we of course cannot exclude that there may be time-point-specific differences and therefore have softened the language around this conclusion to state that hormone levels are not acutely changed after a lactulose intervention “at the time-points examined”.

      These methodological concerns collectively weaken the robustness of the study's results and warrant careful reconsideration and clarification by the authors.

      Because of these weaknesses, the authors have partially achieved their aims by providing novel insights into the relationship between microbial metabolites and host circadian rhythms. The data do suggest that microbial metabolites can significantly influence feeding behavior and circadian gene expression in specific contexts. However, the unexpected absence of an osmotic effect of lactulose, the potential biases introduced by the log2-fold change normalization in qRT-PCR data, and the lack of clarity in critical methodological details weaken the overall conclusions. While the study provides valuable contributions to understanding the gut microbiome's role in circadian biology, the methodological weaknesses prevent a full endorsement of the authors' conclusions. Addressing these issues would be necessary to strengthen the support for their findings and fully achieve the study's aims.

      We thank the reviewer again for their careful and critical reading of our work, and for their constructive input. We hope that many of the concerns will be addressed by providing more methodological detail and additional experimental data in the revised version of our manuscript.

      Despite the methodological concerns raised, this work has the potential to make a significant impact on the field of circadian biology and microbiome research. The study's exploration of the interaction between microbial metabolites and host circadian rhythms in different microbial environments opens new avenues for understanding the complex interplay between the gut microbiome and host physiology. This research contributes to the growing body of evidence that microbial metabolites play a crucial role in regulating host behaviors and physiological processes, including feeding and circadian gene expression.

      We thank the reviewer for their encouraging remarks!

      Reviewer #3 (Public Review):

      Summary:

      In the manuscript by Greter, et al., entitled "Acute targeted induction of gut-microbial metabolism affects host clock genes and nocturnal feeding" the authors are attempting to demonstrate that an acute exposure to a non-nutritive disaccharide (lactulose) promotes microbial metabolism that feeds back onto the host to impact circadian networks. The premise of the study is interesting and the authors have performed several thoughtful experiments to dissect these relationships, providing valuable insights for the field. However, the work presented does not necessarily support some of the conclusions that are drawn. For instance, lactulose is administered during the fasting period to mimic the impact of a feeding bout on the gut microbiota, but it would be important to perform this treatment during the fed state as well to show that the effects on food intake, etc. do not occur.

      This is a good point, and we will include an experiment addressing this in a revised version of the manuscript.

      To truly draw the conclusion that the current outcomes are directly connected to and mediated via an impact on the host circadian clock, it would be ideal to perform these studies in a circadian gene knock-out animal (i.e., Cry1 or Cry2 KO mice, or perhaps Bmal-VilCre tissue-specific KO mice). If the effects are lost in these animals, this would more concretely connect the current findings to the circadian clock gene network.

      We agree that these would be interesting experiments to follow up on the question of how the observed effects are actuated by host functions. However, because they would require a large amount of preparatory work (including rederiving the KO mice to get them germ-free in our gnotobiotic facility), we argue that they are beyond the scope of this study.

      Despite these reservations, the work is promising.

      We thank the reviewer for their encouraging assessment.

      Strengths:

      Attempting to disentangle nutrient acquisition from microbial fermentation and its impact on diurnal dynamics of gut microbes on host circadian rhythms is an important step for providing insights into these host-microbe interactions.

      The authors utilize a novel approach in leveraging lactulose coupled with germ-free animals and metabolic cages fitted with detectors that can measure microbial byproducts of fermentation, particularly hydrogen, in real-time.

      The authors consider several interesting aspects of lactulose delivery, including how it shifts osmotic balance as well as provides calculations that attempt to explain the caloric contribution of fermentation to the animal in the context of reduced food intake. This provides interesting fundamental insights into the role of microbial outputs on host metabolism.

      Thank you!

      Weaknesses:

      While the authors have done a large amount of work to examine the osmotic vs. metabolic influence of lactulose delivery, the authors have not accounted for the enlarged cecum and increased cecal surface area in germ-free mice. The authors could consider an additional control of cecectomy in germ-free mice.

      We thank the reviewer for pointing out the potential effect of the anatomical differences between germ-free and conventionally colonized mice. We agree that when comparing germ-free mice to SPF mice, the enlarged cecum area in germ-free animals could lead to differences in water release or uptake. However, this is not the case in the gnotobiotic mice colonized with our minimal microbiota, which have cecum sizes comparable to those of germ-free mice, and thus comparing water transport over the cecum wall between those groups can be done without correcting for cecal surface areas. We will add information on cecum sizes in the different experimental groups to a revised version of the manuscript.

      The authors have examined GI hormones as one possible mechanism for how food intake is altered by microbial fermentation of lactulose. However, the authors measure PYY and GLP-1 only at a single time point, stating that there are no differences between groups. Given the goal of the studies is to tie these findings back into circadian rhythms, it would be important to show if the diurnal patterns of these GI hormones are altered.

      We fully agree that a deeper investigation of the diurnal fluctuations of hormone levels would be an interesting next step in studying whether perturbations in food intake can disturb these rhythms. Doing this for the whole rhythm would really require a full second study. For a revised version of this manuscript, we will add a second time-point of hormone measurements (during the fed phase) to this study. In addition, we will soften the statements made around these data to point out only that hormone level fluctuations could not be detected during specific time points after lactulose treatment, and therefore do not seem to explain the immediate behavioral changes.

      Considerations of other factors, such as conjugated vs. deconjugated bile acids, microbial bile salt hydrolase activity, and bile acid resorption, might be an important consideration for how lactulose elicits more influence on ileal circadian clock genes relative to cecum and colon.

      We absolutely agree that investigation of microbial bile acid modification and their metabolism by the host would be an interesting topic for a follow-up study.

      Measurements of GI transit time (both whole gut and regional) would be an important consideration for how lactulose might be impacting the ileum vs. cecum vs. colon.

      This is also an interesting point, and we will add an assessment of transit time to a revised version of the manuscript.

    1. Author response:

      General comment:

      "This important study examined neuronal activity in the dentate nucleus of the cerebellum when monkeys performed a difficult perceptual decision-making task. The authors provide convincing evidence that the cerebellum represents sensory, motor, and behavioral outcome signals that are sent to the attentional system, but further analysis focusing on the disparity of performance between animals would improve the quality of the paper. This paper is of great general interest in that it shows the involvement of the cerebellum in cognitive processes at the neuronal level."

      We thank you for these general comments, and we agree with all of them. 

      Public Reviews (Reviewer #1):

      Summary:

      Recordings were made from the dentate nucleus of two monkeys during a decision-making task. Correlates of stimulus position and stimulus information were found to varying degrees in the neuronal activities. 

      We agree with this summary.

      Strengths:

      A difficult decision-making task was examined in two monkeys.

      We agree with this statement.

      Weaknesses:

      One of the monkeys did not fully learn the task. The manuscript lacked a coherent hypothesis to be tested, and no attempt was made to consider the possibility that this part of the brain may have little to do with the task that was being studied. 

      We understand these comments. It is correct that one of the monkeys did not fully learn the task, but it should be noted that both monkeys learned significantly above chance level, and we therefore find the recordings of both monkeys useful. We tested the hypothesis that neurons of the dentate nucleus can dynamically modulate their activity during a visual attention task, comprising not only sensorimotor but also cognitive attentional components. We agree that this hypothesis should be spelled out more explicitly in the introduction, which we will do in the revised version. We also appreciate the comment of this Reviewer that in our original submission we did not show our attempt to consider the possibility that this part of the brain may have little to do with the task that was being studied. We in fact did consider this possibility in that we applied muscimol to the dentate nucleus in one of the monkeys. The data of this one successful experiment show that the behaviour was reversibly affected in line with our hypothesis. Given that this only concerned one of the monkeys, we preferred not to present these data in the article. However, as the Reviewer correctly points out that this question remains unresolved, we will show them in our formal rebuttal letter. Please note that we decided to focus at the end of our research project on the tracing experiments, showing in both monkeys the connections of the dentate nucleus with the regions that are involved in attention. As a result, both monkeys have been sacrificed and we can no longer expand upon our muscimol experiments (which would indeed have been useful).

      Last but not least, given the comments of the Reviewers, we will also add a Supplementary figure to Figure 2, in which we will present the data for both monkeys separately and provide our interpretation. This may help to strengthen our conclusions. 

      Public Reviews (Reviewer #2):

      The authors trained monkeys to discriminate peripheral visual cues and associate them with planning future saccades of an indicated direction. At the same time, the authors recorded single-unit neural activity in the cerebellar dentate nucleus. They demonstrated that substantial fractions of DN cells exhibited sustained modulation of spike rates spanning task epochs and carrying information about stimulus, response, and trial outcome. Finally, tracer injections demonstrated this region of the DN projects to a large number of targets including several known to interconnect the visual attention network. The data compellingly demonstrate the authors' central claims, and the analyses are well-suited to support the conclusions. Importantly, the study demonstrates that DN cells convey many motor and nonmotor variables related to task execution, event sequencing, visual attention, and arguably decision-making/working memory.

      We thank the Reviewer for this positive and constructive feedback.

    2. eLife assessment

      This important study examined neuronal activity in the dentate nucleus of the cerebellum when monkeys performed a difficult perceptual decision-making task. The authors provide convincing evidence that the cerebellum represents sensory, motor, and behavioral outcome signals that are sent to the attentional system, but further analysis focusing on the disparity of performance between animals would improve the quality of the paper. This paper is of great general interest in that it shows the involvement of the cerebellum in cognitive processes at the neuronal level.

    3. Reviewer #1 (Public Review):

      Summary:

      Recordings were made from the dentate nucleus of two monkeys during a decision-making task. Correlates of stimulus position and stimulus information were found to varying degrees in the neuronal activities.

      Strengths:

      A difficult decision-making task was examined in two monkeys.

      Weaknesses:

      One of the monkeys did not fully learn the task. The manuscript lacked a coherent hypothesis to be tested, and no attempt was made to consider the possibility that this part of the brain may have little to do with the task that was being studied.

    4. Reviewer #2 (Public Review):

      The authors trained monkeys to discriminate peripheral visual cues and associate them with planning future saccades of an indicated direction. At the same time, the authors recorded single-unit neural activity in the cerebellar dentate nucleus. They demonstrated that substantial fractions of DN cells exhibited sustained modulation of spike rates spanning task epochs and carrying information about stimulus, response, and trial outcome. Finally, tracer injections demonstrated this region of the DN projects to a large number of targets including several known to interconnect the visual attention network. The data compellingly demonstrate the authors' central claims, and the analyses are well-suited to support the conclusions. Importantly, the study demonstrates that DN cells convey many motor and nonmotor variables related to task execution, event sequencing, visual attention, and arguably decision-making/working memory.

    1. eLife assessment

      How secretion is regulated during cell division and how membrane trafficking factors cooperate with the cytoskeleton during cell division remain poorly understood. In this work the authors find potential direct interactions between the polymeric septin cytoskeleton and the exocyst complex, using fission yeast as a model organism. The work provides a valuable body of new information that will be of great interest to the cell biology community. The evidence is strong and rigorous in many places but is incomplete in other respects.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Singh, Wu and colleagues explore functional links between septins and the exocyst complex. The exocyst is a conserved octameric complex that mediates the tethering of secretory vesicles for exocytosis in eukaryotes. In fission yeast cells, the exocyst is necessary for cell division, where it localizes mostly at the rim of the division plane, but septins, which localize in a similar manner, are non-essential. The main findings of the work are that septins are required for the specific localization of the exocyst to the rim of the division plane, and the likely consequent localization of the glucanase Eng1 at this same location, where it is known to promote cell separation. In the absence of septins, the exocyst still localizes to the division plane but is not restricted to the rim. They also show some defects in the localization of secretory vesicles and glucan synthase cargo. They further propose that interactions between septins and exocysts are direct, as shown through Alphafold2 predictions (of unclear strength) and clean coIP experiments.

      Strengths:

      The septin, exocyst and Eng1 localization data are well supported, showing that the septin rim recruits the exocyst and (likely consequently) the Eng1 glucanase at this location. One major finding of the manuscript is that of a physical interaction between septins and exocyst subunits. Indeed, many of the coIPs supporting this discovery are very clear.

      Weaknesses:

      I am less convinced by the strength of the physical interaction of septins with the exocyst complex. Notably, one important open question is whether septins interact with the intact exocyst complex, as claimed in the text, or whether the interactions occur only with individual subunits. The two-hybrid and coIP data only show weak interactions with individual subunits, and some coIPs (for instance Sec3 and Exo70 with Spn1 and Spn4) are negative, suggesting that the exocyst complex does not remain intact in these experiments. Given the known structure of the full exocyst complex and septin filaments (at least in S. cerevisiae), the Alphafold2 predicted structure could be used to probe whether the proposed interaction sites are compatible with full complex formation.

      The effect of spn1∆ on Eng1 localization is very clear, but the effect on secretory vesicles (Ypt3, Syb1) and glucan synthase Bgs1 is less convincing. The effect is small, and it is not clear how the cells are matched for the stage of cytokinesis.

    3. Reviewer #2 (Public Review):

      Summary:

      This interesting study implicates the direct interaction between two multi-subunit complexes, known as the exocyst and septin complexes, in the function of both complexes during cytokinesis in fission yeast. While previous work from several labs had implicated roles for the exocyst and septin complexes in cytokinesis and cell separation, this study describes the importance of protein:protein interaction between these complexes in mediating the functions of these complexes in cytokinesis. Previous studies in neurons had suggested interactions between septins and exocyst complexes occur but the functional importance of such interactions was not known. Moreover, in baker's yeast where both of these complexes have been extensively studied - no evidence of such an interaction has been uncovered despite numerous studies which should have detected it. Therefore while exocyst:septin interactions appear to be conserved in several systems, it appears likely that budding yeast are the exception--having lost this conserved interaction.

      Strengths:

      The strengths of this work include the rigorous analysis of the interaction using multiple methods including Co-IP of tagged but endogenously expressed proteins, 2 hybrid interaction, and Alphafold Multimer. Careful quantitative analysis of the effects of loss of function in each complex and the effects on localization and dynamics of each complex was also a strength. Taken together this work convincingly describes that these two complexes do interact and that this interaction plays an important role in post Golgi vesicle targeting during cytokinesis.

      Weaknesses:

      The authors used Alphafold Multimer to predict (largely successfully) which subunits were most likely to be involved in direct interactions between the complexes. It would be very interesting to compare this to a parallel analysis on the budding yeast septin and exocyst complexes where it is quite clear that detectable interactions between the exocyst and septins (using the same methods) do not exist. Presumably the resulting pLDDT scores will be significantly lower. These are in silico experiments and should not be difficult to carry out.

    4. Reviewer #3 (Public Review):

      Septins in several systems are thought to guide the location of exocytosis, and they have been found to interact with the exocyst vesicle-tethering complex in some cells. However, it is not known whether such interactions are direct or indirect. Moreover, septin-exocyst physical associations were not detected in several other systems, including yeasts, making it unclear whether such interactions reflect a conserved septin-exocytosis link or whether they may be missed if they depend on septin polymerization or association into higher-order structures. Singh et al. set out to define whether and how septins influence the exocyst during S. pombe cytokinesis. Based on three lines of evidence, the authors conclude that septins directly bind to exocyst subunits to regulate localization of the exocyst and vesicle secretion during cytokinesis.

      The conclusions are consistent with the data presented, but some interpretations need to be clarified and extended:

      (1) The first line of evidence examines septin and exocyst localization during cytokinesis in wild-type and septin-mutant or exocyst-mutant yeast. Quantitative imaging convincingly shows that the detailed localization of the exocyst at the division site is perturbed in septin mutants, and that this is accompanied by modest accumulation of vesicles and vesicle cargos. Whether that is sufficient to explain the increased thickness of the division septum in septin mutants remains unclear.

      (2) The second line of evidence involves a comprehensive Alphafold2 analysis of potential pair-wise interactions between septin and exocyst subunits. This identifies several putative interactions in silico, but it is unclear whether the identified interaction surfaces would be available in the full septin or exocyst complexes.

      (3) The third line of evidence uses co-immunoprecipitation and yeast two hybrid assays to show that several physical interactions predicted by Alphafold2 can be detected, leading the authors to conclude that they have identified direct interactions. However, both methods leave open the possibility that the interactions are indirect and mediated by other proteins in the fission yeast extract (co-IP) or budding yeast cell (two-hybrid).

      (4) Based on prior studies it would be expected that the large majority of both septins and exocyst subunits are present in cells and extracts as stoichiometric complexes. Thus, one would expect any septin-exocyst interaction to yield associations detectable with multiple subunits, yet co-IPs were not detected in some combinations. It is therefore unclear whether the interactions reflect associations between fully-formed functional complexes or perhaps between transient folding intermediates.

    1. eLife assessment

      This useful study examined the associations of a healthy lifestyle with comprehensive and organ-specific biological ages defined using common blood biomarkers and body measures. Its large sample size, longitudinal design, and robust statistical analysis provide solid support for the findings, which will be of interest to epidemiologists and clinicians.

    2. Reviewer #1 (Public Review):

      Summary:

      This study aimed to examine the associations of a healthy lifestyle with comprehensive and organ-specific biological ages. It emphasized the importance of lifestyle factors in biological ages, which were defined using common blood biomarkers and body measures.

      Strengths:

      The data were from a large cohort study and defined comprehensive and six-specified biological ages.

      Weaknesses:

      (1) Since only 8.5% of participants from the CMEC (China Multi-Ethnic Cohort Study) were included in the study, has any selection bias occurred?

      (2) The authors should specify the efficiency of FFQ. How can FFQ genuinely reflect the actual intake? Moreover, how was the aMED calculated?

      (3) HLI (range) and HLI (category) should be clearly defined.

      (4) The comprehensive rationale and each specific BA construction should be clearly defined and discussed. For example, can cardiopulmonary BA be reflected only by using cardiopulmonary status? I do not think so.

      (5) The lifestyle index is defined based on an equal-weight approach, but this does not reflect reality and cannot fully answer the research questions it raises.

    3. Reviewer #2 (Public Review):

      This interesting study focuses on the association between lifestyle factors and comprehensive and organ-specific biological aging in a multi-ethnic cohort from Southwest China. It stands out for its large sample size, longitudinal design, and robust statistical analysis.

      Some issues deserve clarification to enhance this paper:

      (1) How were the biochemical indicators for organ-specific biological ages chosen, and are these indicators appropriate? Additionally, a more detailed description of the multi-organ biological ages should be provided to help understand the distribution and characteristics of BAs.

      (2) The authors categorized the HLI score into a dichotomous variable, which may cause a loss of information. How did the authors address this potential issue?

      (3) Because lifestyle data are self-reported, they may suffer from recall bias. This issue needs to be addressed in the limitations section.

      (4) It should be clarified whether the adjusted CA is the baseline value of CA. Additionally, why did the authors choose models with additional adjustments for time-invariant variables as their primary analysis? This approach does not align with standard FEM analysis (Lines 261-263).

      (5) How is the relative contribution calculated in the QGC analysis? The relative contribution of some lifestyle factors is not shown in Figure 2 and the supplementary figures, such as Supplementary Figure 7. These omissions should be explained.

    1. eLife assessment

      The study reports an important finding on the role of the global metabolic regulator Crp/cAMP in the formation of antibiotic persister Escherichia coli. The evidence supporting the claims is solid including metabolomic analysis and characterization of many mutant strains. However, batch culture-based methodologies are unreliable for studying the properties of persister cells that comprise only a fraction of the population and therefore leave the work incomplete.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors set out to understand the role played by a key global metabolic regulator called Crp/cAMP in the formation of persister Escherichia coli that survive antibiotic treatment without acquiring genetic mutations.

      In order to achieve this aim, the authors employ an interdisciplinary approach exquisitely integrating standard microbiology assays with cutting-edge genomic, metabolomic, and proteomic screening.

      The data presented by the authors convincingly demonstrate that the deletion of two key genes that are part of the Crp/cAMP complex (i.e. crp and cyaA) leads to a significant decrease in the number of persisters, thus pointing towards a key role played by the Crp/cAMP complex in the formation of persisters in E. coli.

      The data presented also demonstrate that deletion of the crp gene leads to an overall decrease in energy metabolism and an overall increase in anabolic metabolism at the population level. It is not clear either what the contribution of the cyaA gene is in this respect, or why the deletion of cyaA has an opposite effect on cAMP concentration compared to crp deletion, although the authors present two reasonable untested hypotheses in the discussion. The authors might also want to explicitly acknowledge that these key data are obtained at the whole population level rather than at the level of the persister subpopulation.

      Finally, the authors convincingly show that the persisters they investigated are non-growing and have a higher redox activity and that the deletion of key genes involved in energy metabolism leads to a decrease in the number of persisters.

      These data will be key for future investigations on the biochemical mechanisms that allow bacteria to adapt to stressors such as nutrient depletion or exposure to antibiotics. As such this work will likely have an impact in a variety of fields such as bacterial biochemistry, antimicrobial resistance research, and environmental microbiology.

      Strengths:

      Interdisciplinary approach.
      Excellent use of replication and ensuring reproducibility.
      Excellent understanding and presentation of the biochemical mechanisms underpinning bacterial physiology via an integrated genomic, metabolomic, and proteomic screening.

      Weaknesses:

      Two genes from the Crp/cAMP complex (crp and cyaA) are hypothesised to be key for persistence but key metabolomics and proteomics data are obtained from only one deletion mutant in the crp gene.

      Deletions of crp and cyaA have opposite effects on the concentration of cAMP; a comparison of metabolomics and proteomics data obtained using both mutants might aid in understanding this difference.

      Metabolomics, proteomics, and metabolic activity data are obtained at the whole population level rather than at the level of the persister sub-population.

    3. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Ngo et al investigated how bacterial persisters form in early and late stationary phases and found that cAMP-Crp regulated metabolic reprogramming affects persister formation that occurs in the late but not early stationary phase. Further metabolomic, proteomic, and genomic screening studies point to TCA cycle, ATP synthesis, respiratory chains, and oxidative phosphorylation correlating with persister abundance. If these conclusions can be solidly drawn, the work would add some new understanding of the underexplored topic of how persisters form.

      Strengths and weaknesses:

      Although the topic of understanding how persisters form is interesting and thus can be counted as a strength of the paper, most of the conclusions drawn by the authors are, at best, on shaky ground due to the following weaknesses.

      (1) The approaches used here are aimed at the major bacterial population, yet the authors used data reflecting the behavior of the major population to interpret the physiology of persister cells, which comprise less than 1% of the bacterial population. How can they pick a needle out of the haystack without being fooled by spill-over artifacts from the major population? Although it is probably very difficult to isolate and directly assay persister cells, conclusions of the type proposed by the authors cannot be firmly established without such assays. Perhaps introducing the cyaA/crp mutations into the best example of persistence, the hipA-7 high-persistence phenotype, may clarify this issue to a certain extent.

      (2) The authors overlooked/omitted a recently published work regarding cyaA and crp (PMID: 35648826). In that work, a deficiency in cyaA or crp confers tolerance to diverse types of lethal stressors, including all lethal antimicrobials tested. How would a mutation conferring pan-tolerance to the major bacterial population lead to a less protective effect in a minor subpopulation? The authors are obligated to discuss this paradox in the context of their work because that is the most relevant literature for the present work. It would also be very interesting if the cyaA/crp deficiency really has opposing effects on tolerance and persistence. As a note, most of the conclusions from the omics studies of the present work have been reached in that overlooked literature, which addresses mechanisms of tolerance, a major rather than a minor population behavior. That supports comment #1 above. The authors' inability to observe a tolerance phenotype with the cyaA or crp mutant possibly derives from the extremely high antimicrobial concentrations used in the study, which prevent the tolerance phenotype from being observed because tolerance is sensitive to antimicrobial concentration while persistence is not.

      (3) The authors overly stressed the effect of cyaA/crp on persister formation but failed to test an alternative explanation of their effect on persister waking up after antimicrobial treatment. If the cyaA/crp-derived persisters are put into deeper sleep during antimicrobial treatment than wildtype-derived persisters, a 16-h recovery growth might have underestimated viable bacteria. This is often the case especially when extremely high concentrations of antimicrobials are used in performing persister assay. Thus, at least a longer incubation time (e.g. 48 and 72h) of agar plates for persister viable count needs to be performed to test such a scenario.

      (4) The rationale for using extremely high drug concentrations to perform the persister assay is unclear. There are two issues with using extremely high drug concentrations. First, when overly high concentrations are used, drug removal becomes difficult. For example, two washes will not bring the drug concentration from > 100 x MIC to below MIC. This is especially problematic with aminoglycosides because drug removal by washing does not work well with this class of compound. Second, overly high drug concentrations may make killing so rapid and severe that it masks any difference between the mutants and the wild-type control strain. In such cases, one would need to measure killing over a wide range of drug concentrations to find the right window to show a difference. The gentamicin data in the present work are likely such a case and need to be carefully examined. The mutants and the wild-type strain have very different MICs for gentamicin, but a single absolute drug concentration rather than concentrations normalized to MIC was used. This is like comparing a 12-year-old with a 21-year-old in a 100-meter dash, which is highly inappropriate.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors describe how E. coli in the late stationary phase have an active TCA cycle and respiration. Mutation of crp results in the down-regulation of TCA cycle genes and an upregulation of anabolic pathways and reduced persisters. Mutation of a variety of metabolic genes also resulted in fewer persisters in the late-stationary phase.

      Strengths:

      The work is vast, including metabolomic analysis and characterization of a large number of mutant strains. The identification of active respiration being required for persister cell survival in the late stationary phase is interesting. The induction of anabolic pathways resulting in the sensitization of bacteria to antibiotics is possibly the most interesting part of the paper.

      Weaknesses:

      The authors try to draw too many conclusions and it's difficult to identify what their actual findings are. For instance, they do not have any interesting findings with aminoglycosides but include the data and spend a lot of time discussing it, but it is really a distraction. The correlation between the induction of anabolic pathways in the crp mutant in the late stationary phase and the reduction in persisters is potentially very interesting but is buried in the paper with the vast quantities of data, and observations and conclusions that are often not well substantiated.

      The discussion section is particularly difficult to read and I recommend a large overhaul to increase clarity. For instance, what are the authors trying to conclude in section (iii) of the discussion? That persisters in the stationary phase have higher energy than other cells? Is there data to support that? All sections are similarly lacking in clarity.

      The large number of mutants characterized is a strength, but the quality of the data provided for those experiments is poor. Did some of these mutants lose fitness in the deep stationary phase in the absence of antibiotics? Did some reach a far lower cfu/ml in the stationary phase? These details are important and without them, it is difficult to interpret the data.

      There is ample analysis of persister formation in mutants in the pts/CRP pathway that is not discussed (Zeng et al PNAS 2022, Parsons et al PNAS, 2024).

      The authors do not discuss ROS production and antibiotic killing in these experiments. Presumably, the WT would have a greater propensity to produce ROS in response to antibiotics than the crp mutant, but it survives better. Is ROS not involved in antibiotic killing in these conditions?

    1. eLife assessment

      This study provides compelling data regarding the molecular characterization of a rare tumor type with few treatment options. This fundamental work significantly advances our mechanistic understanding of solitary fibrous tumours, a critical first step towards targeted precision medicine approaches. The results of this study will be of broad interest to cancer biologists and experimental oncologists.

    2. Joint Public Review:

      Solitary Fibrous Tumors (SFTs) are a rare malignancy defined by NAB2-STAT6 fusions. Because the molecular understanding of the disease is largely lacking, there are currently no targeted treatment approaches. Using primary tumor and adjacent normal tissue samples and cells inducibly expressing NAB2-STAT6, Hill et al. perform a detailed characterization of the transcriptomic and epigenomic NAB2-STAT6 SFT signatures. They identify enrichment of EGR1/NAB2 (but not STAT6) sites bound by the fusion protein and increased expression of EGR1 targets. Their studies indicate that NAB2-STAT6 fusion may direct the nuclear translocation of NAB2 and EGR1 proteins and potentially NAB1. Transcriptionally, NAB2-STAT6 SFTs most closely resemble neuroendocrine tumors.

      This pioneering study provides critical insight into the molecular pathogenesis of SFTs, pivotal for the future development of mechanistically informed treatment approaches. The study is rigorously executed and well-written. This new knowledge is an important addition to the field. Recommendations for minor improvements can be made.

    1. eLife assessment

      This is a useful study that generated a rich inventory of genetic interactions with the potential to produce new insight into the molecular function of Bam-associated proteins. The interactions with genes of unknown function are of special interest as they may suggest experiments to find the functions of these genes. The overall data provided to support their conclusions is solid, but there is a major concern with known polar effects on certain mutations, which should be addressed by complementation.

    2. Reviewer #1 (Public Review):

      Summary:

      The overall goal of the manuscript is to delineate pathways that are conditionally essential with the Bam complex and associated chaperones. The Bam complex is made of several proteins, including BamA and BamD, which are essential. The protein complex works to insert proteins in the asymmetric outer membrane. Substrates are translated in the cytoplasm prior to transport across the cell envelope to the Bam complex. Transport includes non-essential periplasmic chaperones, SurA, Skp, and DegP. According to the authors, the pathways were assumed to be redundant. The Bam complex also includes non-essential components, BamBCE. These were thought to be accessory components that interact with BamA and BamD to coordinate optimal activity. While some roles have been assigned to BamE and BamB, a detailed understanding of the role of each accessory Bam protein is lacking. In this study, more specific roles for each non-essential Bam component are proposed.

      Strengths:

      The overall findings are intriguing and could advance our understanding as to how the Gram-negative cell envelope is assembled. These studies could provide new targets for antimicrobial treatment. In general, the manuscript was well-written.

      Weaknesses:

      While the overall findings are interesting, I had some concerns with the data analysis, presentation, and conclusions. Not all the conclusions are supported by data. The proposed revisions include experimental and editorial work. The manuscript is generally well-written and could provide impactful data to advance the field if the concerns are addressed.

      Major concerns:

      Overall Comments:

      (1) The cutoffs the authors used to define "conditionally essential" mutants are not reported. The results also lack validation for lethality using a titratable system. It would be ideal to validate several genes in each dataset to determine cutoffs (i.e. 5-fold decrease in insertion mutants) for conditional lethality. It was not done (or described) here.

      (2) Also, two mutations that both make the cells sick could provide an additive effect (i.e. dapF and BamB), which doesn't necessarily mean the pathways are linked. The authors should revise their wording. They have not shown genetic linkage in some cases.

      (3) Mutations throughout the manuscript are not complemented. It would be ideal to add complementation data to show the gene-phenotype relationship is specific.

      (4) Also, I would argue the term "conditionally essential genes" should be replaced with "synthetically lethal". Strains were compared in the same conditions but with different genetic backgrounds.

    3. Reviewer #2 (Public Review):

      Summary:

      Bryant et al. apply phenotypic profiling and saturating transposon mutagenesis to investigate the role of the non-essential lipoproteins BamB, BamC, and BamE, along with chaperones DegP, Skp, and SurA, in the biogenesis of the bacterial outer membrane. This generated a set of genetic interactions that revealed that changes in LPS and outer membrane fluidity impact Bam activity, and that the cyclic form of enterobacterial common antigen becomes essential in the absence of the chaperone SurA. The study also uncovers that peptidoglycan crosslinking and DNA replication control are conditionally essential with the absence of certain Bam components, suggesting a coordination between outer membrane protein (OMP) biogenesis and other cellular processes such as lipid and peptidoglycan synthesis, as well as DNA replication.

      Strengths:

      (1) This is probably the first comprehensive analysis of genetic interactions involving Bam-associated proteins and should provide rich insight to refine the mechanistic understanding of this complex machine and the process of OM biogenesis.

      (2) Good quality data and analysis. Well-presented manuscript.

      Weaknesses:

      (1) An important control in any genetic interaction study is to do complementation tests to demonstrate that the phenotype observed is indeed due to the missing gene under analysis. Although the Keio library was designed to avoid polar effects, it is impossible to predict other undesirable effects of the deletions (hitting of a non-annotated sRNA or RNA stability effects, for example). Thus, before one can safely conclude that a proposed genetic interaction is real, complementation tests should be carried out. This seems particularly important in the case of a new and surprising interaction, such as that between bamB and DNA replication and repair genes.

      (2) Why not include the suppressor interactions in the work? There are probably plenty, and in principle, they should be as informative as the conditional essential (or synthetic lethal) ones. The only one highlighted in the paper is that between bamB and diaA, since it nicely fits with the synthetic lethal effects with initiation inhibitors seqA and hda. Even if the authors cannot make sense of the suppressor interactions, their inclusion in the paper should make the dataset richer and more valuable to the community.

      (3) The enrichment analysis in Figure 2B deserves some clarification. What is the meaning of gene ratio? How can single genes of a pathway yield an enrichment signal? Why weren't seqA and hda included in the DNA replication class in 2B?

      (4) The writing puts too much emphasis on demonstrating that bam lipoproteins and chaperones are specialized instead of fully redundant. However, I have the impression this is a long-settled conclusion in the field, as the manuscript itself describes at several points when reviewing the literature.

    4. Reviewer #3 (Public Review):

      In this work, Bryant, et al. investigate genetic interactions between non-essential members of the outer membrane protein biogenesis pathway and other genes in the genome using a transposon-directed insertion sequencing (TraDIS) approach in E. coli K-12. The authors identify interactions with other components of the envelope including LPS, peptidoglycan, and enterobacterial common antigen biogenesis, and they tie these interactions to specific members of the outer membrane biogenesis pathway. Although many of these interactions are known and have been previously investigated in the field, the study provides several synthetic phenotypes that could be useful for further investigations.

      The strengths of the paper include their unbiased, TraDIS approach, and follow up on the interactions they observe. The interactions with genes of unknown function also are of interest as they may suggest experiments to find the functions of these genes. The largest weakness of this paper is the use of a gene deletion allele for bamB that is known to be polar leading to decreased expression of an essential gene. This largely invalidates all results related to DNA replication. In addition, it is a weakness that the paper does not adequately address its place in the field through discussion of existing results on the interactions they investigate.

    5. Author response:

      We would like to thank the reviewers for their time and for their kind comments about our work. We expect that their comments will help us to improve the manuscript and so will plan the following experiments/revisions to address some of their comments:

      Reviewer 1 (Public Review):

      (1) The cutoffs the authors used to define "conditionally essential" mutants are not reported. The results also lack validation for lethality using a titratable system. It would be ideal to validate several genes in each dataset to determine cutoffs (i.e. 5-fold decrease in insertion mutants) for conditional lethality. It was not done (or described) here.

      We will report the cutoffs used when we generate the revised manuscript. Our experiments identified hundreds of lethal combinations across six datasets; validating several genes from each would require generating at least 20 depletion strains and testing each one. Validation using a depletion system would therefore be a significant undertaking and is typically not the standard when using these approaches. However, should time permit, we will attempt a subset of these experiments.
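As a purely illustrative sketch of the kind of fold-change cutoff the reviewer mentions (not the authors' pipeline), the following flags genes whose insertion counts drop at least 5-fold in a mutant background relative to the parent. The function name, the pseudocount, and the toy counts are all assumptions made for illustration; a real analysis would also normalize for library size and gene length.

```python
def synthetic_lethal_candidates(parent_counts, mutant_counts,
                                fold_cutoff=5.0, pseudocount=1.0):
    """Return genes whose insertion count falls >= fold_cutoff-fold in the mutant.

    parent_counts / mutant_counts map gene name -> number of transposon
    insertions (hypothetical, already normalized). The pseudocount avoids
    division by zero when a gene has no insertions in the mutant library.
    """
    flagged = []
    for gene, parent in parent_counts.items():
        mutant = mutant_counts.get(gene, 0)
        fold_decrease = (parent + pseudocount) / (mutant + pseudocount)
        if fold_decrease >= fold_cutoff:
            flagged.append(gene)
    return sorted(flagged)


# Toy, made-up insertion counts (not data from the study).
parent = {"geneA": 120, "geneB": 95, "geneC": 40}
mutant = {"geneA": 110, "geneB": 3, "geneC": 38}
candidates = synthetic_lethal_candidates(parent, mutant)
print(candidates)
```

Raising or lowering `fold_cutoff` changes how many genes are called, which is why reporting the threshold matters for interpreting the resulting gene lists.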

      (2) Also, two mutations that both make the cells sick could provide an additive effect (i.e. dapF and BamB), which doesn't necessarily mean the pathways are linked. The authors should revise their wording. They have not shown genetic linkage in some cases.

      We will revise the text to address this.

      (3) Mutations throughout the manuscript are not complemented. It would be ideal to add complementation data to show the gene-phenotype relationship is specific.

      We thank the reviewers for highlighting this and will complete the complementation experiments.

      (4) Also, I would argue the term "conditionally essential genes" should be replaced with "synthetically lethal". Strains were compared in the same conditions but with different genetic backgrounds.

      We take the reviewer's point and will revise the text accordingly.

      Reviewer 2 (Public Review):

      Weaknesses:

      (1) An important control in any genetic interaction study is to do complementation tests to demonstrate that the phenotype observed is indeed due to the missing gene under analysis. Although the Keio library was designed to avoid polar effects, it is impossible to predict other undesirable effects of the deletions (hitting of a non-annotated sRNA or RNA stability effects, for example). Thus, before one can safely conclude that a proposed genetic interaction is real, complementation tests should be carried out. This seems particularly important in the case of a new and surprising interaction, such as that between bamB and DNA replication and repair genes.

      We thank the reviewers for highlighting this and will complete the complementation experiments.

      (2) Why not include the suppressor interactions in the work? There are probably plenty, and in principle, they should be as informative as the conditional essential (or synthetic lethal) ones. The only one highlighted in the paper is that between bamB and diaA, since it nicely fits with the synthetic lethal effects with initiation inhibitors seqA and hda. Even if the authors cannot make sense of the suppressor interactions, their inclusion in the paper should make the dataset richer and more valuable to the community.

      These data are available in supplementary table 1. However, we appreciate this is not obvious and so will make a new supplementary table and include a brief description of the data for the revised paper.

      (3) The enrichment analysis in Figure 2B deserves some clarification. What is the meaning of gene ratio? How can single genes of a pathway yield an enrichment signal? Why weren't seqA and hda included in the DNA replication class in 2B?

      We apologise for the confusion caused and will include a description of the analysis in the methods section.

      (4) The writing puts too much emphasis on demonstrating that bam lipoproteins and chaperones are specialized instead of fully redundant. However, I have the impression this is a long-settled conclusion in the field, as the manuscript itself describes at several points when reviewing the literature.

      We will revise the text to reduce this emphasis.

      Reviewer #3 (Public Review):

      In this work, Bryant, et al. investigate genetic interactions between non-essential members of the outer membrane protein biogenesis pathway and other genes in the genome using a transposon-directed insertion sequencing (TraDIS) approach in E. coli K-12. The authors identify interactions with other components of the envelope including LPS, peptidoglycan, and enterobacterial common antigen biogenesis, and they tie these interactions to specific members of the outer membrane biogenesis pathway. Although many of these interactions are known and have been previously investigated in the field, the study provides several synthetic phenotypes that could be useful for further investigations.

      The strengths of the paper include their unbiased, TraDIS approach, and follow up on the interactions they observe. The interactions with genes of unknown function also are of interest as they may suggest experiments to find the functions of these genes. The largest weakness of this paper is the use of a gene deletion allele for bamB that is known to be polar leading to decreased expression of an essential gene. This largely invalidates all results related to DNA replication. In addition, it is a weakness that the paper does not adequately address its place in the field through discussion of existing results on the interactions they investigate.

      We appreciate the reviewers’ comments and concerns about the bamB allele, and we will address these concerns by completing complementation experiments for the CRISPRi depletion experiments and the run-out assays. However, despite the statement that it is known to be polar, several previous studies have also used the bamB Keio library strain. Many of these studies transfer the allele to a clean background and use the derivative in which the cassette has been removed as we have done here (Cox et al., 2017, Gunasinghe et al., 2018, Psonis et al., 2019, Storek et al., 2019, Ranava et al. 2021, Steenhuis et al., 2021, Thewasano et al., 2023). Therefore, we feel somewhat justified in our choice of strain.

      We are unable to find a reference for the Keio bamB strain causing polar effects and would have appreciated the reviewers’ guidance here. However, we believe the concern about polar effects stems from the observations of Ruiz et al., (2005), in which it was observed that a yfgL::ISE1 allele causes polar effects. This was hypothesised to be due to the ORF contained within the IS being transcribed in the opposite orientation to yfgL and the downstream der gene. They subsequently observed that a strain carrying a Tn5KAN-I-SceI insertion in yfgL (yfgL::kan) did not cause polar effects and this was hypothesised to be due to the kan cassette being co-oriented with yfgL. In addition, Charlson et al., 2006 generated a yfgL deletion by replacing the majority of the gene with a kan cassette in a manner similar to that of the Keio library that was subsequently flipped out. This study also found no evidence of polar effects on der. In theory, the strain used here, and in previous studies by other groups, should provide minimal disruption to transcription through generation of a mini-gene from the original bamB sequence to maintain operon expression. This is in contrast to the disruption caused by the yfgL::ISE1 allele.

      While we do appreciate the concern, several pieces of evidence lend themselves to counter the statement that our strain choice largely invalidates the results. The der GTPase is essential, hence the concern about polar effects leading to the bamB phenotypes we see. However, depletion of der leads to cold sensitivity, whereas we find that the bamB strain used here actually performs better in colder temperatures. In addition, the der depletion is sensitive to doxycycline, whereas the bamB mutant has increased fitness in this condition (Fig 1) (Bharat and Brown, 2015, Hwang and Inouye, 2008). Hence, should the mutation lead to decreased expression of der then we would expect the bamB strain to phenocopy the der depletion, which it does not. Regardless of this information, we will still address these concerns by completing complementation experiments.

    1. eLife assessment

      This valuable study confirms the roles of Dact1 and Dact2, two factors involved in Wnt signaling, during zebrafish gastrulation and demonstrates their genetic interactions with other Wnt components to modulate craniofacial morphologies. The limitation of the study is that it does not distinguish primary from secondary effects for each factor, precluding an unambiguous interpretation of their roles in craniofacial morphogenesis. The findings of a new potential target of dact1/2-mediated Wnt signaling are potentially of value; however, experimental evidence supporting their functional significance remains incomplete due to inconsistent results and the inherent limitations of the overexpression study.

    2. Reviewer #1 (Public Review):

      Summary:

      This study explores the roles of dact1 and dact2 in zebrafish embryonic axis formation and craniofacial morphogenesis. The researchers aim to uncover the mechanisms by which dact1/2 modulates Wnt signaling during embryonic development and patterning. They propose distinct spatiotemporal roles for Dact1 and Dact2 proteins in zebrafish embryonic development, particularly their involvement in modulating noncanonical Wnt signaling during convergent extension events. The findings demonstrate that dact1 and dact2 have unique spatiotemporal expression domains during development and that mutations in dact1/2 lead to convergent extension defects. Furthermore, the study attempts to link these defects to craniofacial abnormalities resulting from dact1/2 mutations. Compound mutants were used to investigate the connection between dact1 and dact2, as single mutants did not exhibit craniofacial phenotypes. The research also includes comprehensive transcriptomics and pathway analyses of differentially expressed genes in dact1/2 mutants, revealing the overexpression of calpain 8, a calcium-dependent cysteine protease. The study suggests that the upregulation of calpain 8 is linked to the observed craniofacial dysmorphology in dact1/2 mutants, implying a potential connection between calpain 8 expression and craniofacial abnormalities.

      Strengths:

      • The study effectively recapitulates previous findings on the role of dact1/2 in modulating convergent extension during zebrafish embryogenesis.
      • A combination of multiple approaches, including in vivo time-lapse imaging, is used to elucidate the etiology of the rod-like neurocranial phenotype in dact1/2 double mutants.
      • The study utilizes both traditional and newly created mutant lines, analyzing them through single-cell transcriptomics.

      Weaknesses:

      (1) The authors successfully addressed reviewers' suggestions with revised experiments and explanations. However, the overall narrative struggles to build a more coherent storyline.

      (2) The potential activity of truncated and upregulated dact mRNAs (Fig S2) and partially functional dact proteins needs further clarification.

      (3) Data-rich figures, specifically Figs 6, 7, and 8D, could be simplified for better clarity.

    3. Reviewer #2 (Public Review):

      Summary:

      Non-canonical Wnt signaling plays an important role in morphogenesis, but how different components of the pathway are required to regulate different developmental events remains an open question. This paper focuses on elucidating the overlapping and distinct functions of dact1 and dact2, two Dishevelled-binding scaffold proteins, during zebrafish axis elongation and craniofacial development. By combining genetic studies, detailed phenotypic analysis, lineage tracing, and single cell RNA-sequencing, the authors aimed to understand (1) the relative function of dact1/2 in promoting axis elongation, (2) their ability to modulate phenotypes caused by mutations in other non-canonical wnt components, and (3) pathways downstream of dact1/2.

      Corroborating previous findings, this paper showed that dact1/2 is required for convergent extension during gastrulation and body axis elongation. Strong qualitative evidence was also provided to support dact1/2's role in genetically modulating non-canonical wnt signaling to regulate body axis elongation and the morphology of the ethmoid plate (EP). However, the spatiotemporal function of dact1/2 remains unknown. The use of scRNA-seq identified novel pathways and targets downstream of dact1/2. Calpain 8 is one such example, and its overexpression in some of the dact1/2+/- embryos was able to phenocopy the dact1/2-/- mutant EP morphology, pointing to its sufficiency in driving the EP phenotype in a few embryos. However, the same effect was not observed in dact1-/-; dact2+/- embryos, leading to the question of how significant calpain 8 really is in this context. The requirement of calpain 8 in mediating the phenotype is unclear as well. This is the most novel aspect of the paper, but some weaknesses remain in convincingly demonstrating the importance of calpain 8.

      Strengths:

      (1) The generation of dact1/2 germline mutants and the use of genetic approaches to dissect their genetic interactions with wnt11f2 and gpc4 provide unambiguous and consistent results that inform the relative functions of dact1 and dact2, as well as their combined effects.

      (2) Because the ethmoid plate exhibits a spectrum of phenotypes in different wnt genetic mutants, it is a useful system for studying how tissue morphology can be modulated by different components of the wnt pathway, as demonstrated in this study.

      (3) The authors leveraged lineage tracing by photoconversion to dissect how dact1/2 differentially impacts the ability of different cranial neural crest populations to contribute to the anterior neurocranium. This revealed that distinct mechanisms via dact1/2 and shh can lead to similar phenotypes.

      (4) The use of scRNA-seq was a powerful approach and identified potential novel pathways and targets downstream of dact1/2.

      Weaknesses:

      (1) Expression of dact1/2 and wnt11f2: Certain claims regarding the expression similarity between dact2 and wnt11f2 are not clearly demonstrated in the figures, and the text description of dact1/2 and wnt11f2 expression for the Daniocell scRNA-seq tool is also somewhat confusing. As the paper makes the claim that dact1/2 may function in the same pathway as wnt11f2, their expression should be accurately described and used to draw conclusions on the tissue types in which such signaling may take place.

      (2) Spatiotemporal function of dact1/2: Germline mutations limit the authors' ability to study a gene's spatiotemporal functional requirement. They, therefore, cannot concretely attribute nor separate early-stage phenotypes (during gastrulation) to/from late-stage phenotypes (EP morphological changes), which the authors postulated to result from secondary defects in floor plate and eye field morphometry.

      (3) The functional significance of calpain 8: The authors showed that calpain 8 was upregulated in the mutant and subsequently tested its function by overexpressing dact1/2 mRNA in embryos. While only 1 out of 142 calpain-overexpressing wild type animals phenocopied dact1/2 mutants, 7.5% of dact1/2+/- embryos did exhibit the phenotype. However, the same effect was not observed in dact1-/-; dact2+/- embryos and the requirement of calpain 8 in driving the phenotype remains unclear.

    4. Reviewer #3 (Public Review):

      Summary:

      In this manuscript the authors explore the roles of dact1 and dact2 during zebrafish gastrulation and craniofacial development. Previous studies used morpholino (MO) knockdowns to show that these scaffolding proteins, which interact with dishevelled (Dsh), are expressed during zebrafish gastrulation and suggested that dact1 promotes canonical Wnt/β-catenin signaling, while dact2 promotes non-canonical Wnt/PCP-dependent convergent-extension (Waxman et al 2004). This study goes beyond this work by creating loss-of-function mutant alleles for each gene and unlike the MO studies finds little (dact2) to no (dact1) phenotypic defects in the homozygous mutants. Interestingly, dact1/2 double mutants have a more severe phenotype, which resembles those reported with MOs as well as homozygous wnt11/silberblick (wnt11/slb) mutants that disrupt non-canonical Wnt signaling (Heisenberg et al., 1997; 2000). Further analyses in this paper try to connect gastrulation and craniofacial defects in dact1/2 mutants with wnt11/slb and other wnt-pathway mutants. scRNAseq conducted in mutants identifies calpain 8 as a potential new target of dact1/2 and Wnt signaling.

      Previous comments:

      Strengths:

      When considered separately the new mutants are an improvement over the MOs and the paper contains a lot of new data.

      Weaknesses:

      However, the hypotheses are very poorly defined and misinterpret key previous findings surrounding the roles of wnt11 and gpc4, which results in a very confusing manuscript. Many of the results are not novel and focus on secondary defects. The most novel result overexpressing calpain8 in dact1/2 mutants is preliminary and not convincing.

      Comment on the revised version:

      The authors addressed some of our comments, but not our main criticisms, which we reiterate here:

      (1) The authors argue that morpholino studies are unreliable and here they made new mutants to solve this uncertainty for dact1/2. However, creating stable mutant lines to largely confirm previous results obtained by using morpholino knock-down phenotypes does not justify publication in eLife.

      (2) The authors argue that since it has not been shown conclusively that craniofacial defects in wnt11 and dact1/2 mutants are secondary to gastrulation defects there is no solid evidence preventing them from investigating these craniofacial defects. However, since it is extremely likely that the rod-like ethmoid plates of wnt11f2- and dact1/2 mutants focused on here are secondary to gastrulation defects previously described by others (Heisenberg and Nüsslein-Volhard 1997; Waxman et al., 2004), the burden of proof is on the authors to provide much stronger evidence against this interpretation.

      (3) The data for calpain overexpression remains too preliminary.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Weakness 1. Enhancing Reproducibility and Robustness: To enhance the reproducibility and robustness of the findings, it would be valuable for the authors to provide specific numbers of animals used in each experiment. Explicitly stating the penetrance of the rod-like neurocranial shape in dact1/2-/- animals would provide a clearer understanding of the consistency of this phenotype. 

      In Fig. 3 and Fig. 4 animal numbers were added to the figure and figure legend (line 1111). In Fig. 5 animal numbers were added to the figure. We now state that dact1/2-/- animals exhibit the rod-like neurocranial shape that is completely penetrant (Line 260). 

      Weakness 2. Strengthening Single-Cell Data Interpretation: To further validate the single-cell data and strengthen the interpretation of the gene expression patterns, I recommend the following: 

      -Provide a more thorough explanation of the rationale for comparing dact1/2 double mutants with gpc4 mutants.

      -Employ genotyping techniques after embryo collection to ensure the accuracy of animal selection based on phenotype and address the potential for contamination of wild-type "delayed" animals.

      -Supplement the single-cell data with secondary validation using RNA in situ or immunohistochemistry techniques. 

      An explanation of our rationale was added to the results section (Lines 391-403) and a summary schematic was added to Figure 6 (panel A).

      Genotyping of the embryos was not possible but quality control analysis by considering the top 2000 most variable genes across the dataset showed good clustering by genotype, indicating the reproducibility of individuals in each group (See Supplemental Fig. 4).

      The gene expression profiles obtained in our single-cell data analysis for gpc4, dact1, and dact2 correlate closely with our in situ hybridization analyses. Further, our data is consistent with published zebrafish single-cell data. We validated our finding of increased capn8 expression in dact1/2 mutants by in situ hybridization. Therefore we are confident in the robustness of our single-cell data.  

      Weakness 3. Directly Investigating Non-Cell-Autonomous Effects: To directly assess the proposed non-cell-autonomous role of dact1/2, I suggest conducting transplantation experiments to examine the ability of ectodermal/neural crest cells from dact1/2 double mutants to form wild-type-like neurocranium.  

      The reviewer’s suggestion is an excellent experiment and something to consider for future work. Cell transplant experiments between animals of specific genotypes are challenging and require large numbers. It is not possible to determine the genotype of the donor and recipient embryos at the early timepoint (the 1,000-cell stage) at which the transplants would have to be done in the zebrafish. Each transplant would therefore have to be carried out blind to genotype from a dact1+/-; dact2+/- or dact1-/-; dact2+/- intercross, after which both animals would have to be genotyped and the phenotype of the transplant recipient analyzed. While possible, this is a monumental undertaking and beyond the scope of the current study.

      Weakness 4. Further Elucidating Calpain 8's Role: To strengthen the evidence supporting the critical role of Calpain 8, I recommend conducting overexpression experiments using a sensitized background to enhance the statistical significance of the findings. 

      We thank the reviewer for their suggestion and have now performed capn8 overexpression experiments in embryos generated from dact1/2 double heterozygous breeding. We found a statistically significant effect of capn8 overexpression in the dact1+/-,dact2+/- fish (Lines 462-464 and Fig. 8C,D). 

      Minor Comments:  

      Comment: Creating the manuscript without numbered pages, lines, or figures makes orientation and referencing harder.  

      Revised

      Comment: Authors are inconsistent in the use of font and adverbs, which requires extra effort from the reader. ("wntIIf2 vs wnt11f2 vs wnt11f2l"; "dact1/2-/- vs dact1/dact2 -/-"; "whole-mount vs wholemount vs whole mount").  

      Revised throughout.

      Comment: Multiple sentences in the "Results" belong to the "Materials and Methods" or the "Discussion" section. 

      We have worked to ensure that sentences are within the appropriate sections of the manuscript.

      Comment: Abstract:

      "wnt11f2l" should be "wnt11f2"  

      Revised (Line 24).

      Comment: Main text:

      Page 5 - citation Waxman, Hocking et al. 2004 is used 3x without interruption by any other citation. 

      Revised (Line 112).

      Page 9 - "dsh" mutant is mentioned once in the whole manuscript - is this a mistake?

      Revised, Rewritten (Line 196).

      Page 10 - Fig 2B does not show ISH.

      Revised (Line 229).

      Page 11 - "kyn" mutant is mentioned here for the first time but defined on page 15.

      Revised (Line 245). Now first described on page 4.

      Page 14 - "cranial CNN" should be CNCC.

      Revised. (Line 334)

      Page 16 - dact1/dact2/gpc4: Fig. 5C is used but it should be Fig 5E.

      Revised. (Line 381)

      Page 18 - dact1/2-/- or dact1-/-, dact2-/-. 

      Revised. (Line 428)

      Comment: Methods:

      Page 24 - ZIRC () "dot" is missing. ChopChop ")" is missing. "located near the 5' end of the gene" - In the Supplementary Figure 1 looks like in the middle of the gene.

      Revised. (Lines 600, 609, 611, respectively).

      Page 25 - WISH -not used in the main text.

      Revised. (Line 346).

      Page 26 - 4% (v/v) formaldehyde; at 4C - 4°C; 50% (v/v) ethanol; 3% (w/v) methylcellulose.

      Revised. (Lines 659, 660, 662).

      Page 27 - 0.1% (w/v) BSA. 

      Revised. (Line 668).

      Comment: Discussion:

      The overall discussion requires more references and additional hypotheses. On page 20, when mentioning 'as single mutants develop normally,' does this refer to the entire animals or solely the craniofacial domain? Are these mutants viable? If they are, it's crucial to discuss this phenomenon in relation to prior morpholino studies and genetic compensation.

      Observing how the authors interpret previously documented changes in nodal and shh signaling would be beneficial. While Smad1 is discussed, what about other downstream genes? Is shh signaling altered in the dact1/2 double mutants? 

      We have revised the Discussion to include more references (Lines 473, 476, 483, 488, 491, 499, 501, 502, 510, 515, 529, 557, 558) and additional hypotheses (Lines 503-505, 511-519, 522-525). We have added more specific information regarding the single mutants (Lines 270-275, 480-493, Fig. S3). We have added discussion of other downstream genes, including smad1 (Lines 561-572) and shh (Lines 572-580).

      Comment: Figures:

      Appreciating differences between specimens when eyes were or were not removed is quite hard.

      Yes, this was an unfortunate oversight; however, the key phenotype is the EP shown in the dissections.

      Fig 1. - wntIIf2 vs wnt11f2? C - Thisse 2001 - correct is Thisse et al. 2001.

      Revised typo in Fig 1. (And Line 1083).

      Fig 1E: These plots are hard to understand without previous and detailed knowledge. Authors should include at least some demarcations for the cephalic mesoderm, neural ectoderm, mesenchyme, and muscle. Missing color code.

      We have moved this data to supplementary figure S1 and have added labels of the relevant cell types and have added the color code.

      Comment:- Fig 2 - In the legend for C - "wildtype and dact2-/- mutant" and "dact1/2 mutant"; in the picture is dact1-/-, dact2-/-.

      Revised (Line 1105).

      Fig 2 - B - it is a mistake in 6th condition dact1: 2x +/+, heterozygote (+/-) is missing.

      Revised Figure 2B.

      Fig 4. - Typo in the legend: dact1/"t"2-/- .

      Revised. (Line 1127).

      Fig 8C - In my view, when the condition gfp mRNA says "0/197, " none of the animals show this phenotype. I assume the authors wanted to say that all the animals show this phenotype; therefore, "197/197" should be used.

      We have removed this data from the figure as there were concerns by the reviewers regarding reproducibility. 

      Fig S1 - Missing legend for the 28 + 250, 380 + 387 peaks? RT-qPCR - is not mentioned in the Materials and Methods. In D - ratio of 25% (legend), but 35% (graph).

      Revised. (Line 1203, Line 625, Line 1213, respectively).

      Fig S2 - The word "identified" - 2x in one sentence. 

      Revised. (Line 1230).

      Reviewer #2 (Public Review):

      Weakness (1) While the qualitative data show altered morphologies in each mutant, quantifications of these phenotypes are lacking in several instances, making it difficult to gauge reproducibility and penetrance, as well as to assess the novel ANC forms described in certain mutants.  

      In Fig. 3 and Fig. 4 animal numbers were added to the figure legend. In Fig. 5 animal numbers were added to the figure to demonstrate reproducibility. We now state that dact1/2-/- animals exhibit the rod-like neurocranial shape that is completely penetrant (Line 260). As the altered morphologies that we report are qualitatively distinct from wildtype, we did not find it necessary to make quantitative measurements. For experiments in which it was necessary to in-cross triple heterozygotes (Fig 3, Fig. 5), we dissected and visually analyzed the ANC of at least 3 compound mutant individuals. At least one individual was dissected for the previously published or described genotypes/phenotypes (i.e. wt, wnt11f2-/-, dact1/2-/-, gpc4-/-, wls-/-). We realize quantitative measurements may identify subtle differences between genotypes. However, the sheer number of embryos needed to generate these relatively rare combinatorial genotypes and the amount of genotyping required prevented quantitative analyses. 

      Weakness (2) Germline mutations limit the authors' ability to study a gene's spatiotemporal functional requirement. They therefore cannot concretely attribute nor separate early-stage phenotypes (during gastrulation) to/from late-stage phenotypes (ANC morphological changes). 

      We agree that we cannot concretely attribute nor separate early and late-stage phenotypes. Conditional mutants to provide temporal or cell-specific analysis are beyond the scope of this work. Here we speculate based on evidence obtained by comparing and contrasting embryos with grossly similar early phenotypes and divergent late-stage phenotypes. We believe our findings contribute to the existing body of literature on zebrafish mutants with both early convergent extension defects and craniofacial abnormalities.   

      Weakness (3) Given that dact1/2 can regulate both canonical and non-canonical wnt signaling, this study did not specifically test which of these pathways is altered in the dact1/2 mutants, and it is currently unclear whether disrupted canonical wnt signaling contributes to the craniofacial phenotypes, even though these phenotypes are typical non-canonical wnt phenotypes. 

      Previous literature has attributed canonical wnt, non-canonical wnt, and non-wnt functions to dact, and each of these likely contributes to the dact mutant phenotype (Lines 87-89). We performed cursory analyses of tcf/lef:gfp expression in the dact mutants and did not find evidence to support further analysis of canonical wnt signaling in these fish. Single-cell RNAseq did not identify differential expression of any canonical or non-canonical wnt genes in the dact1/2 mutants.

      Further research is needed to parse out the intracellular roles of dact1 and dact2 in response to wnt and tgf-beta signaling. Here we find that dact may also have a role in calcium signaling, and further experiments are needed to elaborate this role.      

      Weakness (4) The use of single-cell RNA sequencing unveiled genes and processes that are uniquely altered in the dact1/2 mutants, but not in the gpc4 mutants during gastrulation. However, how these changes lead to the manifested ANC phenotype later during craniofacial development remains unclear. The authors showed that calpain 8 is significantly upregulated in the mutant, but the fact that only 1 out of 142 calpain-overexpressing animals phenocopied dact1/2 mutants indicates the complexity of the system.

      To further test whether capn8 overexpression may contribute to the ANC phenotype, we performed overexpression experiments in embryos resulting from a dact1/dact2 double heterozygote incross. We found that the addition of capn8 caused a small but statistically significant occurrence of the mutant phenotype in dact1/2 double heterozygotes (Fig. 8D). We agree with the reviewer that our results indicate a complex system of dysregulation that leads to the mutant phenotype. We hypothesize that a combination of gene dysregulation may be required to recapitulate the mutant ANC phenotype. Further, as capn8 activity is regulated by calcium levels, overexpression of the mRNA alone likely has a small effect on the manifestation of the phenotype.

      Weakness (5) Craniofacial phenotypes observed in this study are attributed to convergent extension defects but convergent extension cell movement itself was not directly examined, leaving open if changes in other cellular processes, such as cell differentiation, proliferation, or oriented division, could cause distinct phenotypes between different mutants. 

      Although convergent extension cell movements were not directly examined, our phenotypic analyses of the dact1/2 mutant are consistent with previous literature where axis extension anomalies were attributed to defects in convergent extension (Waxman 2004, Xing 2018, Topczewski 2001). We do not attribute the axis defect to differentiation differences, as in situ analyses of established cell type markers show the existence of these cells, only displaced relative to wildtype (Figure 1). We agree that we cannot rule out a role for differences in apoptosis or proliferation; however, we did not detect transcriptional differences in dact1/2 mutants that would indicate this in the single-cell RNAseq dataset. Defects in directed division are possible, but alone would not explain the dact1/2 mutant phenotype, particularly the widened dorsal axis (Figure 1).

      Major comments:  

      Comment (1) The author examined and showed convergent extension phenotype (CE) during body axis elongation in dact1/dact2-/- homozygous mutants. Given that dact2-/- single mutants also displayed shortened axis, the authors should either explain why they didn't analyze CE in dact2-/- (perhaps because that has been looked at in previously published dact2 morphants?) or additionally show whether CE phenotypes are present in dact1 and dact2 single mutants.  

      The authors should quantify the CE phenotype in both dact2-/- single mutants and dact1/dact2-/- double mutants, and examine whether the CE phenotypes are exacerbated in the double mutants, which may lend support to the authors' idea that dact1 can contribute to CE. The authors stated in the discussion that they "posit that dact1 expression in the mesoderm is required for dorsal CE during gastrulation through its role in noncanonical Wnt/PCP signaling". However, no evidence was presented in the paper to show that dact1 influences CE during body axis elongation.  

      Because any axis shortening in dact2-/- single mutants was overcome during the course of development and at 5 dpf there was no noticeable phenotype, we did not analyze the single mutants further.

      We have added data to demonstrate the resulting phenotype of each combinatorial genotype to provide a more clear and detailed description of the single and compound mutants (Fig. S3). 

      Our hypothesis that dact1 may contribute to convergent extension is based on its apparent ability to compensate (either directly or indirectly) for dact2 loss in the dact2-/- single mutant. 

      Comment (2) Except in Fig. 2, I could not find n numbers given in other experiments. It is therefore unclear if these mutant phenotypes were fully or partially penetrant. In general, there is also a lack of quantifications to help support the qualitative results. For example, in Fig. 4, n numbers should be given and cell movements and/or contributions to the ANC should be quantified to statistically demonstrate that the second stream of CNCC failed to contribute to the ANC.  

      Similarly, while the fan-shaped and the rod-shaped ANCs are very distinct, the various rod-shaped ANCs need to be quantified (e.g. morphometry or measurements of morphological features) in order for the authors to claim that these are "novel ANC forms", such as in the dact1/2-/-, gpc4/dact1/2-/-, and wls/dact1/2-/- mutants (Fig. 5).  

      We have added n numbers for each experiment and stated that the rod-like phenotype of the dact1/2-/- mutant was fully penetrant. 

      Regarding CNCC experiments, we repeated the analysis on 3 individual controls and mutants and did not find evidence that CNCC migration was directly affected in the dact1/2 mutant. Rather, differences in ANC development are likely secondary to defects in floor plate and eye field morphometry. Therefore we did not do any further analyses of the CNCCs.

      Regarding Figure 5, we have added n numbers. We dissected and analyzed a minimum of three triple mutants (dact1/2-/-,gpc4-/- and dact1/2-/-,wls-/-) and numerous dact1/2 double mutants and found that the triple mutant ANC phenotype was consistent and recognizably different enough from the dact1/2-/- double mutant and the gpc4 or wls single mutants that morphometry measurements were not needed. Further, the triple mutant phenotype (narrow and shortened) appears to be a simple combination of dact1/2 (narrow) and gpc4/wls (shortened) phenotypes. As we did not find evidence of genetic epistasis, we did not analyze the novel ANC forms further.

      Comment (3): The authors have attributed the ANC phenotypes in dact1/2-/- to CE defects and altered noncanonical wnt signaling. However, no evidence was presented to support either. The authors can perhaps utilize diI labelling, photoconversion-mediated lineage tracing, or live imaging to study cell movement in the ANC and compare that with the cell movement change in the gpc4-/- and gpc4/dact1/2-/- mutants in order to first establish that dact1/2 affect CE and then examine how dact1/2 mutations can modulate the CE phenotypes in gpc4-/- mutants.

      Concurrently, given that dact1 and dact2 can affect (perhaps differentially) both canonical and non-canonical wnt signaling, the authors are encouraged to also test whether canonical wnt signaling is affected in the ANC or surrounding tissues, or at minimum, discuss the potential role/contribution of canonical wnt signaling in this context.  

      Given the substantial body of research on the role of noncanonical wnt signaling and planar cell polarity pathway on convergent extension during axis formation (reviewed by Yang and Mlodzik 2015, Roszko et al., 2009) and the resulting phenotypes of various zebrafish mutants (i.e. Xing 2018, Topczewski 2001), including previous research on dact1 and 2 morphants (Waxman 2004), we did not find it necessary to analyze CE cell movements directly.  

      Our finding that CNCC migration was not defective in the dact1/2 mutants and the knowledge that various zebrafish mutants with anterior patterning defects (slb, smo, cyc) have a similar craniofacial abnormality led us to conclude that the rod-like ANC in the dact1/2 mutant was secondary to an early patterning defect (abnormal eye field morphology). Therefore, testing dact1/2 and convergent extension or wnt signaling in the ANC itself was not an aim of this paper.  

      Comment (4) The authors also have not ruled out other possibilities that could cause the dact1/2-/- ANC phenotype. For example, increased cell death or reduced proliferation in the ANC may result in the phenotype, and changes in cell fate specification or differentiation in the second CNCC stream may also result in their inability to contribute to the ANC. 

      We agree that we cannot rule out whether cell death or proliferation is different in the dact1/2 mutant ANC. However, because we do not find the second CNCC stream within the ANC, this is the most likely explanation for the abnormal ANC shape. Because the first stream of CNCC is able to populate the ANC and differentiate normally, it is most likely that the inability of the second stream to populate the ANC is due to steric hindrance imposed by the abnormal cranial/eye field morphology. These hypotheses would need to be tested, ideally with an inducible dact1/2 mutant; however, this is beyond the scope of this paper.

      Comment (5) The last paragraph of the section "Genetic interaction of dact1/2 with Wnt regulators..." misuses terms and conflates phenotypes observed. For instance, the authors wrote "dact2 haploinsufficiency in the context of dact1-/-; gpc4-/- double mutant produced ANC in the opposite phenotypic spectrum of ANC morphology, appearing similar to the gpc4-/- mutant phenotype". However, if heterozygous dact2 is not modulating phenotypes in this genetic background, its function is not "haploinsufficient". The authors then said, "These results show that dact1 and dact2 do not have redundant function during craniofacial morphogenesis, and that dact2 function is more indispensable than dact1". However, this statement should be confined to the context of modulating gpc4 phenotypes, which is not clearly stated.

      Revised (Lines 380, 382).   

      Comment (6) For the scRNA-seq analysis, the authors should show the population distribution in the UMAP for the 3 genotypes, even if there are no obvious changes. The authors are encouraged, although not required, to perform pseudotime or RNA velocity analysis to determine if differentiation trajectories are changed in the NC populations, in light of what they found in Fig. 4. The authors can also check the expression of reporter genes downstream of certain pathways, e.g. axin2 in canonical wnt signaling, to query if these signaling activities are changed (also related to point #3 above). 

      We have added population distribution data for the 3 genotypes to Supplemental Figure 4. Although RNA velocity analysis would be an interesting additional analysis, we would hypothesize that the NC population is not driving the differences in phenotype. Rather these are likely changes in the anterior neural plate and mesoderm. 

      Comment (7) While the phenotypic differences between gpc4-/- and dact1/2-/- are in the ANC at a later stage, scRNA-seq was performed using younger embryos. The authors should better explain the rationale and discuss how transcriptomic differences in these younger embryos can explain later phenotypes. Importantly, dact1, dact2, and capn8 expression were not shown in and around the ANC during its development and this information is crucial for interpreting some of the results shown in this paper. For example, if dact1 and dact2 are expressed during ANC development, they may have specific functions during that stage. Alternatively, if dact1 and dact2 are not expressed when the second stream CNCCs are found to be outside the ANC, then the ANC phenotype may be due to dact1/2's functions at an earlier time point. The authors' statement in the discussion that "embryonic fields determined during gastrulation effect the CNCC ability to contribute to the craniofacial skeleton" is currently speculative.

      We have reworded our rationale and hypothesis to increase clarity (Lines 391-405). We believe that the ANC phenotype of the dact1/2 mutants is secondary to defective CE and anterior axis lengthening, as has been reported for the slb mutant (Heisenberg 1997, 2000). We utilized the gpc4 mutant as a foil to the dact1/2 mutant, as the gpc4 mutant has defective CE and axis extension without the same craniofacial phenotype.

      We have added dact1 and dact2 WISH at 24 and 48 hpf (Fig. 1D,E) to show expression during ANC development.

      Comment (8) The functional testing of capn8 did not yield a result that would suggest a strong effect, as only 1 in 142 animals phenocopied dact1/2. Therefore, while the result is interesting, the authors should tone down its importance. Alternatively, the authors can try knocking down capn8 in the dact1/2 mutants to test how that affects the CE phenotype during axis elongation, as well as ANC morphogenesis. 

      As overexpression of capn8 in wildtype animals did not result in a significant phenotype, we tested capn8 overexpression in compound dact1/2 mutants, as these provide a sensitized background. We found a small but statistically significant effect of exogenous capn8 in dact1+/-,dact2+/- animals. While the effect is not what one would expect when compared to Mendelian genetic ratios, the rod-like ANC phenotype is an extreme craniofacial dysmorphology not observed in wildtype or mRNA injected embryos, and its occurrence is therefore significant. The experiment is limited by the available technology of over-expressing mRNA broadly without temporal or cell specificity control. It is possible that if capn8 over-expression were restricted to specific cells (floor plate, notochord or mesoderm) and to the optimal time period during gastrulation/segmentation, the aberrant ANC phenotype would be more robust. We agree with the reviewer that although the finding of a new role for capn8 during development is interesting, its importance in the context of dact should be toned down, and we have altered the manuscript accordingly (Lines 455-467).

      Comment (9) A difference between the two images in Fig. 8B is hard to distinguish. Consider showing flat-mount images.

      We have added flat-mount images to Fig. 8B.

      Minor comments:

      Comment (1) wnt11f2 is spelled incorrectly in a couple of places, e.g. "wnt11f2l" in the abstract and "wntllf2" in the discussion. 

      Revised throughout.

      Comment (2) For Fig. 1D, the white dact1 and yellow dact2 are hard to distinguish in the merged image. Consider changing one of their colors to a different one and only merge dact1 and dact2 without irf6 to better show their complementarity.  

      We agree with the reviewer that the expression patterns of dact1 and dact2 are difficult to distinguish in the merged image. We have added outlines of the cartilage elements to the images to facilitate comparisons of dact1 and dact2 expression (Fig 1F). 

      Comment (3) For Fig. 1E, please label the clusters mentioned in the text so readers can better compare expressions in these cell populations.  

      We have moved this data to supplementary figure S1 and have added labels.

      Comment (4) The citing and labelling of certain figures can be more specific. For example, Fig. S1A, B, and Fig. S1C should be used instead of just Fig. S1 (under the section titled "dact1 and dact2 contribute to axis extension..."). Similarly, Fig. 4 can be better labeled with alphabets and cited at the relevant places in the text.

      We have modified the labeling of the figures according to the reviewer’s suggestion (Fig. S2 (previously S1), Fig. 4) and have added reference to these labels in the text (Lines 202, 204, 212, 328, 334, 336).

      Comment (5) For Fig. 2B, the (+/+,-/-) on x-axis should be (+/-,-/-).  

      Revised in Figure 2B.

      Comment (6) Several figures are incorrectly cited. Fig. 2C is not cited, and the "Fig. 2C" and "Fig. 2D" cited in the text should be "Fig. 2D" and "Fig. 2E" respectively. Similarly, Fig. 5C and D are not cited in the text and the cited Fig. 5C should be 5E. The VC images in Fig. 5 are not talked about in the text. Finally, Fig. 7C was also not mentioned in the text.  

      We have corrected the labeling and have added descriptions of each panel in the Results (Fig.2 Line 231, 237, 242, Fig 5 Line 373, 381, Fig 7 line 431). 

      Comment (7) In the main text, it is indicated that zebrafish at 3ss were used for scRNA-seq, but in the figure legend, it says 4ss.

      Revised (Line 682)

      Comment (8) No error bars in Fig. S1B and the difference between the black and grey shades in Fig. S1D is not explained.  

      Error bars are not included in the graphs of qPCR results (now Fig S2C) as these are results of a pool of 8 embryos performed one time. We have added a legend to explain the gray vs. black bars (now Fig S2E). 

      Reviewer #3 (Public Review):  

      Weaknesses: The hypotheses are very poorly defined and misinterpret key previous findings surrounding the roles of wnt11 and gpc4, which results in a very confusing manuscript. Many of the results are not novel and focus on secondary defects. The most novel result of overexpressing calpain8 in dact1/2 mutants is preliminary and not convincing.  

      We apologize for not presenting the question more clearly. The Introduction was revised with particular attention to distinguish this work using genetic germline mutants from prior morpholino studies. Please refer to pages 4-5, lines 106-121.

      Weakness 1) One major problem throughout the paper is that the authors misrepresent the fact that wnt11f2 and gpc4 act in different cell populations at different times. Gastrulation defects in these mutants are not similar: wnt11 is required for anterior mesoderm CE during gastrulation but not during subsequent craniofacial development while gpc4 is required for posterior mesoderm CE and later craniofacial cartilage morphogenesis (LeClair et al., 2009). Overall, the non-overlapping functions of wnt11 and gpc4, both temporally and spatially, suggest that they are not part of the same pathway.  

      We have reworded the text to add clarity. While the loss of wnt11 versus the loss of gpc4 may affect different cell populations, the overall effect is a shortened body axis. We stress that it is the combination of a similar impaired axis elongation phenotype with discrepant ANC morphologies, at opposite ends of the ANC morphologic spectrum, that is very interesting and led us to investigate dact1/2 in the genetic contexts of wnt11f2 and gpc4. Please refer to page 4, lines 73-84. Further, the reviewer’s comment that wnt11 and gpc4 are spatially and temporally distinct is untested. We think the reviewer’s claim of gpc4 acting in the posterior mesoderm refers to its requirement in the tailbud (Marlow 2004). However, this does not exclude gpc4 from acting elsewhere as well. Further experiments would be necessary. Both wnt11f2 and gpc4 regulate non-canonical wnt signaling and are coexpressed during some points of gastrulation and CF development (Gupta et al., 2013; Sisson 2015). These data support the possibility of overlapping roles.

      Weakness 2) There are also serious problems surrounding attempts to relate single-cell data with the other data in the manuscript and many claims that lack validation. For example, in Fig 1 it is entirely unclear how the Daniocell scRNA-seq data have been used to compare dact1/2 with wnt11f2 or gpc4. With no labeling in panel 1E of this figure these comparisons are impossible to follow. Similarly, the comparisons between dact1/2 and gpc4 in scRNA-seq data in Fig. 6 as well as the choices of DEGs in dact1/2 or gpc4 mutants in Fig. 7 seem arbitrary and do not make a convincing case for any specific developmental hypothesis. Are dact1 and gpc4 or dact2 and wnt11 coexpressed in individual cells? Eyeballing similarity is not acceptable.  

      We have moved the previously published Daniocell data to Figure S1 and have added labeling. These data are meant to complement and support the WISH results and demonstrate the utility of using available public Daniocell data. Please advise, with specific comments, how we can improve or remediate this analysis.

      Regarding our own scRNA-seq data, we have added rationale (Lines 391-403) and details of the results to increase clarity (Lines 419-436). We have added a panel to Figure 6 (panel A) to help illustrate our rationale for comparing dact1/2 and gpc4 mutants to wt. The DEGs displayed in Fig. 7A are the top 50 most differentially expressed genes between dact1/2 mutants and WT (Figure 7 legend, Lines 422-424).

      We have looked at our scRNA-seq gene expression results for our clusters of interest (lateral plate mesoderm, paraxial mesoderm, and ectoderm). We find dact1, dact2, and gpc4 co-expression within these clusters. Knowing whether these genes are coexpressed within the same individual cell would require going back and analyzing the raw expression data. We do not find this to be necessary to support our conclusions. The expression pattern of wnt11f2 is irrelevant here.   

      Weakness 3) Many of the results in the paper are not novel and either confirm previous findings, particularly Waxman et al (2004), or even contradict them without good evidence. The authors should make sure that dact2 loss-of-function is not compensated for by an increase in dact1 transcription or vice versa. Testing genetic interactions, including investigating the expression of wnt11f2 in dact1/2 mutants, dact1/2 expression in wnt11f2 mutants, or the ability of dact1/2 to rescue wnt11f2 loss of function would give this work a more novel, mechanistic angle.

      We clarify here that the prior work carried out by Waxman using morpholinos, while acceptable at the time in 2004, does not meet the standard of rigor for developmental studies today, which is to generate germline mutants. The reviewer’s acceptance of the prior work at face value fails to take the limitations of that work into account. Further, the prior paper from Waxman et al. did not analyze craniofacial morphology other than eyeballing the shape of the head and eyes. Please compare the Waxman paper and this work figure for figure; the additional detail of this study should be clear. Again, this is by no means a criticism of prior work, as the prior study suffered from the technological limitations of 2004, just as this study is the best we can do using the tools we have today. Any discrepancies in results are likely due to differences in morpholino versus genetic disruption, and most reviewers would favor the phenotype analysis from the germline genetic context. We have addressed these concerns as objectively as we can in the text (Lines 482-493). The fact that dact1/2 double mutants display a craniofacial phenotype while the single mutants do not suggests compensation (Lines 503-505), but not necessarily at the mRNA expression level (Fig. S2C).

      This paper tests genetic interaction through phenotyping the wnt11/dact1/dact2 mutant.

      Our results support the previous literature that dact1/2 act downstream of wnt11 signaling. There is no evidence of cross-regulation of gene expression. We do not expect that changes in wnt11 or dact would result in expression changes in the others.

      RNA-seq of the dact1/2 mutants did not show changes in wnt11 gene expression. Unless dact1 and/or dact2 mRNA are underexpressed in the wnt11 mutant, we would not expect a rescue experiment to be informative. As wnt11 is not a focus of this paper, we have not performed the experiment.

      Weakness 4) The identification of calpain 8 overexpression in Dact1/2 mutants is interesting, but getting 1/142 phenotypes from mRNA injections does not meet reproducibility standards.

      As the occurrence of the mutant phenotype in wildtype animals with exogenous capn8 expression was below what would meet reproducibility standards, we performed an additional experiment where capn8 was overexpressed in embryos resulting from a dact1/dact2 double heterozygote incross (Fig. 8). We reasoned that an effect of capn8 overexpression may be more robust on a sensitized background. We found a statistically significant effect of capn8 in dact1/2 double heterozygotes, though the occurrence was still relatively rare (6/80). These data suggest dysregulation of capn8 contributes to the mutant ANC phenotype, though there are likely other factors involved.

      Comment: The manuscript title is not representative of the findings of this study.  

      We revised the title to describe strictly that we generated and carried out genetic analysis of loss-of-function compound mutants (Genetic requirement) and that we identified capn8 as an important modifier of this requirement.

      Introduction: p.4:

      Comment: Anterior neurocranium (ANC) - it has to be stated that this refers to the combined ethmoid plate and trabecular cartilages. 

      Thank you. We agree that the ANC and ethmoid plate terminology has been confusing in the literature and that we should endeavor to describe more clearly that the phenotypes in question are all in the ethmoid plate and that the trabeculae are not affected. ANC has been replaced with ethmoid plate (EP) throughout the manuscript and figures. We also describe that all the observed phenotypes affect the ethmoid plate and not the trabeculae (page 13, Lines 265-267).

      Comment: Transverse dimension is incorrect terminology - replace with medio-lateral.

      Revised (Lines 69, 74).

      Comment: Improper way of explaining the relationship between mutant and gene..."Another mutant knypek, later identified as gpc4..." a better way to explain this would be that the knypek mutation was found to be a nonsense mutation in the gpc4 gene.

      Revised (Line 71)

      Comment: "...the gpc4 mutant formed an ANC that is wider in the transverse dimension than the wildtype, in the opposite end of the ANC phenotypic spectrum compared to wnt11f2...These observations beg the question how defects in early patterning and convergent extension of the embryo may be associated with later craniofacial morphogenesis."

      This statement is broadly representative of the general failure to distinguish primary from secondary defects in this manuscript. Focusing on secondary defects may be useful to understand the etiology of a human disease, but it is misleading to focus on secondary defects when studying gene function. The rod-like ethmoid of slb mutant results from a CE defect of anterior mesoderm during gastrulation (Heisenberg et al. 1997, 2000), while the wide ethmoid plate of kny mutants results from CE defects of cartilage precursors (Rochard et al., 2016). Based on this evidence, wnt11f2 and gpc4 act in different cell populations at different times.

      It is true that the slb mutant craniofacial phenotype has been stated as secondary to the CE defect during gastrulation and the kny phenotype as primary to chondrocyte CE defects in the ethmoid, however the direct experimental evidence to conclude only primary or only secondary effects does not yet exist. There is no experiment to our knowledge where wnt11f2 was found to not affect ethmoid chondrocytes directly. Likewise, there is no experiment having demonstrated that dysregulated CE in gpc4 mutants does not contribute to a secondary abnormality in the ethmoid. 

      Here, we are analyzing the CE and craniofacial phenotypes of the dact1/2 mutants without any assumptions about primary or secondary effects and without drawing any conclusions about wnt11f2 or gpc4 cellular mechanisms.     

      Comment: "The observation that wnt11f2 and gpc4 mutants share similar gastrulation and axis extension phenotypes but contrasting ANC morphologies supports a hypothesis that convergent extension mechanisms regulated by these Wnt pathway genes are specific to the temporal and spatial context during embryogenesis."

      This sentence is quite vague and potentially misleading. The gastrulation defects of these 2 mutants are not similar - wnt11 is required for anterior mesoderm CE during gastrulation and has not been shown to be active during subsequent craniofacial development while gpc4 is required for posterior mesoderm CE and craniofacial cartilage morphogenesis (LeClair et al., 2009). Here again, the non-spatially overlapping functions of wnt11 and gpc4 suggest that they are not part of the same pathway.

      Though the cells displaying defective CE in wnt11f2 and gpc4 mutants are different, the effects on the body axis are similar. The dact1/2 mutants showed a grossly similar axis extension defect to these mutants. Our aim with the scRNA-seq experiment was to determine which cells and gene programs are disrupted in dact1/2 mutants. We found that some cell types and programs were disrupted similarly in dact1/2 mutants and gpc4 mutants, while other cells and programs were specific to dact1/2 versus gpc4 mutants. We can speculate that those that were specific to dact1/2 versus gpc4 may be attributed to CE in the anterior mesoderm, as is the case for wnt11.

      p.5

      Comment: "We examined the connection between convergent extension governing gastrulation, body axis segmentation, and craniofacial morphogenesis." A statement focused on the mechanistic findings of this paper would be welcome here, instead of a claim for a "connection" that is vague and hard to find in the manuscript.  

      We have rewritten this statement (Line 125).

      p.7 Results:

      Comment: It is unclear why Farrel et al., 2018 and Lange et al., 2023 are appropriate references for WISH. Please justify or edit.  

      This was a mistake and has been edited (Page 9).

      Comment: " Further, dact gene expression was distinct from wnt11f2." This statement is inaccurate in light of the data shown in Fig1A and the following statements - please edit to reflect the partially overlapping expression patterns.  

      We have edited to clarify (Lines 142-143).

      p.8

      Comment: "...we examined dact1 and 2 expression in the developing orofacial tissues. We found that at 72hpf..." - expression at 72hpf is not relevant to craniofacial morphogenesis, which takes place between 48h-60hpf (Kimmel et al., 1998; Rochard et al., 2016; Le Pabic et al., 2014).  

      We have included images and discussion of dact1 and dact2 expression at earlier time points that are important to craniofacial development (Lines 160-171)(Fig 1D,E). 

      Comment: "This is in line with our prior finding of decreased dact2 expression in irf6 null embryos". - This statement is too vague. How are the two observations "in line"?

      We have removed this statement from the manuscript.

      Comment: Incomplete sentence (no verb) - "The differences in expression pattern between dact1 and dact2...".  

      Revised (Line 172).

      Comment: "During embryogenesis..." - Please label the named structures in Fig.1E.

      Please be more precise with the described expression time. Also, it would be useful to integrate the scRNAseq data with the WISH data to create an overall picture instead of treating each dataset separately.  

      We have moved the previously published Daniocell data to supplementary figure S1 and have labeled the key cell types. 

      p.9

      Comment: "The specificity of the gene disruption was demonstrated by phenotypic rescue with the injection of dact1 or dact2 mRNA (Fig. S1)." - please describe what is considered a phenotypic rescue.

      -The body axis reduction of dact mutants needs to be documented in a figure. Head pictures are not sufficient. Is the head alone affected, or both the head and trunk/tail? Fig.2E suggests that both head and trunk/tail are affected - please include a live embryos picture at a later stage.  

      We have added a description of how phenotypic rescue was determined (Line 208). We have added a figure with representative images of the whole body of dact1/2 mutants. Measurements of body length showed a shortening in dact1/2 double mutants versus wildtype; however, the difference was not statistically significant by ANOVA (Fig. 3C, Fig. S3, Lines 270-275).

      p. 11

      Comment: "These dact1-/-;dact2-/- CE phenotypes were similar to findings in other Wnt mutants, such as slb and kny (Heisenberg, Tada et al., 2000; Topczewski, Sepich et al., 2001)." The similarity between slb and kny phenotypes should be mentioned with caution as CE defects affect different regions in these 2 mutants. It is misleading to combine them into one phenotype category as wnt11 and gpc4 are most likely not acting in the same pathway based on these spatially distinct phenotypes.  

      Here we are referring to the grossly similar axis extension defects in slb and kny mutants. We refer to these mutants to illustrate that dact1 and/or dact2 deficiency could affect axis extension through diverse mechanisms. We have added text for clarity (Lines 249-252).

      Comment: "No craniofacial phenotype was observed in dact1 or dact2 single mutants. However, in-crossing to generate [...] compound homozygotes resulted in dramatic craniofacial deformity."

      This result is intriguing in light of (1) the similar craniofacial phenotype previously reported by Waxman et al. (2004) using morpholino-based knock-down of dact2, and (2) the phenomenon of genetic compensation demonstrated by Jakutis and Stainier 2021 (https://doi.org/10.1146/annurev-genet-071719-020342). The authors should make sure that dact2 loss-of-function is not compensated for by an increase in dact1 transcription, as such compensation could lead to inaccurate conclusions if ignored.

      We agree with the reviewer that genetic compensation of dact2 by dact1 likely explains the different results found in the dact2 morphant versus the CRISPR mutant. We found increased dact1 mRNA expression in the dact2-/- mutant (Fig S2X); however, a more thorough examination is required to draw a conclusion. Interestingly, we found that in wildtype embryos dact1 and dact2 expression patterns are distinct, though with some overlap. It would be informative to investigate whether the dact1 expression pattern changes in dact2-/- mutants to account for dact2 loss.

      Comment: "Lineage tracing of NCC movements in dact1/2 mutants reveals ANC composition" - the title is misleading - ANC composition was previously investigated by lineage tracing (Eberhardt et al., 2006; Wada et al., 2005).  

      This has been reworded (Line 292)

      p.13

      Comment: There is no frontonasal prominence in zebrafish.  

      This is true; the text has been changed to "frontal prominence" (Lines 293, 299, 320).

      Comment: The rationale for investigating NC migration in mutants where there is a gastrula-stage failure of head mesoderm convergent extension is unclear. The whole head is deformed even before neural crest cells migrate as the eye field does not get split in two (Heisenberg et al., 1997; 2000), suggesting that the rod-like ethmoid plate is a secondary defect of this gastrula-stage defect. In addition, neural crest migration and cartilage morphogenesis are different processes, with clear temporal and spatial distinctions.  

      We carried out the lineage tracing experiment to determine which NC streams contributed to the aberrantly shaped EP: the anteriormost NC stream of the frontal prominence, the second NC stream of the maxillary prominence, or both. We found that the anteriormost NCCs did contribute to the rod-like EP, which differs from what is seen when hedgehog signaling is disrupted. So while it is possible that the gastrula-stage head mesoderm CE defect caused a secondary effect on NC migration, how the anterior and second NC streams are affected differently between dact1/2 and shh pathway mutants is interesting. We added discussion of this observation to the manuscript (page 23, Lines 514-520).

      p. 14-16

      Comment: Based on the heavy suspicion that the rod-like ethmoid plate of the dact1/2 mutant results from a gastrulation defect, not a primary defect in later craniofacial morphogenesis, the prospect of crossing dact1/2 mutants with other wnt-pathway mutants for which craniofacial defects result from craniofacial morphogenetic defects is at the very least unlikely to generate any useful mechanistic information, and at most very likely to generate lots of confusion. Both predictions seem to take form here.  

      However, the ethmoid plate phenotype observed in the gpc4-/-; dact1+/-; dact2-/- mutants (Fig. 5E) does suggest that gpc4 may interact with dact1/2 during gastrulation, but that is the case only if dact1+/-; dact2-/- mutants do not have an ethmoid cartilage defect, which I could not find in the manuscript. Please clarify.  

      The perspective that the rod-like EP of the dact1/2 mutant is due to a gastrulation defect is being examined here. Why would other mutants such as wnt11f2 and gpc4 that have gastrulation CE defects have very different EP morphology, whether as a primary or secondary NCC effect? Further, dact1 and dact2 were reported as modifiers of Wnt signaling, so it is logical to genetically test the relationship between dact1, dact2, wnt11f2, gpc4 and wls. The experiment had to be done to investigate how these genetic combinations impact EP morphology. The finding that combined loss of dact1, dact2 and wls or gpc4 yielded new EP morphologies, different from those previously observed in dact1/2, wls, gpc4, or any other mutant, is important: it suggests that each of these genes makes a distinct contribution to facial morphology that is not explained by the CE defect alone.

      Comment: I encourage the authors to explore ways to test whether the rod-like ethmoid of dact1/2 mutants is more than a secondary effect of the CE failure of the head mesoderm during gastrulation. Without this evidence, the phenotypes of dact1/2 -gpc4 or - wls are not going to convince us that these factors actually interact.  

      Actually, we find that our results support the hypothesis that the ethmoid of the dact1/2 mutants is a secondary effect of defective gastrulation and anterior extension of the body axis. However, our findings suggest (by contrast with another mutant with impaired CE during gastrulation) that this CE defect alone cannot explain the dysmorphic ethmoid plate. Our single-cell RNA seq results and the discovery of dysregulated capn8 expression and proteolytic processes present new wnt-regulated mechanisms for axis extension.

      p. 20 Discussion

      Comment: "Here we show that dact1 and dact2 are required for axis extension during gastrulation and show a new example of CE defects during gastrulation associated with craniofacial defects."

      Waxman et al. (2004) previously showed that dact2 is involved in CE during gastrulation.

      Heisenberg et al. (1997, 2000), previously showed with the slb mutant how a CE defect during gastrulation causes a craniofacial defect.  

      The Waxman paper, which used morpholinos to disrupt dact2, produced limited analysis of CE and no analysis of craniofacial morphogenesis. We generated genetic mutants here to validate the earlier morpholino results and to analyze the craniofacial phenotype in detail. We have removed the word “new” to make the statement clearer (Line 475).

      Comment: "Our data supports the hypothesis that CE gastrulation defects are not causal to the craniofacial defect of medially displaced eyes and midfacial hypoplasia and that an additional morphological process is disrupted."

      It is unclear to me how the authors reached this conclusion. I find the view that medially displaced eyes and midfacial hypoplasia are secondary to the CE gastrulation defects unchallenged by the data presented. 

      This statement was removed and the discussion was reworded.

      Comment: The discussion should include a detailed comparison of this study's findings with those of zebrafish morpholino studies.  

      We have added more discussion to compare ours to the previous morpholino findings (Lines 476-484).

      Comment: The discussion should try to reconcile the different expression patterns of dact1 and dact2, and the functional redundancy suggested by the absence of phenotype of single mutants. Genetic compensation should be considered (and perhaps tested).  

      The different expression patterns of dact1 and dact2, along with our finding that dact1 and dact2 genetic deficiency differently affect the gpc4 mutant phenotype, suggest that dact1 and dact2 are not functionally redundant during normal development. This is in line with the previously published data showing different phenotypes of dact1 or dact2 knockdown. However, our result that genetic ablation of both dact1 and dact2 is required for a mutant phenotype suggests that these genes can compensate upon loss of the other. This would suggest that the expression pattern of dact1 would be changed in the dact2 mutant and vice versa. We find that this line of investigation would be interesting in future studies. We have addressed this in the Discussion (Lines 485-498).

      Comment: "Based on the data...Conversely, we propose...ascribed to wnt11f2 "

      Functional data always prevail over expression data for inferring functional requirements.

      This is true.

      p.21

      Comment: "Our results underscore the crucial roles of dact1 and dact2 in embryonic development, specifically in the connection between CE during gastrulation and ultimate craniofacial development."

      How is this novel in light of previous studies, especially by Waxman et al. (2004) and Heisenberg et al. (1997, 2000)? In this study, the authors fail to present compelling evidence that craniofacial defects are not secondary to the early gastrulation defects resulting from dact1/2 mutations.

      We have not claimed that the craniofacial defects are not secondary to the gastrulation defects. In fact, we state that there is a “connection”. Further, we do not claim that this is the first or only such finding. We believe our findings have validated the previous dact morpholino experiments and have contributed to the body of literature concerning wnt signaling during embryogenesis.

      p. 22

      Comment: The section on Smad1 discusses a result not reported in the results section. Any data discussed in the discussion section needs to be reported first in the results section.  

      We have added a comment on the differential expression of smad1 to the results section (Lines 446-448).

    1. eLife assessment

      This important study utilizes the nematode C. elegans and mammalian cell culture to investigate the role of MML-1/Mondo in conserved regulation of metabolism and aging. The evidence supporting the conclusions is convincing and covers a range of areas including localization, upstream pathways, and conservation. The paper will be of interest to a broad range of biologists studying aging, metabolism, and transcriptional regulation.

    2. Reviewer #1 (Public Review):

      In this manuscript, Laboy and colleagues investigated upstream regulators of MML-1/Mondo, a key transcription factor that regulates aging and metabolism, using the nematode C. elegans and cultured mammalian cells. By performing a targeted RNAi screen for genes encoding enzymes in glucose metabolism, the authors found that two hexokinases, HXK-1 and HXK-2, regulate nuclear localization of MML-1 in C. elegans. The authors showed that knockdown of hxk-1 and hxk-2 suppressed longevity caused by germline-deficient glp-1 mutations. The authors demonstrated that genetic or pharmacological inhibition of hexokinases decreased nuclear localization of MML-1, via promoting mitochondrial β-oxidation of fatty acids. They found that genetic inhibition of hxk-2 changed the localization of MML-1 from the nucleus to mitochondria and lipid droplets by activating pentose phosphate pathway (PPP). The authors further showed that the inhibition of PPP increased the nuclear localization of mammalian MondoA in cultured human cells under starvation conditions, suggesting the underlying mechanism is evolutionarily conserved. This paper provides compelling evidence for the mechanisms by which novel upstream metabolic pathways regulate MML-1/Mondo, a key transcription factor for longevity and glucose homeostasis, through altering organelle communications, using two different experimental systems, C. elegans and mammalian cells. This paper will be of interest to a broad range of biologists who work on aging, metabolism, and transcriptional regulation.

    3. Reviewer #2 (Public Review):

      Raymond Laboy et al. explored how the transcriptional Mondo/Max-like complex (MML-1/MXL-2) is regulated by glucose metabolic signals using a germ-line removal longevity model. They believed that MML-1/MXL-2 integrated multiple longevity pathways through nutrient sensing and therefore screened the glucose metabolic enzymes that regulated MML-1 nuclear localization. Hexokinase 1 and 2 were identified as the most vigorous regulators, which function through mitochondrial beta-oxidation and the pentose phosphate pathway (PPP), respectively. MML-1 localized to mitochondria associated with lipid droplets (LD), and MML-1 nuclear localization was correlated with LD size and metabolism. Their findings are interesting and may help us to further explore the mechanisms in multiple longevity models. The data support their proposed working model. Nonetheless, the roles of hxk-1 and lipid oxidation in regulating LD, as proposed in the working model, are not clear.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this manuscript entitled "Hexokinase regulates Mondo-mediated longevity via the PPP and organellar dynamics", Laboy and colleagues investigated upstream regulators of MML-1/Mondo, a key transcription factor that regulates aging and metabolism, using the nematode C. elegans and cultured mammalian cells. By performing a targeted RNAi screen for genes encoding enzymes in glucose metabolism, the authors found that two hexokinases, HXK-1 and HXK-2, regulate nuclear localization of MML-1 in C. elegans. The authors showed that knockdown of hxk-1 and hxk-2 suppressed longevity caused by germline-deficient glp-1 mutations. The authors demonstrated that genetic or pharmacological inhibition of hexokinases decreased nuclear localization of MML-1, via promoting mitochondrial β-oxidation of fatty acids. They found that genetic inhibition of hxk-2 changed the localization of MML-1 from the nucleus to mitochondria and lipid droplets by activating pentose phosphate pathway (PPP). The authors further showed that the inhibition of PPP increased the nuclear localization of mammalian MondoA in cultured human cells under starvation conditions, suggesting the underlying mechanism is evolutionarily conserved. This paper provides compelling evidence for the mechanisms by which novel upstream metabolic pathways regulate MML-1/Mondo, a key transcription factor for longevity and glucose homeostasis, through altering organelle communications, using two different experimental systems, C. elegans and mammalian cells. This paper will be of interest to a broad range of biologists who work on aging, metabolism, and transcriptional regulation. 

      Reviewer #2 (Public Review):

      Raymond Laboy et al. explored how the transcriptional Mondo/Max-like complex (MML-1/MXL-2) is regulated by glucose metabolic signals using a germ-line removal longevity model. They believed that MML-1/MXL-2 integrated multiple longevity pathways through nutrient sensing and therefore screened the glucose metabolic enzymes that regulated MML-1 nuclear localization. Hexokinase 1 and 2 were identified as the most vigorous regulators, which function through mitochondrial beta-oxidation and the pentose phosphate pathway (PPP), respectively. MML-1 localized to mitochondria associated with lipid droplets (LD), and MML-1 nuclear localization was correlated with LD size and metabolism. Their findings are interesting and may help us to further explore the mechanisms in multiple longevity models; however, the study is not complete and the working model remains obscure. For example, the exact metabolites that account for the direct regulation of MML-1 were not identified, and more detailed studies of the related cellular processes are needed.

      The identification of the responsible metabolites is necessary since multiple pieces of evidence from the study suggest that lipids rather than glucose metabolites may be more likely to be the direct regulators of MML-1, and that HXK regulates MML-1 indirectly by affecting lipid metabolism: 1) inhibiting the PPP is sufficient to rescue MML-1 function independent of G6P levels; 2) HXK-1 regulates MML-1 by increasing fatty acid beta-oxidation; 3) LD size correlates with MML-1 nuclear localization and LD metabolism can directly regulate MML-1. The identification of metabolites will be helpful for understanding the mechanism.

      Beta-oxidation and the PPP are involved in the regulation of MML-1 by HXK-1 and HXK-2, respectively. But how these two pathways participate in the regulation is not clear. Is it the beta-oxidation rate or the intermediate metabolites that matters? As for the PPP, it provides substrates for nucleotide synthesis and also its product NADPH is essential for redox balance. Is one of the metabolites or the NADPH levels involved in MML-1 regulation? More studies are needed to provide answers to these concerns. 

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Following are my comments that the authors may want to address to further improve this excellent paper.

      Major comments 

      (1) Although the authors provided evidence that hexokinases in glucose metabolism are associated with germline-deficient glp-1(-) mutants, they did not mention why they focused on glp-1(-) mutants rather than other longevity mutants. In their previous study (Nakamura et al., 2016), they showed that MML-1 is required for multiple longevity pathways in C. elegans, including reduced mitochondrial respiration and insulin/IGF-1 signaling. Please discuss why the authors focused on glp-1(-) mutants in this paper. It will be even better if the authors test the roles of hexokinases in some other longevity regimens. 

      Many thanks for this astute comment. Previously we had shown that mml-1 is required for glp-1, daf-2, and isp-1 longevity, and Johnson et al. had shown a requirement for eat-2, hence the idea that MML-1 is a convergent transcription factor. We first focused on glp-1 because that was the starting point of our screen, and the result was clear and simple: hexokinases regulate MML‑1 nuclear localization and activity in glp-1 and are required for longevity. Naturally, the question arises: do hexokinases behave like MML-1 as convergent longevity regulators across pathways? To address this, we examined the interaction of hxk-1 and hxk-2 with isp-1, daf-2, and raga-1.  Specifically, we now show that:

      A. Like glp-1(e2141) mutants, isp-1(qm150) mutants stimulate MML-1 nuclear localization, and the hexokinases are required for isp-1 longevity (Figure 1G-H).

      B. daf-2(e1370) mutants do not further stimulate MML-1 nuclear localization beyond basal levels, yet MML-1 is strongly required for daf-2 longevity (Nakamura et al., 2016, Supplementary Figure 1L-M). However, the hexokinases are not required for daf-2 longevity (Supplementary Figure 1M), suggesting that the signaling pathway is wired differently in daf-2, and that other pathways regulate MML-1 activity.

      C. raga-1(ok701) mutants stimulate MML-1 nuclear localization and mml-1 is required for raga-1 longevity, suggesting that MML-1 acts downstream of TORC1 signaling (Supplementary Figure 1N-O). However, hexokinases are not required for raga-1 longevity, suggesting that raga-1 acts downstream or parallel to hexokinase signaling (Supplementary Figure 1P).

      D. We performed untargeted metabolomics in glp-1, daf-2, and mml-1 single and double mutants and observed that hexose phosphates, which have been shown to regulate MML-1 human homologs MondoA/ChREBP, were differentially regulated between mutants.

      Author response image 1.

      E. Altogether these experiments reveal that though MML-1 promotes longevity in most pathways, the hexokinases are only required in some (glp-1, isp-1), but not others (raga-1, daf-2). Furthermore, strong MML-1 nuclear localization is often but not always associated with longevity (e.g. daf-2), and the wiring of the signaling pathway is different for various longevity regimens. Consistently, mTOR and Insulin signaling are more functionally linked and therefore may show a more similar genetic profile. Differences in hexose phosphate between glp-1 and daf-2 could explain why MML-1 requires hexokinase function in glp-1 to promote longevity but not in daf-2. However, considerably more work is required to rigorously validate this hypothesis.

      (2) In figure 5, the authors investigated whether the association between PPP and MML‑1/MondoA, tested in C. elegans, is conserved in mammals under starvation conditions. The authors should clarify why they tested the MondoA localization upon starvation in cultured human cells. This comment is related to my comment #1 as the authors could determine the roles of hexokinases under dietary restriction (DR)-conditions or in DR-mimetic in eat-2(-) mutants. 

      In this case, the actual translatability to a worm longevity pathway was not our goal. Rather, we examined MondoA in cell culture under conditions with contrasting MondoA subcellular localization: cytosolic/nuclear in high-glucose media and cytosolic under starvation. We then showed that, similar to our data in worms, PPP inhibition with 6-AN induced MondoA nuclear localization and activity. We now mention this rationale in the results section, lines 352-356.

      (3) In figure 2, the authors showed that HXK-2 regulates mitochondrial localization of MML-1, and HXK-1 regulates nuclear localization of MML-1 through mitochondrial β-oxidation in glp‑1(-) mutants. Can the authors test whether mitochondrial β-oxidation affects the effects of hxk RNAi on longevity of glp-1(-) mutants? 

      Excellent suggestion. We tried to test this idea and found that acs-2 RNAi alone abolished glp-1 longevity, making epistasis experiments difficult to interpret. This is consistent with published data showing that glp-1 longevity requires NHR-49, a transcription factor that regulates mitochondrial β-oxidation and drives acs-2 expression (Ratnappan et al., 2014). It could well be that β-oxidation inhibition promotes MML-1 nuclear localization but abolishes lifespan extension because of epistatic effects on other transcription factors or processes. Elucidating the exact mechanism would require further investigation that goes beyond the scope of the paper.

      (4) The authors showed that 2-deoxy-glucose, which decreases the activity of HXK, decreased the nuclear localization of MML-1, and this is consistent with their genetic data. Based on these data, 2-deoxy-glucose is expected to decrease longevity. Interestingly, however, 2-deoxy-glucose has been reported to increase lifespan by restricting glucose, whereas extra glucose intake decreases lifespan in C. elegans, shown by multiple research groups, including M. Ristow, C. Kenyon, and S.J.V. Lee labs. This is seemingly paradoxical and worth discussing with key references, especially because MondoA and Chrebp are known as glucose-responsive transcription factors. 

      Thank you for this important comment. 2-DG has been shown to extend lifespan by suppressing glucose metabolism at concentrations ranging from 0.1 to 5 mM, whereas higher concentrations ranging from 20 to 50 mM had the opposite effect, decreasing lifespan (Schulz et al., 2007). We tested 50 mM 2-DG and observed decreased MML-1 nuclear localization, which is consistent with the previous data showing decreased longevity. We now raise this point in the discussion, suggesting that mild inhibition of glucose metabolism has beneficial effects on longevity, while strong suppression shortens lifespan (lines 411-414).

      Minor comments 

      (1) The current Introduction does not include an explicit statement that MML-1 and MondoA are homologs. Please clarify this as naive readers may be confused.

      Thank you for pointing this out. We now say in the intro that MondoA and MML-1 are homologs (lines 59-60).

      (2) In figure 1, the effect of hxk-3 on nuclear localization of MML-1 is small compared to those of hxk-1 and hxk-2. Please add speculation about why HXK-3 has a different role in nuclear localization of MML-1 compared to HXK-1 and HXK-2.

      According to GExplore 1.4 (Hutter & Suh, 2016), hxk-3 expression declines during larval development and is low in the adult. Perhaps it has little effect in the young adult, and the other hexokinases suffice to support MML-1 nuclear localization. It also remains possible that hxk-3 is not required in glp-1, but is required in other longevity pathways.

      (3) The authors tested the effects of genetic inhibition of hxk-1 and hxk-2 on the regulation of MML-1 localization and lifespan of glp-1(-) mutants by using RNAi. I wonder whether the authors can perform the experiments with hxk-1 or hxk-2 loss (or reduction) of function mutants. If they cannot, please discuss the reason and the limitations of RNAi. 

      This is an important point raised by the reviewer. We found that RNAi was most effective for phenotypes related to MML-1 nuclear localization and longevity, likely because it results in acute knockdown. We also showed that pharmacological inhibition of hexokinase function with 3BrP and 2‑DG (Supplementary Figure 1B and 1C) and the PPP with 6-AN (Figure 3B) had consistent results with our observation with RNAi.

      We generated hexokinase KO mutants by deleting the coding sequence of each hexokinase by CRISPR/Cas9. First, we measured the expression of each hexokinase isozyme in each mutant. Notably, hxk-1(syb1271) null mutant had higher expression of hxk-2 and hxk-3, hxk-2(syb1261) did not significantly affect the expression of hxk-1 and hxk-3, and hxk-3(syb1267) had a mild increase in hxk-2 expression. We followed up on the hxk-1(syb1271) and hxk-2(syb1261) and crossed these mutants with our MML-1::GFP reporter. We observed a modest but significant reduction in MML-1 nuclear localization in both strains. The effect with RNAi is much stronger in comparison to the null mutants, potentially due to a compensatory upregulation of the other hexokinases in the mutants that we do not observe with RNAi (Supplementary Figure 1D-E). Another alternative is that there is a threshold in the effects of hexokinase function on MML-1 nuclear localization. We tried to generate a hxk-1; hxk-2 double mutant but it was lethal and therefore did not pursue this further.

      Author response image 2.

      (4) Please correct minor typos throughout the manuscript. Following are some examples.

      - On page 4, line 111, please correct "Supplementary Figure D-E" to "Supplementary Figure 1D-E".

      - On page 9, line 272, please correct "3A-B" to "4A-B". 

      - On page 9, line 275, please correct "S4" to "4". 

      - On page 10, line 309, please correct "4A" to "4B" 

      Corrected.

      (5) In Fig. 3E, please add the information about the scale bars in figure legends.

      Corrected.

      Reviewer #2 (Recommendations For The Authors):

      Here are some detailed suggestions for the authors:

      (1) Since MML-1/MXL-2 complex functions in multiple longevity models, e.g. DR, ILS, what are the roles of HXK-1 and HXK-2 in these models? 

      We now show that although mml-1 is required in most longevity pathways, hxk-1 and hxk-2 are required in some pathways (glp-1, isp-1) but not others (daf-2, raga-1). See above for more details.

      (2) As for the metabolites screening, the lipid metabolic genes can be included. Not only for the above reasons, also previous study had found that the mml-1 mRNA levels and MML-1 GFP nuclear localization were all increased in the glp-1 model, while mml-1 mRNA levels were unaffected by hxk knockdown, suggesting more pathways be involved. 

      We agree with the reviewer that understanding what metabolites regulate MML-1 nuclear localization and activity is an important, yet challenging question. Our studies demonstrate a role of glucose metabolism, in particular hexokinase, in this process, consistent with hexose-p being activators of MondoA. Our data also suggest that mechanisms beyond hexose-p regulate MML-1, since knockdown of the PPP components stimulates MML-1 even when hxk-2 is depleted and G6P is low, and inhibition of the PPP with 6-AN stimulates MondoA nuclear localization under starvation conditions in mammalian cell culture. We tested redox regulation, nucleoside, and lipid metabolism as candidate processes (see below). Notably, our data suggest this other mechanism is tied to lipid metabolism through droplet size, since various perturbations that impact LD size and number (atgl-1, dgat-2, tkt-1, Figure 4) affected MML-1 nuclear localization. It remains an open question whether MML-1 is regulated by other metabolites through a ligand-protein interaction or not. We cannot exclude that beyond lipid droplet regulation, specific lipids, other metabolites, or metabolic modules linked to the PPP might regulate MML-1 nuclear localization and activity.

      We employed genetic manipulation and pharmacological inhibition to understand the upstream signals that regulate MML-1. These approaches will not be sufficient to determine whether other metabolite(s) are involved in MML-1/MondoA translocation to the nucleus through a direct interaction. Novel technologies that determine protein-metabolite interactions (e.g. MIDAS) will help us answer this question in future work, but this goes beyond the scope of this paper. As a compromise, we discuss possible metabolites that may orchestrate this, based on our observations of MML-1 subcellular localization at LD/mitochondria (including PPP and TCA cycle intermediates).

      (3) Line 238, it should be "NADPH". 

      Corrected.

      (4) RNAi targeting enzymes of different branches of PPP can be performed

      In our initial screen, we examined the effect of various enzymes of the PPP on MML-1 nuclear localization (Figure 1A, Supplementary Table S1) and found that knockdown of enzymes in both the oxidative phase (PGDH/T25B9.9) and non-oxidative phase (transketolase/TKT-1) affects MML-1 nuclear localization. In line with this, 6-AN treatment, which affects the oxidative phase, also stimulated MML-1 nuclear localization (Figure 3B). We also observed that knockdown of enzymes involved in ribose 5P conversion to ribose, ribose 1P, and phosphoribosyl pyrophosphate, an intermediate in nucleotide biosynthesis, decreased MML-1 nuclear localization (rpia-1, F07A11.5, Y43F4B.5, R151.2; Supplementary Table S1). Whether MML-1/MondoA responds to the nucleotide pool remains elusive.

      (5) As for PPP, there are many possibilities that can be tested. For example, as PPP supplies NADPH for oxidative balance, does MML-1 respond to ROS? Also, it appears the genes in the non-oxidative arm of PPP regulate MML-1, so is nucleotide synthesis involved?

      Thank you for the suggestion. We tested other enzymes involved in NADPH production from the folate cycle and observed a mild but significant reduction of MML-1 nuclear localization upon dao-3i (Supplementary Table S1). Moreover, we tested whether MML-1 nuclear localization is responsive to ROS. While paraquat exposure induced oxidative stress, as measured by the transcriptional reporter gst-4p::GFP (Supplementary Figure 3A), it did not significantly affect MML-1 nuclear localization (Supplementary Figure 3B). Therefore, we think it less likely that NADPH production acting through redox regulation is the main effect.

      We also tried supplementation with some of the metabolite outputs of PPP including ribose, ribulose, and xylulose, as well as nucleosides (see below), but saw no effect on MML-1 nuclear localization. We agree that further studies are required to pinpoint whether there is another metabolic moiety regulating MML-1 at the protein-ligand level, but this goes beyond the scope of the current investigation.

      Author response image 2.

    1. eLife assessment

      This fundamental study reports the deep evolutionary conservation of a core genetic program regulating spermatogenesis in flies, mice, and humans. Convincing data were presented and supported the main conclusion. This work will be of interest to evolutionary and reproductive biologists.

    2. Reviewer #1 (Public Review):

      Summary:

      By combining an analysis of the evolutionary age of the genes expressed in male germ cells, a study of genes associated with spermatocyte protein-protein interaction networks and functional experiments in Drosophila, Brattig-Correia and colleagues provide evidence for an ancient origin of the genetic program underlying metazoan spermatogenesis. This leads to the identification of a relatively small core set of functional interactions between deeply conserved gene expression regulators, whose impairment is then shown to be associated with cases of human male infertility.

      Strengths:

      In my opinion, the work is important for three different reasons. First, it shows that, even though reproductive genes can evolve rapidly and male germ cells display a significant level of transcriptional noise, it is still possible to obtain convincing evidence that a conserved core of functionally interacting genes lies at the basis of the male germ transcriptome. Second, it reports an experimental strategy that could also be applied to gene networks involved in different biological problems. Third, the authors make a compelling case that, due to its effects on human spermatogenesis, disruption of the male germ cell orthoBackbone can be exploited to identify new genetic causes of infertility.

      Weaknesses:

      The main strength of the general approach followed by the authors is, inevitably, also a weakness. This is because a study rooted in comparative biology is unlikely to identify newly emerged genes that may adopt key roles in processes such as, for example, species-specific gamete recognition. Additionally, the use of a TPM >1 threshold for protein-coding transcripts - which, as the authors pointed out, was a necessary compromise due to the high transcriptional noise of the system under study - may exclude genes, such as those encoding proteins required for gamete fusion, which are thought to be expressed at a very low level. Although these considerations raise the possibility that the chosen approach may miss information that, depending on the species, could be potentially highly functionally important, this by no means reduces its value in identifying genes belonging to the conserved genetic program of spermatogenesis. Moreover, as mentioned in the Discussion, future variations of the pipeline described in the manuscript may allow us to extend the reach of the present analysis.

    3. Reviewer #2 (Public Review):

      Summary:

      This is a tour de force study that aims to understand the genetic basis of male germ cell development across three animal species (human, mouse and flies) by performing a genetic program conservation analysis (using phylostratigraphy and network science) with a special emphasis on genes that peak or decline during mitosis-to-meiosis. This analysis, in agreement with previous findings, reveals that several genes active during and before meiosis are deeply conserved across species, suggesting ancient regulatory mechanisms. To identify critical genes in germ cell development, the investigators integrated clinical genetics data, performing gene knockdown and knockout experiments in both mice and flies. Specifically, over 900 conserved genes were investigated in flies, with three of these genes further studied in mice. Of the 900 genes in flies, ~250 RNAi knockdowns had fertility phenotypes. The fertility phenotypes for the fly data can be viewed using the following browser link: https://pages.igc.pt/meionav. The scope of target gene validation is impressive. Below are a few minor comments.

      (1) In Supplemental Figure 2, it is notable that enterocyte transcriptomes are predominantly composed of younger genes, contrasting with the genetic age profile observed in brain and muscle cells. This difference is an intriguing observation and it would be interesting to hear the authors' comments.

      (2) Regarding the document, the figures provided only include supplemental data; none of the main text figures are in the full PDF.

      (3) Lastly, it would be great to section and stain mouse testis to classify the different stages of arrest during meiosis for each of the mouse mutants in order to compare more precisely to flies.

      This paper serves as a vital resource, emphasizing that only through the analysis of hundreds of genes can we prioritize essential genes for germ cell development. It is remarkable that about 60% of conserved genes have no apparent phenotype during germ cell development.

      Strengths:

      High-throughput screening was conducted on a conserved network of 920 genes expressed during the mitosis-to-meiosis transition. Approximately 250 of these genes were associated with fertility phenotypes. Notably, mutations in 5 of the 250 genes have been identified in human male infertility patients. Furthermore, 3 of these genes were modeled in mice, where they were also linked to infertility. This study establishes a crucial groundwork for future investigations into germ cell development genes, aiming to delineate their essential roles and functions.

      Weaknesses:

      The fertility phenotyping in this study is limited, yet dissecting the mechanistic roles of these proteins falls beyond its scope. Nevertheless, this work serves as an invaluable resource for further exploration of specific genes of interest.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment:

      This important study reports the deep evolutionary conservation of a core genetic program regulating spermatogenesis in flies, mice, and humans. The data presented are supportive of the main conclusion and generally convincing. This work will be of interest to evolutionary and reproductive biologists.

      The Authors would like to thank the Senior Editor and the two Reviewers for their positive assessment of our work, as well as for the helpful suggestions. Collectively, these suggestions provided insight that was instrumental in shaping the final version of the manuscript (see below for our point-by-point comments). The Authors believe that the refinements introduced to the final document clearly translate into an improved version of our work. Hence, we would like to thank all those involved in the peer review process for their encouraging words and constructive criticism.

      Public Reviews: 

      Reviewer #1 (Public Review):

      Summary: 

      By combining an analysis of the evolutionary age of the genes expressed in male germ cells, a study of genes associated with spermatocyte protein-protein interaction networks and functional experiments in Drosophila, Brattig-Correia and colleagues provide evidence for an ancient origin of the genetic program underlying metazoan spermatogenesis. This leads to identifying a relatively small core set of functional interactions between deeply conserved gene expression regulators, whose impairment is then shown to be associated with cases of human male infertility.

      Strengths: 

      In my opinion, the work is important for three different reasons. First, it shows that, even though reproductive genes can evolve rapidly and male germ cells display a significant level of transcriptional noise, it is still possible to obtain convincing evidence that a conserved core of functionally interacting genes lies at the basis of the male germ transcriptome. Second, it reports an experimental strategy that could also be applied to gene networks involved in different biological problems. Third, the authors make a compelling case that, due to its effects on human spermatogenesis, disruption of the male germ cell orthoBackbone can be exploited to identify new genetic causes of infertility.

      We thank the Reviewer for their positive assessment. Indeed, it was our main objective to convincingly demonstrate these three points.

      Weaknesses: 

      The main strength of the general approach followed by the authors is, inevitably, also a weakness. This is because a study rooted in comparative biology is unlikely to identify newly emerged genes that may adopt key roles in processes such as species-specific gamete recognition. Additionally, using a TPM >1 threshold for protein-coding transcripts may exclude genes, such as those encoding proteins required for gamete fusion, which are thought to be expressed at a very low level. Although these considerations raise the possibility that the chosen approach may miss information that, depending on the species, could be potentially highly functionally important, this by no means reduces its value in identifying genes belonging to the conserved genetic program of spermatogenesis.

      The Authors acknowledge the points raised by the Reviewer as inevitable trade-offs of the focus of our study (to uncover the deeply conserved genetic basis of spermatogenesis). Certainly, our pipeline could, in the future, be adapted to look for newly emerged genes or to employ different minimum expression cut-offs. To this end, we made all computational data and custom scripts easily available to the community. We would, nevertheless, kindly emphasize the challenge associated with the use of less restrictive TPM cut-offs, given the substantial level of transcriptional noise associated with this cell type. An abridged version of this discussion can be found in lines 512-515 of the manuscript.
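      The trade-off discussed above can be made concrete with a minimal sketch of a TPM cut-off. The gene names, counts, and transcript lengths below are hypothetical toy values, and the functions `tpm` and `expressed` are illustrative helpers, not the authors' pipeline; only the "TPM >1" threshold is taken from the text.

```python
def tpm(counts, lengths_kb):
    """Convert raw read counts to transcripts per million (TPM)."""
    # Reads per kilobase: normalise each gene's count by transcript length.
    rpk = {gene: counts[gene] / lengths_kb[gene] for gene in counts}
    # Scale so that the values of all genes sum to one million.
    scale = sum(rpk.values()) / 1_000_000
    return {gene: value / scale for gene, value in rpk.items()}


def expressed(tpm_values, cutoff=1.0):
    """Return the genes whose TPM is strictly above the cut-off."""
    return {gene for gene, value in tpm_values.items() if value > cutoff}


# Hypothetical toy data (assumption): geneC stands in for a transcript
# expressed at a very low level, such as a gamete fusion factor.
counts = {"geneA": 900_000, "geneB": 99_999, "geneC": 1}
lengths_kb = {"geneA": 1.0, "geneB": 1.0, "geneC": 1.0}

values = tpm(counts, lengths_kb)
kept = expressed(values)  # geneC sits at the threshold and is excluded
```

      Lowering `cutoff` would retain geneC, but in a noisy transcriptome it would also admit spurious transcripts, which is the compromise the Authors describe.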

      Reviewer #2 (Public Review):

      Summary: 

      This is a tour de force study that aims to understand the genetic basis of male germ cell development across three animal species (human, mouse, and flies) by performing a genetic program conservation analysis (using phylostratigraphy and network science) with a special emphasis on genes that peak or decline during mitosis-to-meiosis. This analysis, in agreement with previous findings, reveals that several genes active during and before meiosis are deeply conserved across species, suggesting ancient regulatory mechanisms. To identify critical genes in germ cell development, the investigators integrated clinical genetics data, performing gene knockdown and knockout experiments in both mice and flies. Specifically, over 900 conserved genes were investigated in flies, with three of these genes further studied in mice. Of the 900 genes in flies, ~250 RNAi knockdowns had fertility phenotypes. The fertility phenotypes for the fly data can be viewed using the following browser link: https://pages.igc.pt/meionav. The scope of target gene validation is impressive. Below are a few minor comments.

      We thank the Reviewer for their positive appraisal of our work.

      (1) In Supplemental Figure 2, it is notable that enterocyte transcriptomes are predominantly composed of younger genes, contrasting with the genetic age profile observed in brain and muscle cells. This difference is an intriguing observation and it would be interesting to hear the authors' comments.

      Indeed, this is an intriguing observation for which we can only provide a speculative answer. Enterocytes are specialized to absorb nutrients, hence their genetic program is finely tuned to maximize uptake under specific dietary conditions. In this regard, we can posit that variations in nutrient preference/availability in the course of each species’ evolutionary history (associated with habitat, environmental and/or behavioral changes) may have exerted a selective pressure for the emergence of new genes that could provide enterocytes with more efficient uptake capabilities under new circumstances. The application of evolutionary thinking to the rapidly expanding field of nutrigenomics could shed light on this possibility.

      (2) Regarding the document, the figures provided only include supplemental data; none of the main text figures are in the full PDF. 

      We thank the Reviewer for this helpful comment. We will ensure that the three main figures are correctly formatted in the final version of the manuscript.

      (3) Lastly, it would be great to section and stain mouse testis to classify the different stages of arrest during meiosis for each of the mouse mutants in order to compare more precisely to flies.

      We agree with the Reviewer that adding more mouse data would further improve what can already be considered an extensive body of experimental work. Given the costs associated with the generation of such data (in terms of resources and otherwise), the Authors believe such a study would be best suited to a follow-up manuscript.

      This paper serves as a vital resource, emphasizing that only through the analysis of hundreds of genes can we prioritize essential genes for germ cell development. It is remarkable that about 60% of conserved genes have no apparent phenotype during germ cell development.

      Once again, we thank the Reviewer for their positive assessment of our work. Clarifying the degree of functional redundancy in an essential biological process such as male gametogenesis represents an exciting (and experimentally complex) future challenge.

      Strengths:

      The high-throughput screening was conducted on a conserved network of 920 genes expressed during the mitosis-to-meiosis transition. Approximately 250 of these genes were associated with fertility phenotypes. Notably, mutations in 5 of the 250 genes have been identified in human male infertility patients. Furthermore, 3 of these genes were modeled in mice, where they were also linked to infertility.

      This study establishes a crucial groundwork for future investigations into germ cell development genes, aiming to delineate their essential roles and functions.

      The Authors thank the Reviewer for emphasizing the potential usefulness of our results to the community, as that was one of the main motivations behind this project.

      Weaknesses: 

      The fertility phenotyping in this study is limited, yet dissecting the mechanistic roles of these proteins falls beyond its scope. Nevertheless, this work serves as an invaluable resource for further exploration of specific genes of interest.

      Please see the previous point.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Although the manuscript already includes a significant amount of data, there are two aspects that the authors may consider exploring: 

      (1) I understand that the choice of species whose gene expression was analyzed in the study was largely influenced by the quality of the corresponding genome annotations. However, since in evolutionary terms humans and mice are much closer to each other than Drosophila (as also shown in Figure 1c and Supplementary Figure 1), I found the statement "three evolutionarily distant gonochoric species" partially questionable. Have the authors considered adding an additional established animal model, such as for example zebrafish, to provide further coverage of the evolutionary space? Or, alternatively, could a posteriori analysis of the transcriptome of such an additional species be used to cross-validate their findings? The authors touch upon this point in the Discussion, but I wonder if they actually tried something in this direction, or simply decided that the currently available expression data from other organisms was too poor to be used for this purpose.

      We thank the Reviewer for bringing up this point, as it echoes one of our main concerns in terms of our approach (as discussed in lines 487-492). Indeed, when we were designing our study, we extensively discussed whether zebrafish and C. elegans datasets should be included, as high-quality expression and phenotypical data were available for both species. We ended up not including them for one main reason: the sexual system of these species deviates from that of humans, mice and fruit flies (all gonochoric species). More specifically, C. elegans are hermaphrodites and although zebrafish is a gonochoric species at the adult stage, they start their lifecycle as juvenile hermaphrodites (they first develop juvenile ovaries that later degenerate into a testis in males). Since it is largely unknown to what extent the transcriptome of male germ cells from these species deviates from the gonochoric program (by retaining oogenesis-related characteristics, for example), we decided to avoid possible confounding effects by excluding the two species. Undoubtedly, as more transcriptomic data from non-model organisms become available, these (and other) questions can be extensively revisited as our pipeline was designed to easily accommodate new data.

      (2) Although the use of the STRING database is a sensible choice given the general purpose of this work, in my experience the reliability of its individual interactions can vary significantly. I wonder if the authors have considered exploiting AlphaFold-Multimer as a parallel approach to estimate what proportion of the 79 functional interactions that they identified may reflect direct protein-protein contacts.

      We thank the Reviewer for this question and suggestion, as we were also concerned about STRING's reliability for individual interactions. For that reason, we only utilized protein-protein interactions with a STRING combined confidence score ≥0.5 (corresponding to the estimated likelihood of a given association being true), as described in more detail in the "Protein-protein interaction (PPI) network construction" subsection. In addition, to make sure we were not biasing results towards conserved genes (which could arguably be overrepresented in STRING), we pursued a random rewiring test of degree centrality and page rank, as detailed in section "Deeply conserved genes are central components of the male germ cell transcriptome". We very much like the suggestion of using AlphaFold-Multimer to estimate the proportion of direct protein-protein contacts for the 79 core interactions, but given the already quite complex analytical pipeline of the present work, we will leave such analysis for a follow-up study. The final version of the manuscript now contains a reference to such an approach (lines 499-502).
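      As a rough illustration of the filtering and centrality steps described in this response, the sketch below keeps edges with a combined score of at least 0.5, computes degree centrality, and compares a gene set against random gene sets of the same size. The edge list and gene names are hypothetical, and the label-permutation null used here is a simpler stand-in chosen for brevity; it is not the random rewiring test the Authors report, and page rank is not computed.

```python
import random
from collections import defaultdict


def filter_edges(edges, min_score=0.5):
    """Keep interactions whose combined confidence score is >= min_score."""
    return [(a, b) for a, b, score in edges if score >= min_score]


def degree_centrality(edges):
    """Fraction of the other nodes that each node is connected to."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    return {gene: len(nb) / (n - 1) for gene, nb in neighbours.items()}


def permutation_p_value(centrality, gene_set, n_perm=1000, seed=0):
    """Compare the mean centrality of gene_set with random sets of equal size."""
    rng = random.Random(seed)
    genes = sorted(centrality)
    k = len(gene_set)
    observed = sum(centrality[g] for g in gene_set) / k
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, k)
        if sum(centrality[g] for g in sample) / k >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy network (assumption): (gene, gene, combined score).
edges = [
    ("g1", "g2", 0.90),
    ("g1", "g3", 0.70),
    ("g1", "g4", 0.60),
    ("g2", "g3", 0.55),
    ("g4", "g5", 0.30),  # below the 0.5 cut-off, so it is dropped
]

kept = filter_edges(edges)
centrality = degree_centrality(kept)
observed, p_value = permutation_p_value(centrality, gene_set={"g1"})
```

      In this toy network g1 is connected to every other retained gene, so its degree centrality is 1.0; whether that is unusual depends on the null model, which is why the choice of randomization matters.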

      Finally, probably because my primary focus is not on gene regulation, I must say that I found the manuscript somewhat heavy to read. The integration of various data types and analyses, while enriching, also complicates the ability to clearly recall the main conclusions of each result section by the time one reaches the summary at the beginning of the Discussion. Given the relative brevity of the latter, expanding it to both reiterate what these conclusions are and illustrate how all the components converge to support the central message of the study would, in my opinion, benefit a general readership.  

      We thank the Reviewer for their fresh perspective on our document and for this most welcome suggestion. The final version of the manuscript now includes a longer discussion, containing an initial paragraph (lines 467-479) that summarizes our main findings and how they converge into a coherent body of work.

      Additionally, on a minor note, I suggest that the concept of phylostratigraphy be briefly explained when first mentioned in the Introduction, rather than later in the manuscript. This early clarification would aid comprehension for readers unfamiliar with the term. 

      To safeguard the flow of the manuscript, we have slightly tweaked the introduction section to avoid the use of highly specific terminology (such as phylostratigraphy) this early in the text. We replaced it with “comparison of genome sequences” (line 85). Phylostratigraphy is later explained in full detail in the corresponding section of the manuscript. We thank the Reviewer for this helpful suggestion.

      Reviewer #2 (Recommendations For The Authors): 

      Major concern - the absence of main text figures.

      We thank the Reviewer for this helpful comment. We will ensure that the three main figures are correctly formatted in the final version of the manuscript.

      Typos throughout - this will need your attention. 

      The Authors thank the Reviewer for the thorough and attentive assessment of our work. We have carefully revised the text to ensure a pleasant reading experience free of typographical errors.

    1. eLife assessment

      This manuscript describes an unexpected role of cellular caspases in cleaving Drp1, a protein involved in mitochondrial fission, in virus-infected cells. Drp1 cleavage augments mitochondrial fission, reinforcing MAVS-dependent type-1 IFN response against multiple viruses. The findings presented in this manuscript are important and the strength of evidence is solid. Additional studies may allow for more robust mechanistic substantiation of the proposed model.

    2. Reviewer #1 (Public Review):

      Drp1 supports mitochondrial fission (doi: 10.1038/s41586-019-1296-y). Viral sensing triggers mitochondrial fusion, leading to MAVS aggregation and improved type-1 IFN response. It was suggested that impairment of Drp1 upon phosphorylation by Tbk1 enhances mitochondrial fusion in virus-infected cells (doi.org/10.1016/j.molcel.2020.10.018). In this manuscript, Fang et al. describe an unexpected role of caspases activated upon Rift Valley fever virus (RVFV) infection in inactivating Drp1. They show that Drp1 is targeted by multiple caspases, including caspase-3, -6, -7 and -8. Indeed, cleavage of Drp1 leads to mitochondrial elongation, boosting the type-1 IFN response of infected cells. Finally, the authors establish the generalisability of the proposed mechanism in the context of cellular infections with H1N1, SeV, and HSV-1. Caspase-dependent and independent cell death processes provide important host defence mechanisms against obligatorily intracellular viral pathogens. This work suggests that caspases reinforce antiviral response involving also the mitochondria-type 1 IFN axis. As such, the manuscript is well written, and the proposal pertaining to caspase-mediated targeting of Drp1 may have implications beyond host-virus interaction studies. However, several loose ends remain, and these concerns need to be addressed to substantiate the mechanistic model.

    3. Reviewer #2 (Public Review):

      In the present study, the authors report the role of virus-induced apoptosis in positively regulating the innate immune response. Upon infection, host cell apoptosis is triggered as a defence mechanism against virus replication. Culmination of infected-cell death impairs replicative potential for viruses, hence attenuating virus propagation. Reports exist denoting the inhibitory effect of apoptosis upon innate immune signalling. Contrary to that, the findings of this manuscript underscore the possible role of apoptosis in enhancing innate immune signalling and effector response. Infection-induced activation of caspases (3, 6, 7, and 8) has been demonstrated to cleave DRP1 protein. Degradation of DRP1, a positive regulator of mitochondrial fission, leads to altered mitochondrial morphology (elongation).

      Because mitochondria are a hub for innate immune signalling (via operation of the RLR-MAVS-downstream effector molecule axis), their elongation as a result of DRP1 depletion results in greater innate immune signal flux and interferon induction. Increased interferon induction thus acts to inhibit virus propagation, as demonstrated by the authors using cell-culture models.

      Strengths:

      (1) The findings presented by the authors have been validated by employing elaborate biochemical experimental approaches. The study entails extensive biochemical characterization of DRP1 residues targeted by activated caspases, in vitro assays validating caspase-mediated DRP1 cleavage & caspase-DRP1 interaction.

      (2) This study possesses broad implications since the authors demonstrate the role of caspase-mediated DRP1 cleavage in promoting innate immunity in the context of infection by diverse viruses (both RNA and DNA viruses).

      Weaknesses:

      Although the authors undertook a thorough experimental approach attempting to validate their findings, all the experiments were performed using either cell-culture models for infection or in vitro biochemical assays (cleavage and protein-protein interaction). Additional experimentation using animal models (in vivo) will further help strengthen the biological significance of their findings under more physiological settings.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors demonstrated that the NSs protein of RVFV triggers the activation of apoptotic caspases, which cleave the mitochondrial fission factor DRP1 resulting in mitochondrial elongation.

      Strengths:

      The manuscript provides an insightful investigation into a novel mechanism through which apoptotic caspases promote anti-viral immunity by regulating mitochondrial morphodynamics.

    1. eLife assessment

      This important work substantially advances our understanding of sperm motility regulation during the fertilization process by uncovering a midpiece/mitochondria contraction associated with motility cessation, and by identifying structural changes in the midpiece actin network as the mode of action involved. The evidence supporting the conclusion is solid, with rigorous live cell imaging using state-of-the-art microscopy, although more functional analysis of the midpiece/mitochondria contraction would have further strengthened the study. The work will be of broad interest to cell biologists working on the cytoskeleton, mitochondria, cell fusion, and fertilization.

    2. Reviewer #2 (Public Review):

      Summary:

      The authors used state-of-the-art microscopy to analyze the structural changes that occur in sperm tails after the acrosome reaction. They found that midpiece contraction and actin reorganization occurred, which is associated with the cessation of flagellar motility during sperm-egg fusion. The mechanism by which flagellar motility is arrested during sperm-oocyte fusion is unknown, and this study proposes a novel mechanism and provides important insights for cell and reproductive biologists.

      In the revised manuscript, the authors addressed most of my concerns.

      Strength:

      Various microscopy techniques including super-resolution microscopy and scanning electron microscopy were used to analyze structural organization of the midpiece in detail.

    3. Reviewer #3 (Public Review):

      While progressive and also hyperactivated motility are required for sperm to reach the site of fertilization and to penetrate oocyte's outer vestments, during fusion with the oocyte's plasma membrane it has been observed that sperm motility ceases. Identifying the underlying molecular mechanisms would provide novel insights into a crucial but mostly overlooked physiological change during the sperm's life cycle. In this publication the authors aim to provide evidence that the helical actin structure surrounding the sperm mitochondria in the midpiece plays a role in regulating sperm motility, specifically the motility arrest during sperm fusion but also during earlier cessation of motility in a subpopulation of sperm post acrosomal exocytosis.

      The main observation the authors make is that a subpopulation of sperm undergoing acrosomal exocytosis, as well as sperm that fuse with the plasma membrane of the oocyte, display a decrease in midpiece diameter of 30 nm. The authors propose the decrease in midpiece diameter via various microscopy techniques based on membrane dyes and bright-field images. In the revised version of the manuscript, a change in midpiece diameter is now confirmed via electron microscopy, even though the difference is not significant. The authors also propose that the midpiece diameter decrease is driven by changes in sperm intracellular Ca2+ and structural changes of the actin helix network. Future studies are still needed to confirm the causality of these events and explore the discrepancy between fluorescence microscopy results and SEM. Overall, the authors should further tone down their conclusions.

    1. eLife assessment

      This useful study reports datasets on gene expression and chromatin accessibility profiles of spermatogonia at different postnatal ages in mice. The supporting data are considered incomplete. This study may be of interest to biomedical researchers working on male germline stem cells and male fertility.

    2. Reviewer #1 (Public Review):

      Summary

      This study was designed to investigate changes in gene expression and associated chromatin accessibility patterns in spermatogonia in mice at different postnatal stages from pups to adults. The objective was to describe dynamic changes in these patterns that potentially correlate with functional changes in spermatogonia as a function of development and reproductive maturation. The potential utility of this information is to serve as a reference against which similar data from animals subjected to various disruptive environmental influences can be compared.

      Major Strengths and Weaknesses of the Methods and Results

      A strength of the study is that it reviews previously published datasets describing gene expression and chromatin accessibility patterns in mouse spermatogonia. A weakness of the study is that it is not clear what new information is provided by the data provided that was not already known from previously published studies (see below). Specific weaknesses include the following...

      - Terminology - In the Abstract and first part of the Introduction the authors use the generic term "spermatogonial cells" in a manner that seems to be referring primarily to spermatogonial stem cells (SSCs) but initially ignores the well-known heterogeneity among spermatogonia - particularly the fact that only a small proportion of developing spermatogonia become SSCs - and ONLY those SSCs and NOT other developing spermatogonia - support steady-state spermatogenesis by retaining the capacity to either self-renew or contribute to the differentiating spermatogenic lineage throughout the male reproductive lifespan. The authors eventually mention other types of developing male germ cells, but their description of prospermatogonial stages that precede spermatogonial stages is deficient in that M-prospermatogonia - which occur after PGCs but before T1-prospermatogonia - are not mentioned. This description also seems to imply that all T2-prospermatogonia give rise to SSCs which is far from the case. It is the case that prospermatogonia give rise to spermatogonia, but only a very small proportion of undifferentiated spermatogonia form the foundational SSCs and ONLY SSCs possess the capacity to either self-renew or give rise to sequential waves of spermatogenesis.

      - Introduction - Statements regarding distinguishing transcriptional signatures in spermatogonia at different postnatal stages appear to refer to ALL subtypes of spermatogonia present at each stage collectively, thereby ignoring the well-known fact that there are distinct spermatogonial subtypes present at each postnatal stage and that some of those occur at certain stages but not at others. This brings into question the usefulness of the authors' discussion of what types of genes are expressed and/or what types of changes in chromatin accessibility are detected in spermatogonia at each stage.

      - Methodology - The authors based recovery (enrichment) of spermatogonia from male pups on FACS sorting for THY1 and RMV-1. While sorting total testis cells for THY1+ cells does enrich for spermatogonia, this approach is now known to not be highly specific for spermatogonia (somatic cells are also recovered) and definitely not for SSCs. There are more effective means for isolating SSCs from total testis cells that have been validated by transplantation experiments (e.g. use of the Id4/eGFP transgene marker).

      The authors then used "deconvolution" of bulk RNA-seq data in an attempt to discern spermatogonial subtype-specific transcriptomes. It is not clear why this is necessary or how it is beneficial given the availability of multiple single-cell RNA-seq datasets already published that accomplish this objective quite nicely - as the authors essentially acknowledge. Beyond this concern, a potential flaw with the deconvolution of bulk RNA-seq data is that this is a derivative approach that requires assumptions/computational manipulations of apparent mRNA abundance estimates that may confound interpretation of the relative abundance of different cellular subtypes within the heterogeneous cell population from which the bulk RNA-seq data is derived. Bottom line, it is not clear that this approach affords any experimental advantage over use of the publicly available scRNA-seq datasets, and it is possible that attempts to employ this approach may be flawed, yielding misleading data.

      - Results & Discussion - In general, much of the information reported in this study is not novel. The authors' discussion of the makeup of various spermatogonial subtypes in the testis at various ages does not really add anything to what has been known for many years on the basis of classic morphological studies. Further, as noted above, the gene expression data provided by the authors on the basis of their deconvolution of bulk RNA-seq data does not add any novel information to what has been shown in recent years by multiple elegant scRNA-seq studies - and, in fact, as also noted above - represents an approach fraught with potential for misleading results. The potential value of the authors' report of "other cell types" not corresponding to major somatic cell types identified in earlier published studies seems quite limited given that they provide no follow-up data that might indicate the nature of these alternative cell types. Beyond this, much of the gene expression and chromatin accessibility data reported by the authors - by their own admission given the references they cite - is largely confirmatory of previously published results. Similarly, results of the authors' analyses of putative factor binding sites within regions of differentially accessible chromatin also appear to confirm previously reported results. Ultimately, it is not at all novel to note that changes in gene expression patterns are accompanied by changes in patterns of chromatin accessibility in either related promoters or enhancers. The discussion of these observations provided by the authors takes on more of a review nature than that of any sort of truly novel results. As a result, it is difficult to discern how the data reported in this manuscript advance the field in any sort of novel or useful way beyond providing a review of previously published studies on these topics.

      Likely impact - The likely impact of this work is relatively low because, other than the value it provides as a review of previously published datasets, the new datasets provided are not novel and so do not advance the field in any significant manner.

    3. Reviewer #2 (Public Review):

      This revised manuscript attempts to explore the underlying chromatin accessibility landscape of spermatogonia from the developing and adult mouse testis. The key criticism of the first version of this manuscript was that bulk preparations of mixed populations of spermatogonia were used to generate the data that form the basis of the entire manuscript. To address this concern, the authors applied a deconvolution strategy (CIBERSORTx (Newman et al., 2019)) in an attempt to demonstrate that their multi-parameter FACS isolation (from Kubota 2004) of spermatogonia enriched for PLZF+ cells recovered spermatogonial stem cells (SSCs). PLZF (ZBTB16) protein is a transcription factor known to mark all or nearly all undifferentiated spermatogonia and some differentiating spermatogonia (KIT+ at the protein level) - see Niedenberger et al., 2015 (PMID: 25737569). The authors' deconvolution using single-cell transcriptomes produced at postnatal day 6 (P6) argues that 99% of the PLZF+ spermatogonia at P8 are SSCs, 85% at P15 and 93% in adults. Quite frankly, given the established overlap between PLZF and KIT and known identity of spermatogonia at these developmental stages, this is impossible. Indeed - the authors' own analysis of the reference dataset demonstrates abundant PLZF mRNA in P6 progenitor spermatogonia - what is the authors' explanation for this observation? The same is essentially true in the use of adult references for cell type assignment. The authors found 63-82% of SSCs using this different definition of types (from a different dataset), begging the question of which of these results is true.

      In their rebuttal, the authors also raise a fair point about the precision of differential gene expression among spermatogonial subsets. At the mRNA level, Kit is definitely detectable in undifferentiated spermatogonia, but it is never observed at the protein level until progenitors respond to retinoic acid (see Hermann et al., 2015). I agree with the authors that the mRNAs for "cell type markers" are rarely differentially abundant at absolute levels (0 or 1), but instead, there are a multitude of shades of grey in mRNA abundance that "separate" cell types, particularly in the male germline and among the highly related spermatogonial subtypes of interest (SSCs, progenitor spermatogonia and differentiating spermatogonia). That is, spermatogonial biology should be considered as a continuous variable (not categorical), so examining specific cell populations with defined phenotypes (markers, function) likely oversimplifies the underlying heterogeneity in the male germ lineage. But, here, the authors have ignored this heterogeneity entirely by selecting complex populations and examining them in aggregate. We already know that PLZF protein marks a wide range of spermatogonia, complicating the interpretation of aggregate results emerging from such samples. In their rebuttal, the authors nicely demonstrate the existence of these mixtures using deconvolution estimation. What remains a mystery is why the authors did not choose to perform single-cell multiome (RNA-seq + ATAC-seq) to validate their results and provide high-confidence outcomes. This is an accessible technique and was requested after the initial version, but essentially ignored by the authors.

      A separate question is whether these data are novel. A prior publication by the Griswold lab (Schleif et al., 2023; PMID: 36983846) already performed ATAC-seq (and prior data exist for RNA-seq) from germ cells isolated from synchronized testes. These existing data are higher resolution than those provided in the current manuscript because they examine germ cells before and after RA-induced differentiation, which the authors do not base on their selection methods. Another prior publication from the Namekawa lab extensively examined the transcriptome and epigenome in adult testes (Maezawa et al., 2000; PMID: 32895557; and several prior papers). The authors should explain how their results extend our knowledge of spermatogonial biology in light of the preceding reports.

      The authors are also encouraged to improve their use of terminology to describe the samples of interest. The mitotic male germ cells in the testis are called spermatogonia (not spermatogonial cells, because spermatogonia are cells). Spermatogonia arise from Prospermatogonia. Spermatogonia are divisible into two broad groups: undifferentiated spermatogonia (comprised of few spermatogonial stem cells or SSCs and many more progenitor spermatogonia - at roughly 1:10 ratio) and differentiating spermatogonia that have responded to RA. The authors also improperly indicate that SSCs directly produce differentiating spermatogonia - indeed, SSCs produce transit-amplifying progenitor spermatogonia, which subsequently differentiate in response to retinoic acid stimulation. Further, the use of Spermatogonial cells (and SPGs) is imprecise because these terms do not indicate which spermatogonia are in question. Moreover, there have been studies in the literature which have used similar terms inappropriately to refer to SSCs, including in culture. A correct description of the lineage and disambiguation by careful definition and rigorous cell type identification would benefit the reader.

      Overall, my concern from the initial version of this manuscript stands - critical methodological flaws prevent interpretation of the results and the data are not novel. Readers should take note that results in essentially all Figures do not reflect the biology of any one type of spermatogonium.

    4. Reviewer #3 (Public Review):

      In this study, Lazar-Contes and colleagues aimed to determine whether chromatin accessibility changes in the spermatogonial population during different phases of postnatal mammalian testis development. Because actions of the spermatogonial population set the foundation for continual and robust spermatogenesis and the gene networks regulating their biology are undefined, the goal of the study has merit. To advance knowledge, the authors used mice as a model and isolated spermatogonia from three different postnatal developmental age points using cell sorting methodology that was based on cell surface markers reported in previous studies and then performed bulk RNA-sequencing and ATAC-sequencing. Overall, the technical aspects of the sequencing analyses and computational/bioinformatics seem sound, but there are several concerns with the cell population isolated from testes and a lack of acknowledgement of previous studies that have also performed ATAC-sequencing on spermatogonia of mouse and human testes. The limitations, described below, call into question the validity of the interpretations and reduce the potential merit of the findings.

      I suggest changing the acronym for spermatogonial cells from SC to SPG for two reasons. First, SPG is the commonly used acronym in the field of mammalian spermatogenesis. Second, SC is commonly used for Sertoli Cells.

      The authors should provide a rationale for why they used postnatal day 8 and 15 mice.

      The FACS sorting approach used was based on cell surface proteins that are not germline specific, so there were undoubtedly somatic cells in the samples used for both RNA and ATAC sequencing. Thus, it is essential to demonstrate the level of both germ cell and undifferentiated spermatogonial enrichment in the isolated and profiled cell populations. To achieve this, the authors used PLZF as a biomarker of undifferentiated spermatogonia. Although PLZF is indeed expressed by undifferentiated spermatogonia, there have been several studies demonstrating that expression extends into differentiating spermatogonia. In addition, PLZF is not germ cell specific, and single cell RNA-seq analyses of testicular tissue have revealed that there are somatic cell populations that express Plzf, at least at the mRNA level. For these reasons, I suggest that the authors assess the isolated cell populations using a germ cell specific biomarker such as DDX4 in combination with PLZF to get a more accurate assessment of the undifferentiated spermatogonial composition. This assessment is essential for interpretation of the RNA-seq and ATAC-seq data that was generated.

      A previous study by the Namekawa lab (PMID: 29126117) performed ATAC-seq on a similar cell population (THY1+ FACS sorted) that was isolated from pre-pubertal mouse testes. It was surprising not to see this study referenced in the current manuscript. It also seems prudent to cross-reference the two ATAC-seq datasets for commonalities and differences. In addition, there are several published studies on scATAC-seq of human spermatogonia that might be of interest to cross-reference with the ATAC-seq data presented in the current study to provide an understanding of translational merit for the findings.

    1. eLife assessment

      This useful study draws on published single-cell and spatial transcriptomic data of colon cancer liver metastasis to clarify the pro- and anti-tumorigenic properties of NK cells. The authors discover increased GZMK+ resting NK cells in the tumor tissue and reduced abundance of KIR2DL4+ activated NK cells. However, the evidence is currently incomplete, as the models used to validate the hypothesis and claims are not adequate and lack the necessary controls.

    1. eLife assessment

      This manuscript presents a valuable machine-learning-based approach to the automated detection of urine and fecal deposits by rodents, key ethological behaviors that have traditionally been very poorly studied. The strength of evidence for their claim, however, that the method provides "easy, efficient, and unbiased spatiotemporal analysis of scent marking during behavioral experiments" is incomplete. In particular, there were concerns about the generalizability of the approach, the relatively limited detection capabilities of the method, and a lack of rationale for specific design choices. This manuscript could be of interest to researchers in animal behavior, neuroscience, and automated animal tracking.

    2. Reviewer #1 (Public Review):

      Summary:
      The manuscript provides a novel method for the automated detection of scent marks from urine and feces in rodents. Given the importance of scent communication in these animals and their role as model organisms, this is a welcome tool.

      Strengths:
      The method uses a single video stream (thermal video) to allow for the distinction between urine and feces. It is automated.

      Weaknesses:
      The accuracy level shown is lower than may be practically useful for many studies. The accuracy for urine is 80%. This is understandable given the variability of urine in its deposition, but makes it challenging to know if the data is accurate. If the same kinds of mistakes are maintained across many conditions it may be reasonable to use the software (i.e., if everyone is under/over counted to the same extent). Differences in deposition on the scale of 20% would be challenging to be confident in with the current method, though differences of this magnitude may be of biological interest. Understanding how well the data maintain the same relative ranking of individuals across various timing and spatial deposition metrics may help provide further evidence for the utility of the method.

    3. Reviewer #2 (Public Review):

      Summary:
      The authors built a tool to extract the timing and location of mouse urine and fecal deposits in their laboratory set up. They indicate that they are happy with the results they achieved in this effort.

      The authors note urine is thought to be an important piece of an animal's behavioral repertoire and communication toolkit so methods that make studying these dynamics easier would be impactful.

      Strengths:
      With the proposed method, the authors are able to detect 79% of the urine that is present and 84% of the feces that is present in a mostly automated way.

      Weaknesses:
      The method proposed has a large number of design choices across two detection steps that aren't investigated. That is, do other design choices make the performance better, worse, or the same? Are these choices robust across a range of laboratory environments? How much better are the demonstrated results compared to a simple object detection pipeline (i.e. FasterRCNN or YOLO on the raw heat images)?

      The method is implemented with a mix of MATLAB and Python.

      One proposed reason why this method is better than a human annotator is that it "is not biased." While they may mean it isn't influenced by what the researcher wants to see, the model they present is still statistically biased since each object class has a different recall score. This wasn't investigated. In general there was little discussion of the quality of the model. Precision scores were not reported. Is a recall value of 78.6% good for the types of studies they and others want to carry out? What are the implications of using the resulting data in a study? How do these results compare to the data that would be generated by a "biased human?"

      5 out of the 6 figures in the paper relate not to the method but to results from a study whose data was generated from the method. This makes a paper, which, based on the title, is about the method, much longer and more complicated than if it focused on the method. Also, even in the context of the experiments, there is no discussion of the implications of analyzing data that was generated from a method with precision and recall values of only 70-80%. Surely this noise has an effect on how to correctly calculate p-values etc. Instead, the authors seem to proceed like the generated data is simply correct.

    4. Reviewer #3 (Public Review):

      Summary:
      The authors introduce a tool that employs thermal cameras to automatically detect urine and feces deposits in rodents. The detection process involves a heuristic to identify potential thermal regions of interest, followed by a transformer network-based classifier to differentiate between urine, feces, and background noise. The tool's effectiveness is demonstrated through experiments analyzing social preference, stress response, and temporal dynamics of deposits, revealing differences between male and female mice.

      Strengths:
      The method effectively automates the identification of deposits.
      The application of the tool in various behavioral tests demonstrates its robustness and versatility.
      The results highlight notable differences in behavior between male and female mice.

      Weaknesses:
      The definition of 'start' and 'end' periods for statistical analysis is arbitrary. A robustness check with varying time windows would strengthen the conclusions.
      The paper could better address the generalizability of the tool to different experimental setups, environments, and potentially other species.
      The results are based on tests of individual animals, and there is no discussion of how this method could be generalized to experiments tracking multiple animals simultaneously in the same arena (e.g., pair or collective behavior tests, where multiple animals may deposit urine or feces).

    5. Author response:

      We want to thank the reviewers for their constructive feedback.

      General

      The recall values of our method range from 78.6% for all urine cases to 83.3% for feces (and not between 70-80%, as stated by reviewer #2), with a mean precision of 85.6%. This is rather similar to other machine learning-based methods commonly used for the analysis of complicated behavioral readouts. For example, in the paper presenting DeepSqueak for analysis of mouse ultrasonic vocalizations (Coffey et al. DeepSqueak: a deep learning-based system for detection and analysis of ultrasonic vocalizations. Neuropsychopharmacol. 44, 859–868 (2019). https://doi.org/10.1038/s41386-018-0303-6), the recall values reported for DeepSqueak, Mupet and Ultravox (Fig. 2c, f) are very similar to those of our method.

      We have analyzed and reported all the types of errors made by our method, which are mostly technical. For example, a deposition that remains overlapped by the mouse blob until it has cooled down will be associated with the mouse and therefore will not be detected (a “miss” event). These technical errors are not expected to create a bias for a specific biological condition and, hence, shouldn’t interfere with the use of our method. A video showing all of the mistakes made by our algorithm on the test set was submitted (Figure 2-video 1).

      Below we address specific points and describe our plan to revise the manuscript accordingly.

      Detection accuracy

      a. It should be noted that when large urine spots are considered, our algorithm achieved 100% correct classification (Figure 2, supplement 1, panel b). However, small urine deposits are very similar to feces in their appearance in the thermal picture. In fact, if the feces are not shifted, discrimination can be quite challenging even for human annotators. To demonstrate the accuracy of the proposed method relative to human annotators, we plan to compare its results with the accuracy of a second human annotator.

      b. As part of the revision, we plan to test general machine learning-based object detectors such as faster-RCNN or YOLO (as suggested by Reviewer 2) and compare them with our method.

      c. To check if our method may introduce bias to the results, we plan to check if the errors are distributed evenly across time, space, and genders.

      Design choices

      (A) The preliminary detection algorithm has several significant parameters. These are:

      a. Minimal temperature rise for detection: 1.1°C rise during 5 sec.

      b. Size limits of the detection: 2 - 900 pixels.

      c. Minimal cooldown during 40 sec: 1.1°C and at least half the rise.

      d. Minimal time between detections in the same location: 30 sec.

      We chose to use low thresholds for the preliminary detection to allow detection of very small urinations and to minimize the number of “miss” events, relying on the classifier to robustly reject false alarms. Indeed, we achieved a low rate of miss events: 5 miss events for the entire test set (1 miss event per ~90 minutes of video). We attribute these 5 “miss” events to partial occlusion of the detection by the mouse.
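
      The thresholds listed above can be written down as a small configuration object, which makes it easier to see what would need recalibrating in a new setup. The following is only a minimal sketch: the class, field, and function names are hypothetical, and the check assumes that the temperature rise, blob size, and cooldown have already been measured for a candidate detection. It is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PreliminaryDetectionParams:
    """Thresholds from the list above; all field names are illustrative."""
    min_rise_c: float = 1.1             # minimal temperature rise (deg C) ...
    rise_window_s: float = 5.0          # ... measured over this window
    min_size_px: int = 2                # smallest accepted detection size
    max_size_px: int = 900              # largest accepted detection size
    min_cooldown_c: float = 1.1         # minimal cooldown (deg C) ...
    cooldown_window_s: float = 40.0     # ... measured over this window
    min_cooldown_fraction: float = 0.5  # cooldown must be at least half the rise
    refractory_s: float = 30.0          # minimal time between detections at one location


def passes_preliminary_detection(
    rise_c: float,
    size_px: int,
    cooldown_c: float,
    seconds_since_last: Optional[float],
    params: PreliminaryDetectionParams = PreliminaryDetectionParams(),
) -> bool:
    """Return True if a candidate satisfies every threshold.

    rise_c and cooldown_c are assumed to be measured over params.rise_window_s
    and params.cooldown_window_s; seconds_since_last is None when there was no
    earlier detection at the same location.
    """
    if rise_c < params.min_rise_c:
        return False
    if not (params.min_size_px <= size_px <= params.max_size_px):
        return False
    if cooldown_c < params.min_cooldown_c:
        return False
    if cooldown_c < params.min_cooldown_fraction * rise_c:
        return False
    if seconds_since_last is not None and seconds_since_last < params.refractory_s:
        return False
    return True
```

      In such a sketch, adapting to a different video resolution or floor material would amount to constructing `PreliminaryDetectionParams` with different size limits or cooldown values.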

      To adjust the preliminary detection parameters to a new environment, one will need to calibrate these parameters in their own setup. Mainly, the size of the detection depends on the resolution of the video, and the cooldown rate might be affected by the material of the floor, as well as the room temperature.

      We plan to explore the robustness of these parameters in our setup and report the influence on the accuracy of the preliminary algorithm.

      (B) We chose to feed the classifier with 71 seconds of videos (11 seconds before the event and 60 seconds after it) as we wanted the classifier to be able to capture the moment of the deposition, the cooldown process, as well as urine smearing or feces shifting which might give an additional clue for the classification. In the revised paper we plan to report accuracy when using a shorter video for classification.
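
      As an illustration of the 71-second window, the frames passed to the classifier can be derived from the frame index of the detected event. This is a sketch only: the function name is hypothetical and the frame rate is an assumption, since it is not stated here.

```python
def clip_frame_range(event_frame, fps, pre_s=11.0, post_s=60.0, n_frames=None):
    """Return (start, end) frame indices, inclusive, around a detected event.

    pre_s and post_s default to the 11 s before and 60 s after the event
    described above. fps is the (assumed) video frame rate. The range is
    clamped to the start of the video and, if n_frames is given, to its end.
    """
    start = max(0, event_frame - round(pre_s * fps))
    end = event_frame + round(post_s * fps)
    if n_frames is not None:
        end = min(end, n_frames - 1)
    return start, end
```

      Testing a shorter classification clip, as planned for the revision, would correspond to calling the function with smaller `pre_s` or `post_s` values.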

      Generalizability

      a. In the revised version, we plan to report the accuracy of the method used on a different strain of mice (C57), with a different arena color (white arena instead of black).

      Statistics

      a. In the revised paper, we will explain why we chose each time window for analysis. Also, we will report statistics for different time windows, as suggested by Reviewer 3.

      b. Unlike reviewer #2, we don’t think that the small difference in recall rate between urine and feces (78.6% vs. 83.3%, respectively) creates a bias between them. Moreover, we don’t compare the urine rate to the feces rate.

      c. In the revised manuscript we will explicitly report the precision scores, although they also appear in our manuscript in Fig. 2- Supplement 1b.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer 1:

      • Although ROC AUC is a widely used metric, other metrics such as precision, recall, sensitivity, and specificity are not reported in this work. The last two metrics would help readers understand the model’s potential implications in the context of clinical research.

      In response to this comment and related ones by Reviewer 2, we have overhauled how we evaluate our models. In the revised version, we have removed Micro ROC-AUC, as this evaluation metric is hard to interpret in the recommender system setting. Instead, the updated version fully focuses on two metrics: ROC-AUC and Precision at 1 of the negative class, both computed per spectrum and then averaged (equivalent to the instance-wise metrics in the previous version of the manuscript). We believe these metrics best reflect the use-case of AMR recommenders. In addition, we have kept (drug-)macro ROC-AUC as a complementary evaluation metric. As the ROC-AUC can be decomposed into sensitivity and specificity (at different prediction probability thresholds), we have added a ROC curve where sensitivity and specificity are indicated in Figure 8 (Appendices).
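
      To make the two per-spectrum metrics concrete, one possible reading is sketched below. It assumes that each spectrum comes with binary labels per tested drug (1 = resistant, 0 = susceptible) and predicted resistance scores, that "Precision at 1 of the negative class" asks whether the drug with the lowest predicted resistance is truly susceptible, and that spectra with only one label class are skipped for ROC-AUC. The function names are hypothetical and the manuscript's exact definitions may differ.

```python
def roc_auc(labels, scores):
    """ROC-AUC for one spectrum: fraction of (resistant, susceptible) drug
    pairs in which the resistant drug gets the higher score (ties count 0.5).
    Returns None when only one class is present."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_at_1_negative(labels, scores):
    """1.0 if the drug with the lowest predicted resistance is susceptible."""
    best = min(range(len(scores)), key=scores.__getitem__)
    return 1.0 if labels[best] == 0 else 0.0


def spectrum_macro(per_spectrum, metric):
    """Compute the metric per spectrum, then average over spectra."""
    values = [metric(labels, scores) for labels, scores in per_spectrum]
    values = [v for v in values if v is not None]
    return sum(values) / len(values)
```

      Averaging after computing the metric per spectrum gives every isolate equal weight, regardless of how many drugs it was tested against.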

      • The authors did not hypothesize or describe in any way what an acceptable performance of their recommender system should be in order to be adopted by clinicians.

      In Section 4.3, we have extended our experiments to include a baseline that represents a “simulated expert”. In short, given a species, an expert can already make some best guesses as to what drugs will be effective or not. To simulate this, we count resistance frequencies per species and per drug in the training set, and use this as predictions of a “simulated expert”.

      We now mention in our manuscript that any performance above this level results in a real-world information gain for clinical diagnostic labs.
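
      The "simulated expert" baseline described above can be sketched in a few lines. This is a minimal illustration, assuming training records of the form (species, drug, resistant); the function names and the fallback value for unseen species-drug pairs are assumptions, not details taken from the manuscript.

```python
from collections import defaultdict


def fit_simulated_expert(train_records):
    """Estimate the resistance frequency for every (species, drug) pair.

    train_records: iterable of (species, drug, resistant) with resistant in {0, 1}.
    """
    counts = defaultdict(lambda: [0, 0])  # (species, drug) -> [n_resistant, n_total]
    for species, drug, resistant in train_records:
        entry = counts[(species, drug)]
        entry[0] += int(resistant)
        entry[1] += 1
    return {pair: n_res / n_total for pair, (n_res, n_total) in counts.items()}


def predict_simulated_expert(freqs, species, drug, default=0.5):
    """Predict resistance as the training-set frequency for the pair.

    default is a hypothetical fallback for pairs absent from the training set.
    """
    return freqs.get((species, drug), default)
```

      Because this baseline uses only the species identity, any model that scores above it is extracting information from the spectrum beyond species identification.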

      • Related to the previous comment, this work would strongly benefit from the inclusion of 1-2 real-life applications of their method that could showcase the benefits of their strategy for designing antibiotic treatment in a clinical setting.

      While we think this would be valuable to try out, we are an in silico research lab, and the study we propose is an initial proof-of-concept focusing on the methodology. Because of this, we feel a real-life application of the model is out-of-scope for the present study.

      • The authors do not offer information about the model features associated with resistance. This information may offer insights about mechanisms of antimicrobial resistance and how conserved they are across species.

      In general, MALDI-TOF mass spectra are somewhat hard to interpret. Because of a limited body of work analyzing resistance mechanisms with MALDI-TOF MS, it is hard to link peaks back to specific pathways. For this reason, we have chosen to forego such an analysis. After all, as far as we know, typical MALDI-TOF MS manufacturers’ software for bacterial identification also does not provide interpretability results or insights into peaks, but merely gives an identification and confidence score.

      However, we do feel that the whole topic revolving around “the degree of biological insight a data modality might give versus actual performance and usability” merits further discussion. We have ultimately decided not to include a segment in our discussion section as it is hard to discuss this matter concisely.

      • Comparison of AUC values across models lacks information regarding statistical significance. Without this information it is hard for a reader to figure out which differences are marginal and which ones are meaningful (for example, it is unclear if a difference in average AUC of 0.02 is significant). This applied to Figure 2, Figure 3, and Table 2 (and the associated supplementary figures).

      To make trends a bit more clear and easier to discern, in our revised manuscript, all models are run for 5 replicates (as opposed to 3 in the previous version).

      There is an ongoing debate in the ML community about whether statistical tests are useful for comparing machine learning models. A simple argument against them is that model runs are typically not independent of each other, as they are all trained on the same data. The assumptions of traditional statistical tests (t-test, Wilcoxon test, etc.) are therefore violated. With such tests, statistical significance of even the smallest differences can be achieved simply by increasing the number of replicates (i.e. training the same models more times).

      More complicated but more appropriate statistical tests also exist, such as the 5x2 cross-validated t-test of Dietterich (“Approximate statistical tests for comparing supervised classification learning algorithms”, Neural Computation, 1998). However, these tests are typically not considered in deep learning, because only 10% of the data can be used for training, which is practically not desirable. The Friedman test of Demšar (“On the appropriateness of statistical tests in machine learning”, Workshop on Evaluation Methods for Machine Learning in conjunction with ICML, 2008), in combination with post hoc pairwise tests, is still frequently used in machine learning, but that test is only applicable in studies where many datasets are tested.

      For those reasons, most deep learning papers that only analyse a few datasets typically do not consider any statistical tests. For the same reasons, we are also not convinced of the added value of statistical tests in our study.

      • One key claim of this work was that their single recommender system outperformed specialist (single species-antibiotic) models. However, in its current status, it is not possible to determine that in fact that is the case (see comment above). Moreover, comparisons to species-level models (that combine all data and antibiotic susceptibility profiles for a given species) would help to illustrate the putative advantages of the dual branch neural network model over species-based models. This analysis will also inform the species (and perhaps datasets) for which specialist models would be useful to consider.

      We thank the reviewer for this excellent suggestion. In our new manuscript, we have dedicated an entire section of experiments to testing such species-specific recommender models (Section 4.2). We find that species-specific recommender systems generally outperform the models trained globally across all species. As a result, our manuscript has been substantially reworked.

      • Taking into account that the clustering of spectra embeddings seemed to be species-driven (Figure 4), one may hypothesize that there is limited transfer of information between species, and therefore the neural network model may be working as an ensemble of species models. Thus, this work would deeply benefit from a comparison between the authors' general model and an ensemble model in which the species is first identified and then the relevant species recommender is applied. If authors had identified cases to illustrate how data from one species positively influence the results for another species, they should include some of those examples.

      See the answer to the remark above.

      • The authors should check that all abbreviations are properly introduced in the text so readers understand exactly what they mean. For example, the Prec@1 metric is a little confusing.

      See the answer to a remark above for how we have overhauled our evaluation metrics in the revised version. In addition, in the revised version, we have bundled our explanations on evaluation metrics together in Section 3.2. We feel that having these explanations in a separate section will improve overall comprehensibility of the manuscript.

      • The authors should include information about statistical significance in figures and tables that compare performance across models.

      See answer above.

      • An extra panel showing species labels would help readers understand Figure 11.

      We have tried to play around with including species labels in these plots, but could not make it work without overcrowding the figure. Instead, we have added a reminder in the caption that readers should refer back to an earlier figure for species labels.

      • The authors initially stated that molecular structure information is not informative. However, in a second analysis, the authors stated that molecular structures are useful for less common drugs. Please explain in more detail with specific examples what you mean.

      In the previous version of our manuscript, we found that one-hot embedding-based models were superior to structure-based drug embedders for general performance. The latter, however, delivered better transfer learning performance.

      In our new experiments, however, we perform early stopping on “spectrum-macro” ROC-AUC (as opposed to micro ROC-AUC in the previous version). As a consequence, our results are different. In the new version of our manuscript, Morgan Fingerprints-based drug embedders generally outperform others both “in general” and for transfer learning. Hence, our previously conflicting statements are not applicable to our new results.
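
      The difference between micro and “spectrum-macro” ROC-AUC can be illustrated with a minimal sketch. It assumes that “spectrum-macro” means computing ROC-AUC separately for each spectrum (over that spectrum's drug labels) and averaging across spectra, whereas micro ROC-AUC pools all spectrum-drug pairs. The function names and toy data below are hypothetical and are not taken from the manuscript.

```python
from itertools import product


def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) ROC-AUC; returns None if only one class is present."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


def micro_auc(spectra):
    """Pool every spectrum-drug pair and compute a single ROC-AUC."""
    labels = [y for ys, _ in spectra for y in ys]
    scores = [s for _, ss in spectra for s in ss]
    return roc_auc(labels, scores)


def spectrum_macro_auc(spectra):
    """Compute ROC-AUC per spectrum, then average (assumed definition)."""
    aucs = [roc_auc(ys, ss) for ys, ss in spectra]
    aucs = [a for a in aucs if a is not None]  # skip single-class spectra
    return sum(aucs) / len(aucs)


# Toy data: (resistance labels, predicted resistance scores) per spectrum.
toy = [
    ([1, 0, 0], [0.9, 0.2, 0.1]),  # drugs ranked perfectly for this spectrum
    ([1, 0], [0.3, 0.4]),          # drugs ranked wrongly for this spectrum
]

print("micro:", micro_auc(toy))
print("spectrum-macro:", spectrum_macro_auc(toy))
```

      On this toy input the two metrics disagree noticeably, which suggests why switching the early-stopping criterion could plausibly change which drug embedder comes out ahead.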

      • The authors may want to consider adding a few sentences that summarize the 'Related work' section into the introduction, and converting the 'Related work' section into an appendix.

      While we acknowledge that such a section is uncommon in biology, in machine learning research, a “related work” section is very common. As this research lies on the intersection of the two, we have decided to keep the section as such.

      Reviewer 2:

      • Are the specialist models re-trained on the whole set of spectra? It was shown by Weis et al. that pooling spectra from different species hinders performance. It would then be better to compare directly to the models developed by Weis et al, using their splitting logic since it could be that the decay in performance from specialists comes from the pooling. See the section "Species-stratified learning yields superior predictions" in https://doi.org/10.1038/s41591-021-01619-9.

      We train our “specialist” models (now called “species-drug classifiers”) just as described in Weis et al.: all labels for a drug are taken and then subsetted for a single species. We have clarified this in our new manuscript. The text now reads:

      “Previous studies have studied AMR prediction in specific species-drug combinations. For this reason, it is useful to compare how the dual-branch setup weighs up against training separate models for separate species and drugs. In Weis et al. (2020b), for example, binary AMR classifiers are trained for the following three combinations: (1) E. coli with Ceftriaxone, (2) K. pneumoniae with Ceftriaxone, and (3) S. aureus with Oxacillin. Here, such "species-drug-specific classifiers" are trained for the 200 most-common combinations of species and drugs in the training dataset.”

      • Going back to Weis et al. a high variance in performance between species/drug pairs was observed. The metrics in Table 2 do not offer any measurement of variance or statistical testing. Indeed, some values are quite close e.g. Macro AUROC of Specialist MLP-XL vs One-hot M.

      See our answer to a remark of Reviewer 1 for our viewpoint on statistical significance testing in machine learning.

      • Since this is a recommendation task, why were no recommendation system metrics used, e.g. mAP@K, mRR, and so (apart from precision@1 for the negative class)? Additionally, since there is a high label imbalance in this task (~80% negatives) a simple model would achieve a very high precision@1.

      See the answer to a remark above for how we have overhauled our evaluation metrics in the revised version. In addition, in choosing our metrics, we wanted metrics that are both (1) appropriate (i.e. recommender system metrics), and (2) easy to interpret for clinicians. For this reason, we have not included metrics such as mAP@K or mRR. We feel that “spectrum-macro” ROC-AUC and precision@1 form a sufficiently broad set of evaluation metrics while remaining easy to interpret.
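
      The reviewer's point about label imbalance can be made concrete with a small sketch. It assumes precision@1 for the negative class means: for each spectrum, take the drug the model ranks as most likely susceptible and check whether the isolate is in fact susceptible to it. The synthetic data and function name are hypothetical; the roughly 80% negative rate follows the figure quoted by the reviewer.

```python
def precision_at_1_negative(spectra):
    """Fraction of spectra whose top-ranked (lowest resistance score) drug is susceptible."""
    hits = 0
    for labels, scores in spectra:
        top = min(range(len(scores)), key=scores.__getitem__)
        hits += labels[top] == 0  # label 0 = susceptible (negative class)
    return hits / len(spectra)


# Synthetic data: 100 spectra, 5 drugs each, exactly one resistant label per
# spectrum (80% negatives). The resistant drug rotates across spectra.
n_drugs = 5
spectra = []
for i in range(100):
    labels = [1 if d == i % n_drugs else 0 for d in range(n_drugs)]
    # A trivial "model" that ignores the spectrum and always ranks drug 0
    # as most likely susceptible.
    scores = [0.1, 0.2, 0.3, 0.4, 0.5]
    spectra.append((labels, scores))

baseline = precision_at_1_negative(spectra)
print(f"precision@1 (negative class) of a constant ranking: {baseline:.2f}")
```

      Under these assumptions, a ranking that never looks at the spectrum already scores close to the negative-class prevalence, so reported precision@1 values are best read against such a baseline.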

      • A highly similar approach was recently published (https://doi.org/10.1093/bioinformatics/btad717). Since it is quite close to the publication date of this paper, it could be discussed as concurrent work.

      We thank the reviewer for bringing our attention to this study. We have added a paragraph in our revised version discussing this paper as concurrent work.

      • It is difficult to observe a general trend from Figure 2. A statistical test would be advised here.

      See our answer to a remark of Reviewer 1 for our viewpoint on statistical significance testing in machine learning.

      • Figure 5. UMAPs generally don't lead to robust quantitative conclusions. However, the analysis of the embedding space is indeed interesting. Here I would recommend some quantitative measures directly using embedding distances to accompany the UMAP visualizations. E.g. clustering coefficients, distribution of pairwise distances, etc.

      In accordance with this recommendation, we have computed many statistics on the MALDI-TOF spectra embedding spaces. However, we could not come up with any statistic that was more illuminating than the visualization itself. For this reason, we have kept this section as is, and let the figure speak for itself.

      • Weis et al. also perform a transfer learning analysis. How does the transfer learning capacity of the proposed models differ from those in Weis et al?

      Weis et al. perform experiments towards “transferability”, not actual transfer learning. In essence, they use a model trained on data from one diagnostic lab towards prediction on data from another. However, they do not conduct experiments to learn how much data such a pre-trained classifier needs to fine-tune it for adequate performance on the new diagnostic lab, as we do. The end of Section 4.4 discusses how our proposed models specifically shine in transfer learning. The paragraph reads:

      “Lowering the amount of data required is paramount to expedite the uptake of AMR models in clinical diagnostics. The transfer learning qualities of dual-branch models may be ascribed to multiple properties. First of all, since different hospitals use much of the same drugs, transferred drug embedders allow for expressively representing drugs out of the box. Secondly, owing to multi-task learning, even with a limited number of spectra, a considerable fine-tuning dataset may be obtained, as all available data is "thrown on one pile".”

    2. eLife assessment

      This valuable study presents a machine learning model to recommend effective antimicrobial drugs from patients' samples analysed with mass spectrometry. The evidence supporting the claims of the authors is convincing, although including a measure of statistical significance to compare different proposed models would further strengthen the support. This work will be of interest to computational biologists, microbiologists, and clinicians.

    3. Reviewer #1 (Public Review):

      Summary:

      De Waele et al. reported a dual-branch neural network model for predicting antibiotic resistance profiles using matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry data. Neural networks were trained on the recently available DRIAMS database of MALDI-TOF mass spectrometry data and their associated antibiotic susceptibility profiles. The authors used a dual-branch neural network to simultaneously represent information about mass spectra and antibiotics for a wide range of species and antibiotic combinations. The authors showed consistent performance of their strategy to predict antibiotic susceptibility for different spectrum and antibiotic representations (i.e., embedders). Remarkably, the authors showed how small datasets collected at one location can improve the performance of a model trained with limited data collected at a second location. The authors also showed that species-specific models (trained on multiple antibiotic resistance profiles) outperformed both the single recommender model and the individual species-antibiotic combination models. Despite the promising results, the authors should explain in more detail some of the analyses reported in the manuscript (see weaknesses).

      Strengths:

      • A single AMR recommender system could potentially facilitate the adoption of MALDI-TOF based antibiotic susceptibility profiling into clinical practices by reducing the number of models to be considered, and the efforts that may be required to periodically update them.

      • Authors tested multiple combinations of embedders for the mass spectra and antibiotics while using different metrics to evaluate the performance of the resulting models. Models trained using different spectrum embedder-antibiotic embedder combinations had remarkably good performance for all tested metrics. The average ROC AUC scores for global and species-specific evaluations were above 0.8.

      • Authors developed species-specific recommenders as an intermediate layer between the single recommender system and single species-antibiotic models. This intermediate approach achieved maximum performance (with one type of the species-specific recommender achieving a 0.9 ROC AUC), outlining the potential of this type of recommenders for frequent pathogens.

      • Authors showed that data collected in one location can be leveraged to improve the performance of models generated using a smaller number of samples collected at a different location. This result may encourage researchers to optimize data integration to reduce the burden of data generation for institutions interested in testing this method.

      Weaknesses:

      • Section 4.3 ("expert baseline model"): the authors need to explain how the probabilities defined as baselines were exactly used to predict individual patient susceptible profiles.

      • Authors do not offer information about the model features associated with resistance. Although I understand the difficulty of mapping mass spectra to specific pathways or metabolites, mechanistic insights are much more important in the context of AMR than in the context of bacterial identification. For example, this information may offer additional antimicrobial targets. Thus, authors should at least identify mass spectra peaks highly associated with resistance profiles. Are those peaks consistent across species? This would be a key step towards a proteomic survey of mechanisms of AMR. See previous work on this topic: PMIDs: 35586072 and 23297261.

    4. Reviewer #2 (Public Review):

      The authors frame the MS-spectrum-based prediction of antimicrobial resistance as a drug recommendation task. Weis et al. introduced the dataset this model is tested on, along with benchmark models that take as input a single species and are trained to predict resistance to a single drug. Here, instead, a drug-spectrum pair is fed to two neural network models to predict a resistance probability. In this manner, knowledge from different drugs and species can be shared through the model parameters. Questions asked: 1. What is the best way to encode the drugs? 2. Does the dual NN outperform the single spectrum-drug models?

      Overall the paper is well-written and structured. It presents a novel framework for a relevant problem.

    1. eLife assessment

      The study offers a compelling molecular model for the organization of rootlets, a critical organelle that links cilia to the basal body, ensuring proper anchoring. While previous research has explored rootlet structure and organization, this study delivers an unprecedented level of resolution, valuable to the centrosome and cilia field. This research marks a significant step forward in our understanding of rootlets' molecular organization.

    2. Joint Public Review:

      The study offers a compelling molecular model for the organization of rootlets, a critical organelle that links cilia to the basal body. Striations have been observed in rootlets, but their assembly, composition, and function remain unknown. While previous research has explored rootlet structure and organization, this study delivers an unprecedented level of resolution, valuable to the centrosome and cilia field. The authors isolated rootlets from mouse eyes. They applied EM to partially purified rootlets (first negative stain, then cryoET). In these micrographs, they observed striations along the membranes running along the rootlet, but no regular spacing.

      The thickness of the sample and membranes prevented good contrast in the tomograms. Thus they further purified the rootlets using detergent, which allowed them to obtain cryoET micrographs of the rootlets in greater detail. The tomograms were segmented and further processed to improve the features of the rootlet structures. They proposed that a number of proteins, including rootletin, form parallel coiled coils that run along the rootlet longitudinally. They described how the cross-striations form three types of periodic structures (D1, D2, and A bands) connected perpendicularly to filaments along the length of the rootlets and to membranes. Overall their data provide a detailed model for the molecular organization of the rootlet.

      The major strength is that this high-quality study uses state-of-the-art cryo-electron tomography, sub-tomogram averaging, and image analysis to provide a model of the molecular organization of rootlets. The micrographs are exceptional, with excellent contrast and details, which also implies the sample preparation was well optimized to provide excellent samples for cryo-ET. The manuscript is also clear and accessible.

      This research marks a significant step forward in our understanding of rootlets' molecular organization.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Recommendations for the authors):

      In the revision the authors addressed all the points from this reviewer and most from other reviewers. The method is now described practically and in detail. The only thing this reviewer still misses is the number of subtomograms for each structure. How many subtomograms did the authors extract by Dynamo from how many rootlets? How many of them were valid in K-means classification and used for sub-averages? Was the subaverage used for training by TomoSeg, or each subtomogram belonging to the class? By clarifying this, the work will be referred to by those who would take the same approach for other biological structures.

      We now added the particle numbers of all structures to the corresponding text, figure legends and methods and elaborate on this below. We also clarify how we trained the TomoSeg network.

      Particle numbers:

      We extracted 591,453 subtomograms from 14 tomograms. This initial set was rigorously cleaned with Zcleaning, reducing it to 358,863 particles. Further cross-correlation and cluster cleaning yielded a final set of 180,252 particles. 
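
      As a quick check on these numbers, the retention at each cleaning step can be tabulated with a short sketch. The particle counts are those stated above; the stage labels and the helper name are only illustrative.

```python
# Particle counts as reported in the response; stage labels are illustrative.
stages = [
    ("extracted subtomograms", 591_453),
    ("after initial cleaning", 358_863),
    ("after cross-correlation and cluster cleaning", 180_252),
]


def retention(stages):
    """Return (name, count, fraction kept vs. previous stage, fraction kept vs. first stage)."""
    total = stages[0][1]
    prev = total
    rows = []
    for name, n in stages:
        rows.append((name, n, n / prev, n / total))
        prev = n
    return rows


rows = retention(stages)
for name, n, step, overall in rows:
    print(f"{name:<46} {n:>8,d}  step {step:6.1%}  overall {overall:6.1%}")
```

      Roughly 61% of the picked particles survive the first cleaning, about half of those survive the second, and about 30% of the original picks remain in the final set.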

      This refined set was used for the structures presented in Figures 3E, F and S5A, B, as well as for the classification shown in Figure S5C. Of the classified particles, 34,490 particles contributed to class average 5 in Figure 3G and S5D, E. The detailed particle distribution of this classification is added as a supplementary table.

      We further clarified the numbers in the results, method, and supplementary material section:

      Results:

      Page 7: “Figure 3. … (E) The initial average after alignment of 180,252 particles with a wide spherical alignment mask. (F) The initial average of particles aligned with a narrower cylindrical mask. (G) A class average of 34,490 particles, aligned and classified with a narrow mask.”

      Page 7/8: “We manually defined the D1-bands as surfaces in Dynamo (Castaño-Díez et al, 2017) and then approximated the number of filaments per surface area. We extracted 591,453 subtomograms from 14 tomograms, approximately four times as many subtomograms as the expected number of filaments. This initial set was rigorously cleaned to discard particles that did not have a filament in their center or had distorted striations, reducing it to 358,863 particles. Further cross-correlation and cluster cleaning yielded a final set of 180,252 particles.”

      Page 8: “We directly unbinned the data to a pixel size of 5.55 Å/pixel and used the rigorously cleaned set of 180,252 particles.”

      Page 8: “The resulting class averages contained a twist along the filament length in classes 2, 3 and 4 and most prominently in class 5. These four classes contain 72.29% of the particles, highlighting the prevalence of the twist-feature (Fig S5C, Table S2). Class 5 contained 19.27% of the data, i.e. 34,490 particles, and revealed the twist is formed by a filament of 2 nm thick by 5 nm wide with a helical groove along its length (Fig 3G).”

      Methods: 

      Page 13: “Surface triangulation was set to result in 591,453 extraction coordinates approximately 4 times the number of expected filaments.”

      Page 13: “Particles with no filament in their center, or particles that originated from regions in the rootlet with distorted striations (at the edge of a grid hole) were discarded, resulting in a particle set of 358,863 particles. Cluster- and careful per-tomogram cross-correlation cleaning were applied to remove particle duplicates, remaining particles with no filaments, and particles with disordered D-bands. This resulted in a final cleaned particle dataset of 180,252 particles.”

      Page 13: “For the final subtomogram class-average that contained the twist, the cleaned particle dataset motl with 180,252 particles was converted to a STAR file compatible with RELION 4.0 Alpha (Zivanov et al, 2022).”

      Supplementary material: 

      Page 17: “Table S1. Particle distribution of RELION 4.0 Alpha classification with alignment.”

      Page 22: “Figure S5: (C) Class averages of a classification with alignment of particles from Fig S5A. Their particle distribution is shown in Table S2.”

      For the initial classification, to identify a homogeneous subset, we used the original set of 591,453 picked particles (Fig S5A). The class distribution for this set is added as a supplementary table.

      We further clarified this in the results, methods and supplementary material:

      Results:

      Page 8: “To ask if there were any recurring arrangements of neighboring filaments in the data that could allow us to average a homogeneous subset, we resorted to classification of the original set of 591,453 particles (Fig S5A, Table S1).”

      Methods:

      Page 13: “Prior to classification in subTOM, alignments with limited X/Y/Z shifts and increasingly finer in-plane rotations were performed on the original dataset with 591,453 particles.”

      Supplementary material:

      Page 17: “Table S2. Particle distribution of subTOM classification for particle heterogeneity.”

      Page 22: “Figure S5: … The surfaces of a cross-section through the filament classes are shown in orange. The particle distribution is provided in Table S1. (B) …”

      TomoSeg network training

      The subtomograms and the class averages presented at the end of the manuscript were not used as input for training the TomoSeg network. TomoSeg training requires positive and negative sets of segmented 2D regions of interest within tomogram slices. These areas were selected and segmented within the EMAN2 TomoSeg GUI, iteratively increasing the size of the training sets until satisfactory performance was achieved.

      We have clarified the TomoSeg training process in the methods section to avoid confusion:

      Methods: 

      Page 13: “The tomograms were then preprocessed in EMAN2.2 for training of the TomoSeg CNN (Chen et al, 2017). Here, the features (filaments, D-bands, A-bands, gold fiducials, actin, membranes, membrane-associated densities and ice contaminations) were individually trained for each tomogram. This involved manually tracing a training set of 10-20 positive and 100-150 negative boxed areas per feature. We iteratively expanded and curated the training set until the segmentations were accurate, as recommended in the software manuals. Segmented maps were allowed to compete for the assignment of pixels in the tomograms, cleaned up in Amira (Thermo Fisher Scientific) and converted to object files.”

    1. eLife assessment

      Ctnnb1 encodes β-catenin, an essential component of the canonical Wnt signaling pathway. In this important study, the authors identify an upstream enhancer of Ctnnb1 responsible for the specific expression level of β-catenin in the gastrointestinal tract. Deletion of this enhancer in mice and analyses of its association with human colorectal tumors provide compelling support that it controls the dosage of Wnt signaling critical to the homeostasis in intestinal epithelia and colorectal cancers.

    2. Reviewer #1 (Public review):

      Summary:

      Ctnnb1 encodes β-catenin, an essential component of the canonical Wnt signaling pathway. In this study, the authors identify an upstream enhancer of Ctnnb1 responsible for the specific expression level of β-catenin in the gastrointestinal tract. Deletion of this enhancer in mice and analyses of its association with human colorectal tumors support that it controls the dosage of Wnt signaling critical to the homeostasis in intestinal epithelia and colorectal cancers.

      Strengths:

      This study has provided convincing evidence to demonstrate the functions of a gastrointestinal enhancer of Ctnnb1 using combined approaches of bioinformatics, genomics, in vitro cell culture models, mouse genetics, and human genetics. The results support the idea that the dosage of Wnt/β-catenin signaling plays an important role in pathophysiological functions of intestinal epithelia. The experimental designs are solid and the data presented are of high quality. This study significantly contributes to the research fields of Wnt signaling, tissue-specific enhancers, and intestinal homeostasis.

      Weaknesses:

      Insufficient discussion on some findings was a major weakness in the previous submission, which has been addressed in the revised submission.

    3. Reviewer #2 (Public review):

      Wnt signaling is the name given to a cell-communication mechanism that cells employ to inform on each other's position and identity during development. In cells that receive the Wnt signal from the extracellular environment, intracellular changes are triggered that cause the stabilization and nuclear translocation of β-catenin, a protein that can turn on groups of genes referred to as Wnt targets. Typically these are genes involved in cell proliferation. Genetic mutations that affect Wnt signaling components can therefore affect tissue expansion. Loss of function of APC is a drastic example: APC is part of the β-catenin destruction complex, and in its absence, β-catenin protein is not degraded and constitutively turns on proliferation genes, causing cancers in the colon and rectum. And here lies the importance of the finding: β-catenin has for long been considered to be regulated almost exclusively by tuning its protein turnover. In this article, a new aspect is revealed: Ctnnb1, the gene encoding for β-catenin, possesses tissue-specific regulation with transcriptional enhancers in its vicinity that drive its upregulation in intestinal stem cells. The observation that there is more active β-catenin in colorectal tumors not only because the broken APC cannot degrade it, but also because transcription of the Ctnnb1 gene occurs at higher rates, is novel and potentially game-changing. As genomic regulatory regions can be targeted, one could now envision that mutational approaches aimed at dampening Ctnnb1 transcription could be a viable additional strategy to treat Wnt-driven tumors.

    4. Reviewer #3 (Public review):

      The authors of this paper identify an enhancer upstream of the Ctnnb1 gene that selectively enhances expression in intestinal cells. This enhancer sequence drives expression of a reporter gene in the intestine, and knockout of this enhancer attenuates Ctnnb1 expression in the intestine while protecting mice from intestinal cancers. The human counterpart of this enhancer sequence is functional and involved in tumorigenesis. Overall, this is an excellent example of how to fully characterize a cell-specific enhancer. The strength of the study is the thorough nature of the analysis and the relevance of the data to development of intestinal tumors in both mice and humans. A minor weakness was that loss of this enhancer does not completely compromise expression of the Ctnnb1 gene in the intestine, suggesting that other elements are likely involved. The authors have now addressed this concern.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      (1) One issue that needs to be considered is the nomenclature of the enhancer. The authors have presented data to show this enhancer controls the expression of Ctnnb1 in the stomach, intestine, and colon tissues. However, the name proposed by the authors, ieCtnnb1 (intestinal enhancer of Ctnnb1), doesn't represent its functions. It might be more appropriate to call it a different name, such as gieCtnnb1 (gastrointestinal enhancer of Ctnnb1).

      We thank the reviewer for the insightful suggestion and agree that wholemount reporter assays indicated that ieCtnnb1 and ieCTNNB1 indeed display activity in the stomach. However, in the current study, we focused on the cellular distribution and the function in intestinal epithelia. After careful consideration, we reasoned that the current designation, ieCtnnb1, more appropriately represents its expression pattern and functions based on the evidence provided. We hope the reviewer will understand our reasoning.

      (2) The writing of this manuscript can be improved in a few places. 

      a) The definitions or full names for the abbreviations of some terms, e.g., Ctnnb1, ieCtnnb1, in both abstract and main text, are needed when they first appear. Specifically, Line 108 should be moved to Lines 26 and 95. Lines 125-126 are redundant. ieCtnnb1 in Line 130 needs to be defined.

      We appreciate the suggestion. In the revision, we have included the definition of Ctnnb1 and the full name of ieCtnnb1 when they first appear in the abstract and the main text. Lines 125-126 were deleted in the revision.

      b) Line 192-194, the description of the result needs to be rewritten to reflect the higher expression of LacZ transcript in eGFP+ cells.

      We would like to emphasize that the key point of this part is that the enhancer activity of ieCtnnb1 is present in both Lgr5-eGFP+ and Lgr5-eGFP- cells. This was validated by single-cell sequencing, which revealed the presence of LacZ transcripts in the Paneth cells. Moreover, we could not confidently conclude that eGFP+ cells have higher expression levels of LacZ, as these measurements were obtained from separate, semi-quantitative RT-qPCR experiments.

      c)  More details are needed for how the data using human tumor samples were generated and how they were analyzed. 

      We thank the reviewer for the suggestion. In the revision, we have provided additional details regarding the data and subsequent analyses of human CRC samples as follows: “We previously conducted paired analyses of chromatin immunoprecipitation sequencing (ChIP-seq) for H3K27ac and H3K4me3, alongside RNA-seq on 68 CRC samples and their adjacent normal (native) tissue (Li et al., 2021). In the current study, we performed analyses for the enrichment of H3K27ac and H3K4me3 at ieCTNNB1 and CTNNB1 promoter regions, as well as the expression levels of CTNNB1, followed by combined analyses (Figure 5A, Figure 5 - figure supplement 1).”

      d) The genomic structures from multiple species are presented at the bottom of Figure 1a. However, the description and explanation are lacking in both the main text and the figure legend.

      We apologize for not presenting this clearly. We have added a related description in the legend of Figure 1A as “The sequence conservation of the indicated species is shown at the bottom as vertical lines”. We also added an explanation in lines 162-163 of the main text: “Notably, unlike neCtnnb1, the primary sequence of ieCtnnb1 is not conserved among vertebrates (Figure 1A, bottom)”.

      Reviewer #2:

      (1) One of the main issues emerging during reading concerns the interpretation of the consequence of deleting the ieCtnnb1 enhancer. The authors write on line 235 that the deletion of ieCtnnb1 "undermined" Wnt signaling in the intestinal epithelium. This feels too strong, as the status of the pathway is only mildly affected, testified by the observation that mice with homozygous deletion on ieCtnnb1 are alive and well. The enhancer likely "only" drives higher Ctnnb1 expression, and it does not affect Wnt signaling by other mechanisms. The reduction of Wnt target gene expression upon its deletion is easily interpreted as the consequence of reduced β-catenin. Also the title, in my opinion, allows this ambiguity to stick in readers' minds. In other words, the authors present no evidence that the ieCtnnb1 enhancer controls Wnt signaling dosage via any mechanism other than its upregulation of Ctnnb1 expression in the intestinal epithelium. Reduced Ctnnb1, in turn, could explain the observed reduction of Wnt signaling output and the interesting downstream physiological consequences. Unless the authors think otherwise, I suggest they clarify this throughout the text, including necessary modifications to the title.

      We greatly appreciate the reviewer’s important comments and suggestion. We agree that ieCtnnb1’s direct effect on the canonical Wnt signaling is to regulate the transcription of Ctnnb1 in the intestinal epithelia. Therefore, knockout of ieCtnnb1 leads to compromised expression of Ctnnb1 and, consequently, reduced Wnt signaling. The term “undermined” is indeed too strong and has been revised to “compromised” in the revision (line 237). Similar revisions have been made throughout the manuscript. In particular, the title was changed to “A Ctnnb1 enhancer transcriptionally regulates Wnt signaling dosage to balance homeostasis and tumorigenesis of intestinal epithelia”. However, as we state in the following point, decreased levels of β-catenin on ieCtnnb1 loss could lead to indirect effects, including the reduced expression of Bambi, which might cause a more significant decrease of nuclear β-catenin.

      (2) It is unclear how the reduction of Ctnnb1 mRNA caused by deletion of ieCtnnb1 in mice could lead to a preferential decrease of nuclear more than membranous β-catenin (Fig. 1K and L). This might reflect a general cell autonomous reduction in Wnt signaling activation; yet, it is not clear how this could occur. Do the authors have any explanations for this?

      This is a very important question. We observed that in ieCtnnb1 knockout epithelia, the expression of Bambi (BMP and activin membrane-bound inhibitor) was significantly downregulated. Since BAMBI has been reported to stabilize β-catenin and facilitate its nuclear translocation, it is likely that the reduced level of BAMBI resulting from the loss of ieCtnnb1 further decreased nuclear β-catenin. In the revision, the expression change of Bambi has been added in Figure 1M. Moreover, the related content was extensively discussed with proper citations: “We noticed that after knocking out ieCtnnb1, the level of β-catenin in the nuclei of small intestinal crypt cells of Ctnnb1Δi.enh mice decreased more significantly compared to that in the cytoplasm (49.5% vs. 29.8%). Although the loss of ieCtnnb1 should not directly lead to reduced nuclear translocation of β-catenin, RNA-seq results showed that the loss of ieCtnnb1 causes a reduction in the expression of Bambi (BMP and activin membrane-bound inhibitor), a target gene in the canonical Wnt signaling pathway (Figure 1M). BAMBI promotes the binding of Frizzled to Dishevelled, thereby stabilizing β-catenin and facilitating its nuclear translocation (Lin et al., 2008; Liu et al., 2014; Mai et al., 2014; Zhang et al., 2015). Thus, it is likely that the decreased level of BAMBI resulting from the loss of ieCtnnb1 further reduced nuclear β-catenin”.

      (3) In Figure 1 K-L the authors show β-catenin protein level. Why not show its mRNA?

      The mRNA levels of Ctnnb1 in small and large intestinal crypts were shown in Figure 1I and 1J, demonstrating reduced expression of Ctnnb1 upon ieCtnnb1 knockout. We hope the reviewer understands that it is unnecessary to measure the nuclear and cytosolic levels of Ctnnb1 transcripts, as the total mRNA level generally reflects the protein level. 

      (4) Concerning the GSEA of Figure 1 that includes the Wnt pathway components: a) it would be interesting to see which components and to what extent is their expression affected; b) why should the expression of Wnt components that are not Wnt target genes be affected in the first place? It is odd to see this described uncritically and used to support the idea of downregulated Wnt signaling.

      We appreciate the suggestion and apologize for any lack of clarity. The affected components of the Wnt signaling pathway and the extent of their changes are summarized in Figure 1 – figure supplement 3. Additionally, we have provided explanations for their downregulation. For instance, the reduced expression of Wnt3 and Wnt2b ligands in ieCtnnb1-KO crypts may be attributed to the decreased numbers of Paneth cells.  

      (5) In lines 251-252 the authors refer to "certain technical issues" in the isolation of cell type from the intestinal epithelium. Why this part should be obscure in the characterization of a tissue for which there are several established protocols of isolation and analysis is not clear. I would rather describe what these issues have been and how they might have affected the data presented.

      We thank the reviewer for pointing this out. The single-cell preparation and sequencing of small intestinal cryptal epithelial cells were carried out largely according to reported protocols with slight modification. The enrichment of live crypt epithelial cells (EpCAM+DAPI-) by flow cytometry and cell filtering after single-cell sequencing were appropriate (Figure 2 – figure supplement 1A-1C). We would like to emphasize a few points: 1) Unlike other protocols, we did not exclude immune cells, erythrocytes, or endothelial cells using negative sorting antibodies. 2) When defining cell populations, we focused exclusively on epithelial cell types and did not consider other cell types, such as immune cells. As a result, the so-called “undefined” cells include a mixture of non-epithelial cells. Indeed, markers for erythrocytes (AY036118/Erf1, PMID:12894589) and immune cells (Gm42418 and Lars2, PMID:30940803, PMID: 35659337) were the top three enriched genes in the “undefined” cluster (Figure 2 – figure supplement 1D). 3) Nonetheless, the overall findings remain robust, as key observations such as the loss of Paneth cells and reduced cell proliferation were validated through histological studies. This information has been incorporated into the revised manuscript with related references cited (lines 254-259).

      (6) It is interesting that human SNPs exist that seem to fall within the ieCTNNB1 enhancer and affect the gastrointestinal expression of CTNNB1. Could the author report or investigate whether this SNP is present in human populations that have been considered in large-scale studies for colorectal cancer susceptibility? It seems to me a rather obvious next step of extreme importance to be ignored.

      (7) From Figure 5A a reader could conclude that colorectal tumor cells have a higher expression of CTNNB1 mRNA than in normal epithelium. This is the first time I have seen this observation which somewhat undermines our general understanding of Wnt-induced carcinogenesis exclusively initiated by APC mutations whereby it is β-catenin's protein level, not expression of its mRNA, of crucial importance. I find this to be potentially the most interesting observation of the current study, which could be linked to the activity of the enhancer discovered, and I suggest the authors elaborate more on this and perhaps consider it for future experimental follow-ups.

      We appreciate the comments and suggestions.  We therefore added related content in the revision (lines 470-475): “Importantly, ieCTNNB1 displayed higher enhancer activity in most CRC samples collected in the study. Moreover, the SNP rs15981379 (C>T) within ieCTNNB1 is associated with the expression of CTNNB1 in the GI tract. Future population studies could investigate how the enhancer activity of ieCTNNB1 and this particular SNP are associated with CRC susceptibility and prognosis”.

      (8) I am surprised that the authors, who seem to have dedicated lots of resources to this study, are satisfied by analyzing their ChIP experiments with qPCR rather than sequencing (Figure 6). ChIP-seq would produce a more reliable profile of the HNF4a and CREB1 binding sites on these loci and in other control regions, lending credibility to the whole experiment and binding site identification. Sequencing would also take care of the two following conceptual problems in primer design. 

      First: while the strategy to divide enhancer and promoter in 6 regions to improve the resolution of their finding is commendable, I wonder how the difference in signal reflects primers' efficiency rather than HNF4/CREB1 exact positioning. The possibility of distinguishing between regions 2 and 3, for example, in a ChIP-qPCR experiment, also depends on the average DNA fragment length after sonication, a parameter that is not specified here. 

      Second: what are the primers designed to detect the ieCtnnb1 enhancer amplifying in the yellow-columns samples of Figure 6G? In this sample, the enhancer is deleted, and no amplification should be possible, yet it seems that a value is obtained and set to 1 as a reference value.

      This is indeed a crucial point, and we fully agree with the reviewer that “ChIP-seq would produce a more reliable profile of the HNF4a and CREB1 binding sites on these loci and in other control regions”. However, we believe that our current ChIP-qPCR experiments have adequately addressed the potential concerns raised by the reviewers. (1) We have ensured that the DNA fragment length after sonication falls within the range of 200 bp to 500 bp, with an average length of approximately 300 bp (Author response image 1A). We have stated the point in the revised methods section (line 633). (2) We have randomly inspected 14 out of 26 primer sets used in Figure 6 and its supplemental figure (Author response image 1B-E), confirming that all inspected primer sets show comparable amplification efficiency (ranging from 90% to 110%). This information has also been included in the revised methods section (line 650). (3) Figures 6G and 6H show reduced enrichment of HNF4α (6G) and p-S133-CREB1 (6H) at the Ctnnb1 promoter in ieCtnnb1 knockout ApcMin/+ tumor tissues. The ChIP-qPCR primers used were positioned at the Ctnnb1 promoter, not at ieCtnnb1, with IgG control enrichment serving as the reference values on the Y-axes.

      Author response image 1.

      (A) Agarose gel electrophoresis of sonicated DNA. (B-E) Tests of amplification efficiency for primer sets used in ChIP-qPCR.
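
      The two quantities discussed in this exchange, primer amplification efficiency and enrichment relative to an IgG control, can be illustrated with a minimal sketch. It assumes the standard dilution-series definition of efficiency (E = 10^(-1/slope) - 1) and a simple 2^(Ct_IgG - Ct_IP) fold-enrichment calculation; the function names and the Ct values are hypothetical and are not taken from the authors' methods.

```python
# Illustrative sketch only: function names and Ct values are hypothetical,
# not the authors' analysis pipeline.

def amplification_efficiency(log10_amounts, ct_values):
    """Estimate primer efficiency from a dilution series.

    Fits Ct against log10(template amount) by least squares and converts
    the slope to efficiency with E = 10**(-1/slope) - 1.
    A slope of about -3.32 corresponds to roughly 100% efficiency.
    """
    n = len(log10_amounts)
    mean_x = sum(log10_amounts) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_amounts)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_amounts, ct_values))
    slope = sxy / sxx
    return 10 ** (-1.0 / slope) - 1.0


def fold_enrichment_over_igg(ct_ip, ct_igg):
    """Fold enrichment of a ChIP sample relative to its IgG control.

    Assumes ~100% efficiency (doubling per cycle), so the IgG control
    itself always evaluates to 1, which is why it can act as the
    reference value on the Y-axis.
    """
    return 2.0 ** (ct_igg - ct_ip)


if __name__ == "__main__":
    # Hypothetical 10-fold dilution series (log10 of relative input).
    eff = amplification_efficiency([0, -1, -2, -3],
                                   [20.0, 23.32, 26.64, 29.96])
    print(f"efficiency: {eff:.1%}")
    print("fold enrichment:", fold_enrichment_over_igg(25.0, 28.0))
```

      Under these assumptions, a primer set counts as acceptable when the returned efficiency falls between 0.90 and 1.10, matching the 90% to 110% range quoted above.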

      (9) The ChIP-qPCR showing preferential binding of pS133-CREB1 in small intestinal crypts and CHT15 cells (line 393) should be shown. 

      The ChIP-qPCR results demonstrating preferential binding of p-S133-CREB1 over CREB1 have been added in revised Figure 6C, 6D and Figure 6 – Supplement 1C.

      (10) It is not entirely clear what the blue tracks represent at the bottom of Figures 6C-D and Figure 6 - Figure Supplement 1C-D. The ChIP-seq profiles of both CREB1 and HNF4a shown in Figures 6A and Figure 6 - Figure Supplement 1A do not seem to match. Taking HNF4a, for example from Figure 6 - Figure Supplement 1A it seems to bind on the Ctnnb1 promoter, while in Figure 6 - Figure Supplement 1D the peaks are within the first intron. I realize this might all be a problem with a different scale across figure panels, but I suggest producing a cleared figure.

      We apologize for the confusion. We have revised Figure 6C-6D, Figure 6 - figure supplement 1C-D, and the corresponding legends to enhance clarity. (1) The top panels of Figures 6C and 6D respectively highlight shaded regions of ieCTNNB1 (pink) and the CTNNB1 promoter (grey) in Figure 6A, emphasizing the enrichment of p-S133-CREB1. (2) The top panels of Figure 6 – figure supplement 1C and 1D respectively highlight shaded regions of ieCtnnb1 (pink) and the Ctnnb1 promoter (grey) in Figure 6 – figure supplement 1A, emphasizing the enrichment of HNF4α. (3) Because Figures 6C-6D and Figure 6 - figure supplement 1C-1D respectively correspond to human and mouse genomes, the positions of peaks and scales differ.

      (11) In the intro the authors refer to "TCF-4". I suggest they use the more recent unambiguous nomenclature for this family of transcription factors and call it TCF7L2.

      TCF-4 has been changed into TCF7L2 in the revision (line 81)

      (12) In lines 121-122, the authors write "Although numerous putative enhancers...only a fraction of them were functionally annotated". To what study/studies are the authors referring? Please provide references.

      References were added in the revision (line 124)

      (13) In some parts the authors use strong words that should in my opinion be attenuated. Examples are: (i) at line 224, "maintains" would be better substituted with "contribute", as in the absence of ieCtnnb1, Ctnnb1 is still abundantly expressed; (ii) at line 266 "compromised" when the proliferative capacity of CFCs and TACs seems to be only mildly reduced; (iii) at line 286 "disrupts", the genes are simply downregulated.

      We thank the reviewer for these great suggestions. 1) On lines 224-225, the sentence was revised to: “These data suggest that ieCtnnb1 plays a specific role in regulating the transcription of Ctnnb1 in intestinal epithelia”. 2) On line 271, “compromised” was replaced with “mildly reduced”. 3) In ieCtnnb1 knockout epithelial cells of small intestine, genes related to secretory functions were decreased, while genes related to absorptive functions were increased. Therefore, the term 'disrupts' is more appropriate than 'downregulates'.

      Reviewer #3:

      Line 81, c-Myc should be human MYC (italics) to agree with the other human gene names in this sentence. 

      c-Myc has been changed into MYC in the revision (line 82)

      Line 215, wildtype should be wild-type. 

      “wildtype” has been changed into “wild-type” in the revision (line 215)

      Line 224, Elimination of the enhancer did not abolish expression of Ctnnb1; therefore, it would be better to say that it "helps to maintain Ctnnb1 transcription" 

      The sentence was changed into “These data suggest that ieCtnnb1 plays a specific role in regulating the transcription of Ctnnb1 in intestinal epithelia” in revision (lines 224-225)

      Line 228, perhaps "to activate transcription" is meant. 

      “active” has been changed into “activate” in the revision (line 228)

      Line 235, consider "reduced" instead of "undermined". 

      “undermined” has been replaced with “compromised” in the revision (line 237)

      Line 262, "em" dashes should be at both ends of this insertion.

      Line 298, "dysfunctional" would be better.

      Line 356, "samples were". 

      Line 481, 12-hr (add hyphen). 

      All above points have been optimized according to the reviewer’s suggestion.

      Line 712, Is "poly-N" meant? 

      “Poly-N” indicates undetected bases during sequencing. This explanation was added in the revision (lines 759-760).

      Figure 1K, the GAPDH signal is not visible and that panel is unnecessary as there is an H3 control.   

      Figure 1K and 1L respectively show levels of nuclear and cytoplasmic β-catenin. GAPDH and H3 were used as internal references for the cytoplasmic and nuclear fractions, respectively, confirming both robust fractionation and equal loading.

    1. eLife assessment

      This valuable paper reports a theoretical framework and methodology for identifying Cancer Driving Nucleotides (CDNs), primarily based on single nucleotide variant (SNV) frequencies. A variety of solid approaches indicate that a mutation recurring three or more times is more likely to reflect selection rather than being the consequence of a mutation hotspot. The method is rigorously quantitative, though the requirement for larger datasets to fully identify all CDNs remains a noted limitation. The work will be of broad interest to cancer geneticists and evolutionary biologists.

    2. Reviewer #1 (Public Review):

      The authors developed a rigorous methodology for identifying all Cancer Driving Nucleotides (CDNs) by leveraging the concept of massively repeated evolution in cancer. By focusing on mutations that recur frequently in pan-cancer, they aimed to differentiate between true driver mutations and neutral mutations, ultimately enhancing the understanding of the mutational landscape that drives tumorigenesis. Their goal was to call a comprehensive catalogue of CDNs to inform more effective targeted therapies and address issues such as drug resistance.

      Strengths

      (1) The authors introduced a concept of using massively repeated evolution to identify CDNs. This approach recognizes that advantageous mutations recur frequently (at least 3 times) across cancer patients, providing a lens to identify true cancer drivers.

      (2) The theory showed the feasibility of identifying almost all CDNs if the number of sequenced patients increases to 100,000 for each cancer type.
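
      The recurrence argument in point (1) can be illustrated with a back-of-the-envelope sketch. It assumes a deliberately simplified neutral model in which every site mutates at the same per-patient rate, so the number of hits at a site is approximately Poisson; the mutation rate and site count below are illustrative placeholders, not values from the paper, and the paper's own treatment of mutation-rate variation is not reproduced here.

```python
import math

# Illustrative sketch only: a uniform-rate Poisson model with placeholder
# parameters, not the authors' model or their estimates.

def poisson_tail(lam, i, extra_terms=200):
    """P(X >= i) for X ~ Poisson(lam), summed directly over the upper tail.

    Summing the tail (rather than 1 - CDF) keeps precision when the tail
    probability is tiny. Intended for small lam, where 200 terms are ample.
    """
    term = math.exp(-lam) * lam ** i / math.factorial(i)  # P(X = i)
    total = 0.0
    for k in range(i, i + extra_terms):
        total += term
        term *= lam / (k + 1)  # P(X = k + 1) from P(X = k)
    return total


def expected_neutral_sites(n_patients, per_site_rate, n_sites, i):
    """Expected number of sites hit at least i times by neutral mutation alone."""
    lam = n_patients * per_site_rate  # expected hits per site
    return n_sites * poisson_tail(lam, i)


if __name__ == "__main__":
    # Hypothetical inputs: 1e-6 mutations per site per patient, 1e7 sites.
    for i in (1, 2, 3):
        print(i, expected_neutral_sites(1000, 1e-6, 10_000_000, i))
```

      In this toy setting the expected number of neutral sites drops steeply with each additional hit, which is the intuition behind treating highly recurrent sites as candidates for selection. A real analysis would need to allow for site-to-site variation in mutation rate.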

      Weaknesses

      (1) The methodology remains theoretical and no novel true driver mutations were identified in this study.

      (2) Different cancer types have unique mutational landscapes. The methodology, while robust, might face challenges in uniformly identifying CDNs across various cancers with distinct genetic and epigenetic contexts.

      (3) L223, the statement "In other words, the sequences surrounding the high-recurrence sites appear rather random.". Since it was a pan-cancer analysis, the unique patterns of each cancer type could be strongly diluted in the pan-cancer data.

      (4) To solidify the findings, the results need to be replicated in an independent dataset.

      (5) The key scripts and the list of key results (i.e., CDN sites with i ≥ 3) need to be shared to enable replication, validation, and further research. So far, only CDN sites with i ≥ 20 have been shared.

      (6) The versions of data used in this study are not clearly detailed, such as the specific version of gnomAD and the version and date of TCGA data downloaded from the GDC Data Portal.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors propose that cancer-driver mutations can be identified by Cancer Driving Nucleotides (CDNs). CDNs are defined as SNVs that occur frequently in genes. There are many ways to define cancer driver mutations, and the strengths and weaknesses are the reliance on statistics to define them.

      Strengths:

      There are many well-known approaches and studies that have already identified many canonical driver mutations. A potential strength is that mutation frequencies may be able to identify as yet unrecognized driver mutations. They use a previously developed method to estimate mutation hotspots across the genome (Dig, Sherman et al 2022). This publication has already used cancer sequence data to infer driver mutations based on higher-than-expected mutation frequencies. The advance here is to further illustrate that recurrent mutations (estimated at 3 or more mutations (CDNs) at the same base) are more likely to be the result of selection for a driver mutation (Figure 3). Further analysis indicates that mutation sequence context (Figure 4) or mutation mechanisms (Figure 5) are unlikely to be major causes for recurrent point mutations. Finally, they calculate (Figure 6) that most driver mutations identifiable by the CDN approach could be identified with about 100,000 to one million tumor coding genomes.

      Weaknesses:

      The manuscript does provide specific examples where recurrent mutations identify known driver mutations but does not identify "new" candidate driver mutations. Driver mutation validation is difficult and at least clinically, frequency (i.e., observed in multiple other cancer samples) is indeed commonly used to judge if an SNV has driver potential. The method would miss alternative ways to trigger driver alterations (translocations, indels, epigenetic, CNVs). Nevertheless, the value of the manuscript is its quantitative analysis of why mutation frequencies can identify cancer driver mutations.

    4. Author response:

      We are grateful to the reviewers and editors for their insightful comments. All recognized that, while mutation recurrences have been used for inferring cancer drivers, our approach has the rigor of quantitative analysis. We would like to add that, without rigorously ruling out mutational hotspots, most CDNs have not been accepted as driver mutations.

      This paper develops the theory stating that i) recurrent point mutations are true Cancer Driving Nucleotides (CDNs); and ii) non-recurrent mutations are unlikely to be CDNs. The reviewers question that, with the theory, we still have not discovered new driving mutations. This is done in the companion paper. Table 3 shows that, averaged across cancer types, the conventional method would identify 45 CDGs while the CDN method tallies 258 CDGs. The power of the CDN method in identifying new driver genes is evident.

      The second question is "By this theory, will we be able to discover most CDNs when the sample size increases from ~ 1000 to 10,000?" This is a question of forecast and can be partially answered using GENIE data. Fig. 7 of this study shows that, when n increases from ~ 1000 to ~ 9,000, the numbers of discovered CDNs increase by 3 – 5 fold, most of which come from the two-hit class, as expected.

      Fig. 7 also addresses the query of whether we have used datasets other than TCGA. We have indeed used all public data, including GENIE, ICGC and other integrated resources such as COSMIC. For the main study, we rely on TCGA because it is unbiased for estimating the probability of CDN occurrences. In many datasets, the numerators are given but the denominators are not (the number of patients with the mutation / the total number of patients surveyed).

      The third question is about mutation recurrences among cancer types. As stated by one reviewer, "different cancer types have unique mutational landscapes". While this is true when the analysis is done at the whole-gene level, one gets a different picture at the nucleotide level where the resolution is much higher. The pan-cancer trend of point mutations is evident in Fig. 4 of the companion paper.

      Again, we heartily appreciate the criticisms and suggestions of the reviewers and editors!

    1. eLife assessment

      Glioblastoma is one of the most aggressive cancers without a cure. Glioblastoma cells are known to have high mitochondrial potential. This useful study demonstrates the critical role of the ribosome-associated quality control (RQC) pathway in regulating mitochondrial membrane potential and glioblastoma growth. Some assays are incomplete; further revision will improve the significance of this study.

    2. Reviewer #1 (Public Review):

      Summary:

      Cai et al have investigated the role of msiCAT-tailed mitochondrial proteins that frequently exist in glioblastoma stem cells. Overexpression of msiCAT-tailed mitochondrial ATP synthase F1 subunit alpha (ATP5) protein increases the mitochondrial membrane potential and blocks mitochondrial permeability transition pore formation/opening. These changes in mitochondrial properties provide resistance to staurosporine (STS)-induced apoptosis in GBM cells. Therefore, msiCAT-tailing can promote cell survival and migration, while genetic and pharmacological inhibition of msiCAT-tailing can prevent the overgrowth of GBM cells.

      Strengths:

      The CAT-tailing concept has not been explored in cancer settings. Therefore, the present study provides new insights for widening the therapeutic avenue.

      Weaknesses:

      Although the paper does have strengths in principle, the weaknesses of the paper are that these strengths are not directly demonstrated. The conclusions of this paper are mostly well-supported by data, but some aspects of image acquisition and data analysis need to be clarified and extended.

    3. Reviewer #2 (Public Review):

      This work explores the connection between glioblastoma, mito-RQC, and msiCAT-tailing. They build upon previous work concluding that ATP5alpha is CAT-tailed and explore how CAT-tailing may affect cell physiology and sensitivity to chemotherapy. The authors conclude that when ATP5alpha is CAT-tailed, it either incorporates into the proton pump or aggregates and that these events dysregulate MPTP opening and mitochondrial membrane potential and that this regulates drug sensitivity. This work includes several intriguing and novel observations connecting cell physiology, RQC, and drug sensitivity. This is also the first time this reviewer has seen an investigation of how a CAT tail may specifically affect the function of a protein. However, some of the conclusions in this work are not well supported. This significantly weakens the work but can be addressed through further experiments or by weakening the text.

    4. Author response:

      We are grateful for the reviewers' acknowledgment of the originality of our manuscript and its potential importance in cancer treatment. We appreciate the reviewers' critiques on certain conclusions and thank them for their thorough feedback on the manuscript. In the revised version, we will provide a more detailed clarification of the previous data and methods, bolster the existing data, and present additional evidence in support of our hypothesis. Please find below our replies to particular concerns.

      In brief, to address the comments from Reviewer 1, we will make the following revisions in the manuscript:

      (1) To discuss the issues regarding the specificity of ATP5α CAT-tailing, we will provide new patient-derived cell lines and tumor samples and investigate the CAT-tail modifications of nuclear genome-encoded mitochondrial proteins and changes in RQC proteins within them. We will endeavor to explore the nature of NEMF modifications in GSC cells (Fig. S1A).

      (2) To enhance the quality of image data, we will substitute some images (such as Fig. 1E and 3A) with higher quality images.

      (3) To further understand the influence of NEMF on cancer, the effects of NEMF overexpression in GSC cells will be evaluated through testing (e.g., Fig. 3D).

      (4) To further explore changes in apoptosis, we will employ additional methods to detect apoptosis, including Annexin-PI FACS assays, caspase cleavage analysis, assessing BAX-BCL2 ratios, and monitoring cytochrome c release.

      (5) To further confirm the effectiveness of the CAT-tailing-mitochondria mechanism in in vivo tumor models, we will utilize a Drosophila model to study the impact of the RQC pathway and CAT-tailing mechanism on tumor proliferation in vivo. The overactivation of the Notch signaling pathway in Drosophila can stimulate malignant proliferation of neural stem cells (NSCs) through both canonical (c-Myc mediated pathway) and non-canonical (PINK1-mitochondrial-mTORC2 pathway) pathways, leading to the development of a tumor-like phenotype in the larval brain. A recent publication in PNAS Nexus (Khaket et al., PNAS Nexus, 2024) discusses the impact of the RQC pathway on c-Myc. It is possible for us to analyze the alterations in CAT-tailing on mitochondrial proteins and mitochondrial membrane potential in this Notch model and study how the RQC pathway regulates them. Moreover, tumor implantation experiments will be carried out using immunodeficient mice. Our goal is to conduct a comparative analysis of the growth of control and NEMF KD glioblastoma cell lines in animal models, alongside performing essential biochemical analyses.

      Reference:

      Khaket, T. P., et al. (2024). Ribosome stalling during c-myc translation presents actionable cancer cell vulnerability. PNAS nexus, 3(8), pgae321.

      To address the comments from Reviewer 2, we will make the following revisions in the manuscript:

      (1) The concerns raised by the reviewer regarding the authenticity of the ATP5α CAT-tail modification are duly noted. Critical control experiments will be incorporated into our study, including NEMF knockout (or NFACT domain mutants) and cycloheximide treatment, alongside other methodologies. The results of these experiments will be incorporated into figures such as Fig. 1B, 1C, S3A, and S3B to improve comprehension of the CAT-tail modification on ATP5α.

      (2) We thank the reviewer for reminding us to consider the differences between the artificial tail and the endogenous CAT-tail. A recently published study (Khan et al., 2024) provides a thorough analysis of the components of the CAT-tail. Our approach to addressing this issue involves emphasizing the use of the artificial CAT-tail sequence and adopting a more measured tone in the revised version. Additionally, we will induce the endogenous ATP5α-CAT-tail by expressing ATP5α-K20-non-stop in cells to validate its function in glioblastoma cells.

      (3) Moreover, we aim to examine the impact of different amino acid compositions in the ATP5α C-terminal extension, such as the poly (Gly-Ser) repeats noted by the reviewer, on both mitochondrial function and glioblastoma biology in our revision. By comparing the results obtained from ATP5α-CAT-tails with different compositions, it is anticipated that more definitive conclusions can be drawn.

      (4) Additional minor revisions will be implemented to the text in accordance with the feedback given by the reviewer.

      Reference:

      Khan, D., Vinayak, A. A., Sitron, C. S., & Brandman, O. (2024). Mechanochemical forces regulate the composition and function of CAT tails. bioRxiv, 2024-08.

    1. eLife assessment

      This important study by Lee et al. investigates the heterogeneous response of non-growing bacteria to the antimicrobial peptide (AMP) tachyplesin. In this response, a subpopulation of bacteria limits the accumulation of a fluorescent analog of the AMP, avoiding lethal damage. The study provides compelling data showing the differential accumulation of AMP in subpopulations and its correlation with antimicrobial efficacy. However, the evidence for increased efflux as the main survival mechanism remains incomplete.

    2. Reviewer #1 (Public Review):

      Summary:

      This work contributes several important and interesting observations regarding the heterotolerance of non-growing Escherichia coli and Pseudomonas aeruginosa to the antimicrobial peptide tachyplesin. The primary mechanism of action of tachyplesin is thought to be disruption of the bacterial cell envelope, leading to leakage of cellular contents after a threshold level of accumulation. Although the MIC for tachyplesin in exponentially growing E. coli is just 1 ug/ml, the authors observe that a substantial fraction of a stationary phase population of bacteria survive much higher concentrations, up to 64 ug/ml. By using a fluorescently-labelled analogue of tachyplesin, the authors show that the amount of per-cell intracellular accumulation of tachyplesin displays a bimodal distribution and that the fraction of "low accumulators" correlates with the fraction of survivors.

      Using a microfluidic device, they show that low accumulators exclude propidium iodide, suggesting that their cell envelopes remain largely intact, while high accumulators of tachyplesin also stain with propidium iodide. They show that this phenomenon holds for several clinical isolates of E. coli with different genetic determinants of antibiotic resistance, and for a strain of Pseudomonas aeruginosa. However, the bimodal distribution does not occur in these organisms for several other antimicrobial peptides, or for tachyplesin in Klebsiella pneumoniae or Staphylococcus aureus, indicating some degree of specificity in the interaction between AMP and bacterial cell envelope. They next explore the dynamics of the fluorescent tachyplesin accumulation and show interestingly that a high degree of accumulation is initially seen in all cells, but that the "low accumulator" subpopulation manages to decrease the amount of intracellular fluorescence over time, while the "high accumulator" subpopulation continues to increase its intracellular fluorescence. Focusing on increased efflux as a hypothesised mechanism for the "low accumulator" phenotype, based on transcriptomic analysis of the two subpopulations, the authors screen putative efflux inhibitors to see if they can block the formation of the low accumulator subpopulation. They find that both the protonophore CCCP and the SSRI sertraline can block the formation of this subpopulation and that a combination of sertraline plus tachyplesin kills a greater fraction of the stationary phase cells than either agent alone, similar to the killing observed when growing cells are treated with tachyplesin.

      Strengths:

      This study provides new insight into the heterogeneous behaviours of non-growing bacteria when exposed to an antimicrobial peptide, and into the dynamics of their response. The single-cell analysis by FACS and microscopy is compelling. The results provide a much-needed single-cell perspective on the phenomenon of tolerance to AMPs and a good starting point for further exploration.

      Weaknesses:

      My main concerns surround the conclusions drawn about the physiological underpinnings of these behaviours, based in part on transcriptomic analysis and also on the observation of the dynamics. I think deeper consideration of the relative contributions of influx and efflux to the observed accumulation dynamics, and the slow/non-growing context of the observations would be helpful. In particular, these issues seem important:

      (1) The initial high accumulation by all cells followed by the emergence of a sub-population that has reduced its intracellular levels of tachyplesin is a key observation and I agree with the authors' conclusion that this suggests an induced response to the AMP is important in facilitating the bimodal distribution. However, I think the conclusion that upregulated efflux is driving the reduction in signal in the "low accumulator" subpopulation is not fully supported. Steady-state amounts of intracellular fluorescent AMP are determined by the relative rates of influx and efflux and a decrease could be caused by decreasing influx (while efflux remained unchanged), increasing efflux (while influx remained unchanged), or both decreasing influx and increasing efflux. Given the transcriptomic data suggest possible changes in the expression of enzymes that could affect outer membrane permeability and outer membrane vesicle formation as well as efflux, it seems very possible that changes to both influx and efflux are important. The "efflux inhibitors" shown to block the formation of the low accumulator subpopulation have highly pleiotropic or incompletely characterised mechanisms of action so they also do not exclusively support a hypothesis of increased efflux.

      (2) A conclusion of the transcriptomic analysis is that the lower accumulating subpopulation was exhibiting "a less translationally and metabolically active state" based on less upregulation of a cluster of genes including those involved in transcription and translation. This conclusion seems to borrow from well-described relationships referred to as bacterial growth laws in which the expression of genes involved in ribosome production and translation is directly related to the bacterial growth (and metabolic) rate. However, the assumptions that allow the formulation of the bacterial growth laws (balanced, steady state, exponential growth) do not hold in growth arrest. A non-growing cell could express no genes at all or could express ribosomal genes at a very low level, or efflux pumps at a high level. The distribution of transcripts among the functional classes of genes does not reveal anything about metabolic rates within the context of growth arrest - it only allows insight into metabolic rates when the constraint of exponential growth can be assumed. Efflux pumps can be highly metabolically costly; for example, Tn-Seq experiments have repeatedly shown that mutants for efflux pump gene transcriptional repressors have strong fitness disadvantages in energy-limited conditions. There are no data presented here to disprove a hypothesis that the low accumulators have high metabolic rates but allocate all of their metabolic resources to fortifying their outer membranes and upregulating efflux. This could be an important distinction for understanding the vulnerabilities of this subpopulation. Metabolic rates can be more directly estimated for single cells using respiratory dyes or pulsed metabolic labelling, for example, and these data could allow deeper insight into the metabolic rates of the two subpopulations.

      The observation that adding nutrients to the stationary phase cultures pushes most of the cells to the "high accumulator" state is presented as support of the hypothesis that the high accumulator state is a higher metabolism/higher translational activity state. However, it is important to note that adding nutrients will cause most or all of the cells in the population to start to grow, thus re-entering the familiar regime in which bacterial growth laws apply. This is evident in the slightly larger cell sizes seen in the nutrient-amended condition. In contrast to stationary phase cells, growing cells largely do not exhibit the bimodal distribution, and they are much more sensitive to tachyplesin, as demonstrated clearly in the supplement. Growing cells are not necessarily the same as the high-accumulating subpopulation of non-growing cells.

      It might also be worth adding some additional context around the potential to employ efflux inhibitors as therapeutics. It is very clear that obtaining sufficient antimicrobial drug accumulation within Gram-negative bacteria is a substantial barrier to effective treatments, and large concerted efforts to find and develop therapeutic efflux pump inhibitors have been undertaken repeatedly over the last 25 years. Sufficiently selective inhibitors of bacterial efflux pumps with appropriate drug-like properties have been challenging to find and none have entered clinical trials. Multiple psychoactive drugs have been shown to impact efflux in bacteria, but usually at concentrations in the 10-100 µM range (as here). Meanwhile, the Ki values for their human targets are usually in the sub- to low-nanomolar range. The authors rightly note that the concentration of sertraline they have used is higher than that achieved in patients, but this is by many orders of magnitude, and it might be worth expanding a bit on the substantial challenge of finding efflux inhibitors that would be specific and non-toxic enough to be used therapeutically. Many advances in structural biology, molecular dynamics, and medicinal chemistry may make the quest for therapeutic efflux inhibitors more fruitful than it has been in the past, but it is likely to remain a substantial challenge.
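
      The influx/efflux argument in point (1) above can be illustrated with a minimal sketch. It assumes a simple one-compartment model that is not taken from the manuscript: intracellular signal C obeys dC/dt = k_in * c_ext - k_out * C, so the steady state is C* = k_in * c_ext / k_out. The rate constants and concentrations are hypothetical, chosen only to show that halving influx and doubling efflux give the same steady-state signal and therefore cannot be told apart from the fluorescence level alone.

```python
def steady_state_level(k_in, k_out, c_ext):
    """Steady state of dC/dt = k_in * c_ext - k_out * C (assumed toy model)."""
    if k_out <= 0:
        raise ValueError("k_out must be positive")
    return k_in * c_ext / k_out


# Hypothetical, unitless parameters chosen purely for illustration.
baseline = steady_state_level(k_in=1.0, k_out=1.0, c_ext=10.0)
less_influx = steady_state_level(k_in=0.5, k_out=1.0, c_ext=10.0)  # influx halved
more_efflux = steady_state_level(k_in=1.0, k_out=2.0, c_ext=10.0)  # efflux doubled

print(baseline, less_influx, more_efflux)
```

      In this toy model both perturbations halve the steady-state level, which is why a drop in signal alone does not identify upregulated efflux as the cause.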

    3. Reviewer #2 (Public Review):

      Summary:

      This study reports on the existence of subpopulations of isogenic E. coli and P. aeruginosa cells that are tolerant to the antimicrobial peptide tachyplesin and are characterized by the accumulation of low levels of a fluorescent tachyplesin-NBD conjugate. The authors then set out to address the molecular mechanisms, providing interesting insights even though the mechanism remains incompletely defined. The work suggests that changes in membrane lipid composition and increased drug efflux, among other factors, may cause this phenotype, and it demonstrates that pharmacological manipulation can prevent the generation of tolerance. The authors are cautious in their interpretation and the claims made are largely justified by the data.

      Strengths:

      Going beyond the commonly used bulk techniques for studying susceptibility to AMPs, Lee et al. used fluorescent antibiotic conjugates in combination with flow cytometry analysis to study variability in drug accumulation at the single-cell level. This powerful approach enabled the authors to expose bimodal drug accumulation patterns that were condition-dependent, but conserved across a variety of E. coli clinical isolates. Using cell sorting in combination with colony-forming unit assays as well as quantitative fluorescence microscopic analysis in a microfluidics setup, the authors compellingly demonstrate that low accumulators (where the fluorescence signal is mostly restricted to the membrane) can survive antibiotic treatment, whereas high accumulators (with high intracellular fluorescence) were killed. Comparative transcriptomics analysis of sorted 'low accumulator' and 'high accumulator' subpopulations suggests that changes in the lipid composition, increased efflux, and other mechanisms may contribute to tachyplesin tolerance in this subpopulation. Lipidomics analysis of bulk untreated vs. tachyplesin-NBD treated cells confirmed changes in the lipid composition in accordance with the transcriptomics data. Intriguingly, a time-course experiment on tachyplesin-NBD accumulation revealed that all cells initially were high accumulators, before a subpopulation of cells subsequently managed to reduce the signal intensity (most likely through efflux), demonstrating that the 'low accumulator' phenotype is an induced response and not a pre-existing property.

      Finally, the demonstration that treatment with efflux pump inhibitors (although some caution needs to be taken regarding the selectivity of these inhibitors, see comment on weaknesses below) prevents the generation of low accumulators and enhances tachyplesin-based killing is an important basis for developing combination therapies.

      The study convincingly illustrates how susceptibility to tachyplesin changes adaptively and heterogeneously depending on the growth phase, environment, and availability of nutrients. This is highly relevant beyond the presented example of tachyplesin: similar subpopulation-based adaptive changes in susceptibility to antimicrobial peptides or other drugs may occur during infections in vivo, and they would likely be missed by standardized in vitro susceptibility testing.

      Weaknesses:

      Some questions regarding the mechanism remain. One shortcoming of the setup of the transcriptomics experiment is that the tachyplesin-NBD probe itself has antibiotic efficacy and induces phenotypes (and eventually cell death) in the 'high accumulator' cells. This makes it challenging to interpret whether any differences seen between the two groups are causative for the observed accumulation pattern or if they are a consequence of differential accumulation and downstream phenotypic effects. The role of efflux systems is further supported by the finding that efflux pump inhibitors sensitize E. coli to tachyplesin and prevent the occurrence of the tolerant 'low accumulator' subpopulations. In principle, this is a great way of validating the role of efflux pumps, but the limited selectivity of these inhibitors (CCCP is an uncoupling agent, and for sertraline direct antimicrobial effects on E. coli have been reported by Bohnert et al.) leaves some ambiguity as to whether the synergistic effect is truly mediated via efflux pump inhibition. It would be relevant to test and report the MIC of sertraline for the strain tested, particularly since in Figure 4G an initial reduction in CFUs is observed for sertraline treatment, which suggests the existence of biological effects in addition to efflux inhibition.

    4. Reviewer #3 (Public Review):

      Summary:

      The study tests the phenotypic response of bacteria (mainly E. coli) to antimicrobial peptides (AMPs) such as tachyplesin. The resistance mechanisms to AMPs differ from those to classical antibiotics in that AMP resistance involves more non-genetic mechanisms, which are largely unknown but are important to understand. This work aims to elucidate the mechanism of such phenotypic resistance.

      Strengths:

      The experiments unambiguously reveal that the cells respond to stress heterogeneously, with two distinct subpopulations - one with better survival than the other. This primary phenotype is convincingly shown across various E. coli strains, including clinical isolates.

      Weaknesses:

      The authors' claims about high efflux being the main mechanism of survival are unconvincing, given the current data. There can be several alternative hypotheses that could explain their results, such as lower binding of the AMP, lower rate of internalization, metabolic inactivity, etc. It is unclear how efflux can be important for survival against a peptide that the authors claim binds externally to the cell. The addition of efflux assays would be beneficial for clear interpretations. Further genetic experiments are necessary to test whether efflux genes are involved at all.

    1. eLife assessment

      This study provides valuable insights, addressing the growing threat of multi-drug-resistant (MDR) pathogens by focusing on the enhanced efficacy of colistin when combined with artesunate and EDTA against colistin-resistant Salmonella strains. The evidence is solid, supported by comprehensive microbiological assays, molecular analyses, and in vivo experiments demonstrating the effectiveness of this synergic combination. However, the discussion on the clinical application challenges of this triple combination is incomplete, and it would benefit from addressing the high risk associated with using three potential nephrotoxic agents in vivo.

    2. Reviewer #1 (Public Review):

      Summary:

      The study addresses the growing threat of multi-drug-resistant (MDR) pathogens, focusing on the efficacy of colistin (COL), a last-resort antibiotic, and its enhanced activity when combined with artesunate (AS) and ethylenediaminetetraacetic acid (EDTA) against colistin-resistant Salmonella strains. The researchers aim to explore whether these combinations can restore the effectiveness of colistin and understand the underlying mechanisms. The study used a combination of microbiological and molecular techniques to evaluate the antibacterial activity and mechanisms of action of COL, AS, and EDTA.

      Key methods include:

      (1) Antimicrobial Susceptibility Testing: Determining minimum inhibitory concentrations (MICs) of COL, AS, and EDTA, both alone and in combination, against various Salmonella strains;

      (2) Time-Kill Assays: Measuring bacterial growth inhibition over time with different drug combinations;

      (3) Fluorescent Probe-Permeability Assays: Assessing cell membrane integrity using fluorescent dyes;

      (4) Proton Motive Force Assay: Evaluating the impact on the electrochemical proton gradient (PMF);

      (5) Reactive Oxygen Species (ROS) Measurement: Quantifying intracellular ROS levels;

      (6) Scanning Electron Microscopy (SEM): Observing morphological changes in bacterial cells; and

      (7) Omics Analysis: Transcriptome and metabolome profiling to identify differentially expressed genes (DEGs) and significant differential metabolites (SDMs).

      The combination of COL, AS, and EDTA (AEC) showed significant antibacterial activity against colistin-resistant Salmonella strains, reducing the MICs and enhancing bacterial killing compared to individual treatments. The AEC treatment caused extensive damage to both the outer and inner bacterial membranes, as evidenced by increased fluorescence of membrane-impermeant dyes and SEM images showing deformed cell membranes. AEC treatment selectively collapsed the Δψ component of PMF, indicating disruption of vital cellular processes. The combination therapy increased intracellular ROS levels, contributing to bacterial killing. Transcriptome data revealed changes in genes related to two-component systems, flagellar assembly, and ABC transporters. Metabolome analysis highlighted disruptions in pathways such as arachidonic acid metabolism. The findings suggest that AS and EDTA can potentiate the antibacterial effects of colistin by disrupting bacterial membranes, collapsing PMF, and increasing ROS levels. This combination therapy could serve as a promising approach to combat colistin-resistant Salmonella infections.

      Strengths:

      (1) The study employs a wide range of techniques to thoroughly investigate the antibacterial mechanisms and efficacy of the drug combinations.

      (2) The results are consistent across multiple assays and supported by both in vitro and in vivo data.

      (3) Combining AS and EDTA with COL represents a novel strategy to tackle antibiotic resistance.

      Weaknesses:

      (1) The study focuses on a limited number of Salmonella strains, and broader testing on various MDR pathogens would strengthen the findings.

      (2) While the study elucidates several mechanisms, further molecular details could provide deeper insights into the interactions between these drugs and bacterial targets.

      (3) The time-kill experiment was conducted over 12 hours instead of the recommended 24 hours. To demonstrate a synergistic effect among the drugs, a reduction of at least 2 log10 in colony count should be shown in a 24-hour experiment. Additionally, clarifying the criteria for selecting drug concentrations is important to improve the interpretation of the results.

      (4) While the combination of EDTA, artesunate, and colistin shows promising in vitro results against Salmonella strains, the clinical application of this combination warrants careful consideration due to potential toxicity issues associated with these compounds.
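
      The 2 log10 criterion in point (3) above can be written out as a short sketch. The function names and the CFU/mL values are hypothetical and are not data from the study. The sketch also assumes that the combination is compared with the most active single agent at the same time point, a common convention that the review itself does not spell out.

```python
import math


def log10_reduction(cfu_reference, cfu_combination):
    """Return the log10 drop in colony count from reference to combination."""
    if cfu_reference <= 0 or cfu_combination <= 0:
        raise ValueError("CFU counts must be positive")
    return math.log10(cfu_reference / cfu_combination)


def meets_synergy_threshold(cfu_most_active_single, cfu_combination, threshold=2.0):
    """True if the combination lowers the count by at least `threshold` log10.

    Assumption: the comparator is the most active single agent at 24 h.
    """
    return log10_reduction(cfu_most_active_single, cfu_combination) >= threshold


# Hypothetical 24-hour counts in CFU/mL, for illustration only.
print(meets_synergy_threshold(1e7, 1e4))  # 3 log10 reduction
print(meets_synergy_threshold(1e7, 1e6))  # 1 log10 reduction
```

      Applying the same check at 12 hours would not show whether regrowth occurs later, which is one reason a 24-hour endpoint is requested.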

    3. Reviewer #2 (Public Review):

      Summary:

      The study by Zhai et al describes repurposing of artesunate, to be used in combination with EDTA to resensitize Salmonella spp. to colistin. The observed effect applied both to strains with and without mobile colistin resistance determinants (MCR). It was already known that EDTA in combination with colistin has an inhibitory effect on MCR-enzymes, but at the same time, both colistin and EDTA can contribute to nephrotoxicity, something which is also true for artesunate. Thus, the triple combination of three nephrotoxic agents has significant challenges in vivo, which is not particularly discussed in this paper.

      Strengths:

      The study is sound from a methodological point of view and has many interesting angles to address mechanistically how the three compounds can synergize.

      Weaknesses:

      (1) The selection of strains is not very clear. Nothing is known about the sequence types of the strains or how representative they are for strains circulating in general. Thus, it is difficult to generalize from this limited number of isolates, although the studies done in these isolates are comprehensive.

      (2) Nothing is known about the susceptibility of the strains to other novel antimicrobial agents. Colistin has a limited role in the treatment of gram-negative infections, and although it can be used sometimes in combination, it is not clear why it would be combined with two other nephrotoxic agents and how this could have relevance in a clinical setting.

      (3) It is not clear whether their transcriptomics analysis should at least be carried out in duplicate for reasons of being able to assess reproducibility. It is also not clear why the samples were incubated for 6 hours - no discussion is presented on the selection of a time point for this.

      (4) Discussion is lacking on the reproducibility and selection of details for the methodology.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors have studied the combination of three compounds, artesunate, EDTA, and colistin, to improve the activity of colistin, rather than the two-compound combination of artesunate and colistin, which is only weakly active. The three compounds appeared to possess activity against mcr-1 Salmonella both in vitro and in vivo.

      Strengths:

      A strong panel of experiments has been carried out.

      Weaknesses:

      (1) Number of strains tested.

      (2) Lack of data on cytotoxicity.

    1. eLife assessment

      This study presents valuable findings on how mitochondrial transplantation affects post-cardiac arrest myocardial dysfunction (PAMD). The authors demonstrate that mitochondrial transplantation enhances cardiac function and increases survival rates after the return of spontaneous circulation (ROSC). While the findings are promising, the organization of the paper, along with the analysis and interpretation of the results, is inadequate and needs revision.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, the authors investigate the effect of mitochondrial transplantation on post-cardiac arrest myocardial dysfunction (PAMD), which is associated with mitochondrial dysfunction. The authors demonstrate that mitochondrial transplantation enhances cardiac function and increases survival rates after the return of spontaneous circulation (ROSC). Mechanistically, they found that myocardial tissues with transplanted mitochondria exhibit increased mitochondrial complex activity, higher ATP levels, reduced cardiomyocyte apoptosis, and lower myocardial oxidative stress post-ROSC.

      Strengths:

      Previous studies have reported that mitochondrial transplantation can improve myocardial recovery after regional ischemia, but its potential for treating myocardial injury following cardiac arrest has not yet been tested. Therefore, the findings are somewhat novel. Remarkably, the increased survival in the mitochondria-treated group post-ROSC is very promising and highlights its translational potential.

      Weaknesses:

      The organization of the paper, along with the analysis and interpretation of the results, requires significant revision.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors address an important question in cardiovascular science that is very topical. The use of exogenous mitochondrial transplantation is assessed after cardiac arrest to determine if these exogenous mitochondria can enhance cardiac function. Given the role of mitochondria in the energy expenditure of the heart, this is an important question to study.

      Strengths:

      The strength lies mainly in the hypothesis being addressed as it is highly relevant in the quest for more strategies to enhance cardiac function.

      Weaknesses:

      Further refinement is needed in experimental details and transparency. Additional experiments also need to be performed, such as a Seahorse assay of oxygen consumption. Improvements in the text and in the figures are needed; specific comments are given in our recommendations to the authors.

    4. Reviewer #3 (Public Review):

      In this manuscript titled "Transplantation of exogenous mitochondria mitigates myocardial dysfunction after cardiac arrest", Zhen Wang et al. report that exogenous mitochondrial transplantation can enhance myocardial function and survival rates. It limits mitochondrial morphology impairment, boosts complexes II and IV activity, and increases ATP levels. Additionally, mitochondrial therapy reduces oxidative stress, lessens myocardial injury, and improves PAMD after cardiopulmonary resuscitation. The results of this manuscript clearly demonstrate that mitochondrial transplantation can effectively improve PAMD after cardiopulmonary resuscitation, highlighting its significant scientific and clinical value. The findings shown in this manuscript are interesting to the readers. However, further experiments are needed to confirm this conclusion. In addition, the results should be rewritten to describe and discuss the relevant data in detail.

      Major comments:

      (1) Can isolated mitochondria be transported to cultured cardiomyocytes, such as H9C2 cells, in vitro?

      (2) The description of results in the manuscript is too simple. It lacks detail on the rationale behind the experiments and the significance of the data.

      (3) The authors demonstrate that mitochondrial transplantation reduces cardiomyocyte apoptosis. Therefore, Western blot analysis of apoptosis-related caspases could be provided for further confirmation.

      (4) Do donor mitochondria fuse with recipient mitochondria? Relevant experiments and data should be provided to address this question.

      (5) In Figure 5A, the histograms are not labeled with the specific experimental groups.

  2. Aug 2024
    1. eLife assessment

      This study provides an important advance in the molecular understanding of the lipopolysaccharide export mechanism and machinery in bacteria. By using advanced spectroscopy approaches, the experiments provide convincing biophysical support for the dynamic behavior of the multisubunit Lpt transport system. This work has implications for understanding bacterial cell envelope biogenesis and developing drugs that target Gram negative pathogens.

    2. Reviewer #1 (Public Review):

      Summary:

      The current manuscript uses electron spin resonance spectroscopy to understand how the dynamic behavior and conformational heterogeneity of the LPS transport system change during substrate transport and in response to the membrane, bound nucleotide (or transition state analog) and accessory subunits. The study builds on prior structural studies to expand our molecular understanding of this highly significant bacterial transport system.

      Strengths:

      This series of well-designed and well-executed experiments provide new mechanistic insights into the dynamic behavior of the LPS transport system. Notable new insights provided by this study include its indication of the spatial organization of the LptC domain, which was poorly resolved in structures, and how the LptC domain modulates the dynamic behavior of the gate through which lipids access the binding site. In addition, a mass spectrometry approach designed to examine LPS binding at different stages in the nucleotide-dependent conformational cycle provides insight into the order of operations of LPS binding and transport.

    3. Reviewer #2 (Public Review):

      Lipopolysaccharide (LPS) is a major component of the outer membrane of Gram-negative bacteria and plays a critical role in bacterial virulence. The LPS export mechanism is a potential target for new antibiotics. Inhibiting this process can render bacteria more susceptible to the host immune system or other antibacterial agents. Given the rise of antibiotic-resistant bacteria, novel targets are urgently needed. The seven LPS transport (Lpt) proteins, A-G, move LPS from the inner to the outer membrane. This study investigated the conformational changes in the LptB2FG-LptC complex using site-directed spin labeling (SDSL) electron paramagnetic resonance (EPR) spectroscopy, revealing how ATP binding and hydrolysis affect the LptF β-jellyroll domain and lateral gates. The findings highlight the role of LptC in regulating LPS entry, ensuring efficient and unidirectional transport across the periplasm.

      The β-jellyrolls are not fully resolved in the vanadate-trapped structure of LptB2FG and LptB2FGC. Therefore, the current study provides valuable information on the functional dynamics of these periplasmic domains, their interactions, and their roles in the unidirectional transport of LPS. Additionally, the dynamic perspective of the lateral gates in LptFG in the presence and absence of LptC is another strength of this study. Moreover, at least in detergent samples, more comprehensive intermediates of the ATP turnover cycle are studied than in the available structures, providing crucial missing mechanistic details.

      Other major strengths of the study include high-quality DEER/PELDOR distance measurements in both detergent and proteoliposomes, the latter providing valuable dynamics information in the lipid environment. The proteoliposome study is crucial since the previous structural study (Li, Orlando & Liao 2019) was done in rather small-diameter nanodiscs, which might affect the overall dynamics of the complex. It would have been beneficial if the investigators had reconstituted the complex in lipid nanodiscs with the same composition as proteoliposomes. The mixed lipid/detergent micelles provide an alternative. It seems the ATPase activity of the protein complex is much lower in detergent compared with lipid nanodiscs (Li, Orlando & Liao 2019). It is unclear how ATPase activity in proteoliposomes compares to that in detergent micelles.

      Additionally, from previous structural studies and the mass spectrometry data presented here, LPS co-purifies and is already bound to the complex, thus the Apo state may represent the LPS-bound state without nucleotides.

    4. Reviewer #3 (Public Review):

      Summary:

      The manuscript by Dajka and co-workers reports the application of a biophysical approach to analyse the dynamics of the LptB2FG-C ABC transporter, involved in LPS transport across the cell envelope in Escherichia coli. LptB2FG-C belongs to a new class of ABC transporters (type VI) and is essential and conserved in several Gram-negative pathogens. Since LPS is the major component of the outer membrane of the Gram-negative cell and is responsible for the low permeability of this membrane to several antibiotics, a deep understanding of the mechanism and function of the LptB2FG-C transporter is crucial for the development of new drugs targeting Gram-negative pathogens.

      Several structural studies have been published so far on the LptB2FG-C transporter, disclosing important aspects of the transport mechanism; nevertheless, lack of resolution of some regions of the individual proteins as well as the dynamic nature of the transport mechanism per se (e.g. the insertion and removal of the TM helix of LptC from the TMDs of the transporter during the LPS transport cycle) has greatly limited the understanding of the mechanism that couples ATP binding and hydrolysis with LPS transport. This knowledge gap could be filled by applying an approach that allows the analysis of dynamic processes. The DEER/PELDOR technique applied in this work fits well with this requirement.

      Strengths:

      In this study the authors provide some new pieces of information on the LptB2FG-C function and the role of LptC in the transporter using a technique that allowed them to appreciate missing intermediate conformations adopted by the proteins during the transport cycle.

      The work is timely and well-conceived. The conclusions of the manuscript are supported by solid data and allow the authors to postulate a dynamic model for the mechanism of translocation of LPS across the inner membrane by the LptB2FGC complex.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1:

      Summary:

      The current manuscript uses electron spin resonance spectroscopy to understand how the dynamic behavior and conformational heterogeneity of the LPS transport system change during substrate transport and in response to the membrane, bound nucleotide (or transition state analog), and accessory subunits. The study builds on prior structural studies to expand our molecular understanding of this highly significant bacterial transport system. 

      Strengths:

      This series of well-designed and well-executed experiments provides new mechanistic insights into the dynamic behavior of the LPS transport system. Notable new insights provided by this study include its indication of the spatial organization of the LptC domain, which was poorly resolved in structures, and how the LptC domain modulates the dynamic behavior of the gate through which lipids access the binding site. In addition, a mass spectrometry approach designed to examine LPS binding at different stages in the nucleotide-dependent conformational cycle provides insight into the order of operations of LPS binding and transport. 

      We thank the reviewer for the very positive comments and highlighting the important findings from our study.

      Reviewer #2 (Public Review):

      Lipopolysaccharide (LPS) is a major component of the outer membrane of Gram-negative bacteria and plays a critical role in bacterial virulence. The LPS export mechanism is a potential target for new antibiotics. Inhibiting this process can render bacteria more susceptible to the host immune system or other antibacterial agents. Given the rise of antibiotic-resistant bacteria, novel targets are urgently needed. The seven LPS transport (Lpt) proteins, A-G, move LPS from the inner to the outer membrane. This study investigated the conformational changes in the LptB2FG-LptC complex using site-directed spin labeling (SDSL) electron paramagnetic resonance (EPR) spectroscopy, revealing how ATP binding and hydrolysis affect the LptF β-jellyroll domain and lateral gates. The findings highlight the role of LptC in regulating LPS entry, ensuring efficient and unidirectional transport across the periplasm.

      The β-jellyrolls are not fully resolved in the vanadate-trapped structure of LptB2FG and LptB2FGC. Therefore, the current study provides valuable information on the functional dynamics of these periplasmic domains, their interactions, and their roles in the unidirectional transport of LPS. Additionally, the dynamic perspective of the lateral gates in LptFG in the presence and absence of LptC is another strength of this study. Moreover, at least in detergent samples, more comprehensive intermediates of the ATP turnover cycle are studied than in the available structures, providing crucial missing mechanistic details. 

      We thank the reviewer for highlighting our major findings!

      Other major strengths of the study include high-quality DEER distance measurements in both detergent and proteoliposomes, the latter providing valuable dynamics information in the lipid environment. However, lipid composition is not mentioned. The proteoliposome study is crucial since the previous structural study (Li, Orlando & Liao 2019) was done in rather small-diameter nanodiscs, which might affect the overall dynamics of the complex. It would have been beneficial if the investigators had reconstituted the complex in lipid nanodiscs with the same composition as proteoliposomes. The mixed lipid/detergent micelles provide an alternative. It seems the ATPase activity of the protein complex is much lower in detergent compared with lipid nanodiscs (Li, Orlando & Liao 2019). In the current study, ATPase activity in proteoliposomes is not provided. Also, the reviewer assumes cysteine-less (CL) constructs of the complex components were utilized. The ATPase assay on CL complex is not presented. Additionally, from previous structural studies and the mass spectrometry data presented here, LPS co-purifies and is already bound to the complex, thus the Apo state may represent the LPS-bound state without nucleotides. 

      The liposomes are made from E. coli polar lipid extract, which we have now added to the Materials and Methods section. We could not yet perform the investigations in nanodiscs, which is one of our aims for the future. The ATPase activity is lower in micelles, and the reviewer is correct that we did not measure or compare ATPase activity in proteoliposomes. The data denoted as wild-type (WT, Figure S4) correspond to the cysteine-less (CL) variant, which is now corrected in the supporting information. As the reviewer commented, the mass spectrometry data reveal bound LPS in the apo state. However, as seen from our results, the ADP-Mg2+ state is similar to the apo state; thus, in the cellular environment LPS may bind to this state as well.

      The selection of sites to probe lateral gate 2, which forms the main LPS entry site, may pose an issue. Although the authors provide justification based on the available structures, one site (position 325 in LptF) is located on a flexible loop, and position 52 in LptG is on the neighboring transmembrane helix, separated by a potentially flexible loop from the gating TM1. These labeling sites could exhibit significant local dynamics, resulting in a broader distribution of distances and potentially masking the gating-related conformational changes. 

      Position 52 in LptG is located at the beginning of the neighboring transmembrane helix. As we have discussed in the manuscript, position 325 in LptF is located on a short loop connected to TM5. In the structures, this loop shows a very similar orientation (Figure S6). Further, the observed heterogeneity for lateral gate-2 is considerably modulated into distinct conformation(s) upon LptC binding (Figure 6D-E). This would not be the case if this loop possessed any independent flexibility. Confirming these observations, the room temperature continuous wave ESR spectra revealed the least flexibility for this spin pair (Figure S5, S7). In view of the reasons and observations detailed above, we conclude that the local flexibility at the labelled sites might not make any significant contribution to the broad distribution observed at this gate in LptB2FG (Figure 4).

      Reviewer #3 (Public Review):

      Summary: 

      The manuscript by Dajka and co-workers reports the application of a biophysical approach to analyse the dynamics of the LptB2FG-C ABC transporter, involved in LPS transport across the cell envelope in Escherichia coli. LptB2FG-C belongs to a new class of ABC transporters (type VI) and is essential and conserved in several Gram-negative pathogens. Since LPS is the major component of the outer membrane of the Gram-negative cell and is responsible for the low permeability of this membrane to several antibiotics, a deep understanding of the mechanism and function of the LptB2FG-C transporter is crucial for the development of new drugs targeting Gram-negative pathogens. 

      Several structural studies have been published so far on the LptB2FG-C transporter, disclosing important aspects of the transport mechanism; nevertheless, lack of resolution of some regions of the individual proteins as well as the dynamic nature of the transport mechanism per se (e.g. the insertion and removal of the TM helix of LptC from the TMDs of the transporter during the LPS transport cycle) has greatly limited the understanding of the mechanism that couples ATP binding and hydrolysis with LPS transport. This knowledge gap could be filled by applying an approach that allows the analysis of dynamic processes. The DEER/PELDOR technique applied in this work fits well with this requirement. 

      Strengths: 

      In this study, the authors provide some new pieces of information on the LptB2FG-C function and the role of LptC in the transporter. Notably, they show that: 

      - There is high heterogeneity in the conformational states of the entry gate of LPS in the transporter (gate-2) that are reduced by the insertion of LptC, and the heterogeneity observed is not altered by ATP binding or hydrolysis (as expected since LPS entry is ATP-independent). 

      - ATP binding induces an allosteric opening of LptF β-jellyroll domain that allows for LPS passage to the β-jellyroll of LptC, which is stably associated with the β-jellyroll of LptF throughout the cycle. 

      - The β-jellyroll of LptG is highly flexible, indicating an involvement in the LPS transport cycle. 

      The manuscript is timely and overall clear. 

      We thank the reviewer for the positive comments and for highlighting our findings and the strength of DEER/PELDOR spectroscopy for characterizing the dynamic aspects of the LPS transport system.

      Weaknesses:

      I list my concerns below and provide suggestions that, in my opinion, should be addressed to reinforce the findings of this study. 

      (1) Protein complex controls: the authors assess the ATPase activity of the spin-labelled variants of their protein complexes to rule out the possibility that engineering the proteins to enable spin labelling could affect their functionality (Figure S4). It has been reported that the association of LptC to the LptB2FG complex inhibits its ATPase activity. However, in the ATPase assay data shown in Figure S4, the inhibitory effect of the LptC TM is not visible (please compare LptB2FG F-A45C G-I335C and F-L325C G-A52C with and without LptC). This may lead one to suspect that the regulatory function of LptC is missing in the LptC-containing complexes used in this work. I suggest the authors include wt LptB2FGC in the assay to compare the ATPase activity of this complex with wt LptB2FG. The published inhibitory effect of TM LptC has been observed in proteoliposomes. Since it is not clear from the paper if the ATPase assay in Figure 4 has been conducted in DDM or proteoliposomes, the lack of inhibitory effect could be due to the assay conditions. A comparative test could answer this question. 

      We could not observe the inhibitory effect of LptC on the ATPase activity of LptB2FG. As the reviewer pointed out, the primary reason is that we performed the assays in detergent micelles and not in proteoliposomes. For this reason, a comparison of the activity between (cysteine-less) LptB2FG and LptB2FG-C as the reviewer suggested would not be informative. As this information is not directly relevant for our current interpretations, we plan to perform those experiments in liposomes in the near future.

      (2) Figure 2: NBD closure upon ATP binding to LptB2FG is convincingly demonstrated both in DDM micelles and proteoliposomes, validating the experimental system. However, since under physiological conditions, ATP binding should take place before the displacement of the TM of LptC (Wilson and Ruiz, Mol microbiol 2022), I suggest the authors carry out the experiments with LptC-containing complexes to investigate conformational changes (if any) that are triggered when ATP binding occurs before the TM displacement.  

      We thank the reviewer for the suggestion. These experiments are on our to-do list and will be performed in the near future.

      (3) Proteoliposomes: in the experiments shown in Figures 3 and 4, unlike those in Figure 2, measurements in proteoliposomes give different results from the experiments in DDM, showing higher heterogeneity. Could this be related to the presence (or absence) of LPS in liposomes? It is not mentioned in the materials and methods section whether LPS is present. Could the authors please discuss this? 

      We thank the reviewer for raising this interesting point. The liposomes are made from E. coli polar lipid extract. In the polar lipid extract, phosphatidylethanolamine (PE) is the predominant lipid component with minor amounts of phosphatidylglycerol (PG) and cardiolipin. Thus, the differences in the heterogeneity we observed in proteoliposomes might not be due to the presence of LPS. We added a short description of this aspect in the ‘Discussion’ part.

      (4) The authors show large conformational heterogeneity in gate-2 (using the spin-labelled pair F-L325R1-G-A52R1) and suggest that deviation from the corresponding simulations could be due to the need for enhanced dynamics to allow for gate interaction with LPS or LptC. The effect of LptC is probed in the experiments shown in Figure 6, but I suggest the authors add LPS to the complexes to evaluate the possible stabilizing effect of LPS on the conformations shown in Figure 4. 

      This indeed is an important experiment, which we plan to do in the near future.

      (5) Figure 6: the measurement of lateral gate 1 and 2 dynamics in the LptC-containing complexes clearly supports the hypothesis, proposed based on the available structures, that TM LptC dissociates from LptB2FG upon ATP binding. However, direct evidence of this movement is still missing. Would it be possible to monitor the dynamics of the TM LptC by directly labelling this protein domain? This would give a conclusive demonstration of the displacement during the ATPase cycle. 

      Yes, it should be possible to label LptC and monitor its position with respect to LptF or LptG. These experiments are in progress in our laboratory. 

      (6) LPS release assay: Figure 6 panels H-I-J show the MS spectra relative to LPS-bound and free proteins obtained from wt LptB2FG upon ATP binding and ATP hydrolysis conditions. From these spectra the authors conclude that LPS is completely released only upon ATP hydrolysis. However, the current model predicts that LPS release into the Lpt bridge made by LptC-A-D is triggered by ATP binding. For this reason, I suggest the authors assess LPS release also from the LptB2FGC complex where, in the absence of LptA, LPS would be expected to be mostly retained by the complex under the same conditions. 

      These indeed are exciting experiments. LPS binding and release by LptB2FGC is in progress in our laboratories.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Page 2 typo: apo-sate should be apo-state 

      Thank you! We corrected the typo.

      Can the authors clarify whether LPS is co-purified with the protein? Does it remain bound throughout the liposome reconstitution process? 

      Our mass spectrometry data show that LPS is co-purified with LptB2FG in micelles. However, we cannot yet verify the presence of bound LPS after reconstitution into proteoliposomes. We added a sentence in the last paragraph before Discussion as ‘Thus, LPS is co-purified with LptB2FG in micelles.’

      Reviewer #2 (Recommendations for The Authors): 

      Several points require clarification: 

      (1) The reviewer would have benefited from access to the raw DEER traces. For instance, in Figure 4, the change in the raw data appears subtle. The differences between the Apo and vanadate-trapped states in b-DDM might be related to a lower signal-to-noise ratio in the Apo state. 

      We would be happy to share the raw DEER data upon request. The analysis is performed with the primary data and also takes the noise level into account when calculating the confidence interval. Therefore, the distances with the 95% confidence interval are reliable to the extent that they are presented.  

      (2) The panel labels in Figures 2-4 do not match the legends. 

      Thank you! We corrected them.

      (3) In Figure 2G, the authors state, "Overall, the ATP-induced closure as observed in micelles (and the structures) is maintained in the native-like lipid bilayers for the NBDs." This statement is technically incorrect since the vanadate-trapped state is not equivalent to the ATP+EDTA "ATP binding" state, which was not tested in proteoliposomes (PLS). The authors should have tested this condition for a few mutants in proteoliposomes. They should revise the manuscript to reflect this or provide evidence that the ATP+EDTA state is similar to the vanadate-trapped state in PLS. 

      We corrected the sentence as ‘Overall, the nucleotide-induced closure as observed in micelles (and the structures) is maintained in the native-like lipid bilayers for the NBDs.’

      (4) The mutant F-L325R1_G-A52R1 is not optimal for probing gate 2. Specifically, position 325 in LptF is highly flexible, as indicated by the very broad distance distributions in Figure 4, and may hinder probing the associated conformational changes in this gate. Comparing the cryo-EM structures of this loop under different conditions (Figure S6) does not provide solid evidence for the lack of flexibility. 

      Position 52 in LptG is located at the beginning of the neighboring transmembrane helix. As we have discussed in the manuscript, position 325 in LptF is located on a short loop connected to TM5. In the structures, this loop shows a very similar orientation (Figure S6). Further, the observed heterogeneity for the lateral gate-2 is considerably modulated into distinct conformation(s) upon LptC binding (Figure 6D-E). This would not be the case if this loop possessed any independent flexibility. Confirming these observations, the room temperature continuous wave ESR spectra revealed the least flexibility for this spin pair (Figure S5, S7). In view of the reasons and observations detailed above, we conclude that the local flexibility of the labelled sites might not make any significant contribution to the broad distribution observed at this gate in LptB2FG (Figure 4). 

      (5) Regarding Figure 4B, the authors state, "In the vanadate-trapped and ATP samples, the major population is centered at 2 nm (which corresponds to the simulation on the vanadate trapped structure)". While the shift to shorter distances aligns with the structures, the average distance from the simulation is around 3 nm and does not correspond closely to the DEER distances of 2 nm. 

      Thank you for noting this point. We corrected the sentence as ‘In the vanadate-trapped and ATP samples, the major population is centred at 2 nm (which is closer to the simulation on the vanadate-trapped structure).’

      (6) Regarding Figure 4D, the authors state, "Unlike the lateral gate-1 (and the NBDs), ADP-Mg2+ also induced a similar shift in the distance distribution." The reviewer believes that even without interaction with LptC, an equilibrium exists between two states in gate-2, and ATP binding or vanadate-trapping shifts the equilibrium to a shorter-distance population. Additionally, if the signal-to-noise ratio of the Apo state were similar to that of the ADP-Mg2+ state, similar distance distributions would have been observed for the Apo state. 

      We thank the reviewer for bringing out this excellent point. We thoroughly modified the corresponding section as ‘ADP-Mg2+ also gave a broad distribution comparable to the apo-state. Thus, in the apo-state this gate appears to exist in an equilibrium between the two conformations observed from the corresponding structures. ATP binding or vanadate-trapping shifts the equilibrium towards the collapsed conformation.’

      (7) Defining the conformational dynamics of the b-jellyroll domains is one of the major strengths of this study. The LptF and LptG b-jellyroll domains exhibit high flexibility in detergent micelles. Unfortunately, none of the experiments were repeated in proteoliposomes to determine if this flexibility persists in a lipid environment. 

      As is conceivable, it is beyond the scope of the current study to repeat all the measurements in liposomes. We are currently extending those investigations to liposomes and will be able to provide more insights in the near future.

      (8) Regarding Figure 6G, the authors claim, "Distances corresponding to the apo state are present possibly due to an incomplete vanadate trapping for this sample." It is unlikely that vanadate trapping would be incomplete for just one sample. A repeat experiment is recommended. 

      We will provide an update on this point in due time.

      (9) Regarding the structural dynamics of the lateral gates, detergent micelles, and liposomes are vastly different environments. It is challenging to reach a consensus model based on data mostly derived from detergent micelles and only a few from proteoliposomes. 

      The observations in PLS are qualitatively similar to the micellar sample for the investigated positions (please see the first paragraph in “Discussion”). Further, our observations are in agreement with previous structural and biochemical data and further extend the mechanism in a coherent manner. 

      Reviewer #3 (Recommendations For The Authors):

      Minor comments 

      (1) Figure legends: There are several mismatches between panel nomenclature and the corresponding descriptions in the legends. Please check the correspondence between panel identification and descriptions throughout the manuscript (for example, F-G and H-J in Figure 2; and I and H in Figure 3). 

      Thank you! We corrected them.

      - Figure 6 legend: asterisk is in panel D and not C. 

      Corrected

      - Panels E and F are not mentioned. Moreover, the spectra for vanadate trapped conformation of LptF219-LptC104 have not been given a letter. 

      Corrected

      - A description of the different colors in the "Distance r" axis should be added to figure 2, 3, and 4 legends. 

      Corrected

      - Please indicate the meaning of the black arrows in figure legends. 

      Corrected

      (2) To improve data comprehension by the readers, the authors should indicate the relevant spin-labelled pairs at the top of Figures 2, 3, and 4, as done for Figures 5 and 6. 

      Done

      (3) Reference 56 is cited incorrectly in the reference list and refers to a study employing reconstituted LptB2FG complexes rather than isolated β-jellyroll domains. 

      Corrected

      (4) Figure 3: How do the authors explain the evidence that ATP binding influences gate 1 conformational flexibility only in DDM micelles with respect of PLS? Is this something related to the release of LPS from the complex in different environments? 

      We do not know whether this difference is related to LPS release. Therefore, we interpreted it more generally as an effect of the membrane environment.

      (5) The initial sentence of the discussion looks somewhat incomplete, please correct it. 

      Done

      (6) To improve the readability of the paper, it could be useful to better focus the topic of the headings of the result paragraphs concerning the analysis of the individual lateral gates (for example, by indicating the name of the gate in the headings).

      Done

    1. eLife assessment

      This valuable study presents the development of a single turnover stopped-flow fluorescence experiment to study the kinetics of substrate unfolding and translocation by the bacterial ClpB disaggregase. Using non-physiological nucleotides to bypass the physiological regulation mechanism of ClpB, the authors convincingly show that the ClpB disaggregase is a processive motor with a slow unfolding step preceding rapid translocation. The results of this analysis are of value for future mechanistic studies on energy-dependent unfolding, degradation, and disaggregation molecular machines.

    2. Reviewer #1 (Public Review):

      In this study, the Authors used a stopped-flow method to investigate the kinetics of substrate translocation through the channel in hexameric ClpB, an ATP-dependent bacterial protein disaggregase. They engineered a series of polypeptides with the N-terminal RepA ClpB-targeting sequence followed by a variable number of folded titin domains. The Authors detected translocation of the substrate polypeptides by observing the enhancement of fluorescence from a probe located at the substrate's C-terminus. The total time of the substrates' translocation correlated with their lengths, which allowed the Authors to determine the number of residues translocated by ClpB per unit time.

      Strengths:

      This study confirms a previously proposed model of processive translocation of polypeptides through the channel in ClpB. The novelty of this work is in a clever design of a series of kinetic experiments with an engineered substrate that includes stably folded domains. This approach produced a quantitative description of the reaction rates and kinetic step sizes. Another valuable aspect is that the method can be used for other translocases from the AAA+ family to characterize their mechanism of substrate processing.

      Weaknesses:

      The main limitation of the study is in using a single non-physiological substrate of ClpB, which does not replicate physical properties of the aggregated cellular proteins and includes a non-physiological ClpB-targeting sequence. Another limitation is in the use of ATPgammaS to stimulate the substrate processing. It is not clear how relevant the results are to the ClpB function in living cells with ATP as the source of energy, a multitude of various aggregated substrates without targeting sequences that need ClpB's assistance, and in the presence of the co-chaperones.

      Evidence that ATPgammaS without ATP can provide sufficient energy for substrate translocation and unfolding is missing in the paper because the rate of phosphate release from ATPgammaS has not been determined. Thus, it is not clear if the observed translocation is linked to an actual chemical energy input or is a result of a diffusion-driven ratchet mediated by a substrate-trapping ClpB conformation obtained in the presence of ATPgammaS.

    3. Reviewer #2 (Public Review):

      Summary:

      The current work by Banwait et al. reports a fluorescence-based single-turnover method based on protein-induced fluorescence enhancement (PIFE) to show that ClpB is a processive motor. The paper is a crucial finding as there has been ambiguity on whether ClpB is a processive or non-processive motor. Optical tweezers-based single-molecule studies have shown that ClpB is a processive motor, whereas previous studies from the same group hypothesized it to be a non-processive motor. As co-chaperones are needed for the motor activity of ClpB, to isolate the activity of ClpB, they have used a 1:1 ratio of ATP and ATPgS, where the enzyme is active even in the absence of its co-chaperones, as previously observed. A sequential mixing stopped-flow protocol was developed, and the unfolding and translocation of RepA-TitinX (X = 1, 2, 3 repeats) was monitored by measuring the fluorescence intensity over time of Alexa Fluor 555 that was labelled at the C-terminal cysteine. The observations were a lag time, followed by a gradual increase in fluorescence due to PIFE, and then a decrease in fluorescence plausibly due to the dissociation from the substrate allowing it to refold. The authors observed that the peak time depends on the substrate length, indicating the processive nature of ClpB. In addition, the lag and peak times depend on the pre-incubation time with ATPgS, indicating that the enzyme translocates on the substrates even with just ATPgS without the addition of ATP, which is plausible due to the slow hydrolysis of ATPgS. From the plot of substrate length vs peak time, the authors calculated the rate of unfolding and translocation to be ~0.1 aa s-1 in the presence of ~1 mM ATPgS, increasing to 1 aa s-1 in the presence of 1:1 ATP and ATPgS. The authors have further performed experiments at 3:1 ATP and ATPgS concentrations and observed a ~5-fold increase in the translocation rates, as expected due to faster hydrolysis of ATP by ClpB, reconfirming that processivity is largely ATP-driven. Further, the authors model their results as multiple sequential unfolding steps, determining the rate of unfolding and the number of amino acids unfolded during each step. Overall, the study uses a novel method to reconfirm the processive nature of ClpB.

      Strengths:

      (1) Previous studies on understanding the processivity of ClpB have primarily focused on unfolded or disordered proteins; this study provides new insights into our understanding of the processing of folded proteins by ClpB. They have cleverly used RepA as a recognition sequence to understand the unfolding of titin-I27 folded domains.

      (2) The method developed can be applied to many disaggregating enzymes and has broader significance.

      (3) The data from various experiments are consistent with each other, indicating the reproducibility of the data. For example, the rates of translocation in the presence of ATPgS, ~0.1 aa s-1, from the single mixing experiment and the double mixing experiment are very similar.

      (4) The study convincingly shows that ClpB is a processive motor, which has long been debated, describing its activity in the presence of only ATPgS and a mixture of ATP and ATPgS.

      (5) The discussion part has been written in a way that describes many previous experiments from various groups supporting the processive nature of the enzyme and supports their current study.

      Weaknesses:

      (1) The authors model that the enzyme unfolds the protein sequentially, around 60 aa each time, through multiple steps and translocates rapidly. This contradicts our knowledge of protein unfolding, which is generally cooperative, particularly for titin I27, which is reported to unfold cooperatively or at most through one intermediate during enzymatic unfolding by ClpX and ClpA.

      (2) It is also important to note that the unfolding of titin I27 from the N-terminus (as done in this study) has been reported to be very fast and cannot be the rate-limiting step, as reported earlier (Olivares et al, PNAS, 2017). This contradicts the current model where unfolding is the rate-limiting step, and the translocation is assumed to be many orders faster than unfolding.

      (3) The model assumes the same time constant for all the unfolding steps irrespective of the secondary structural interactions.

      (4) Unlike other single-molecule optical tweezer-based assays, the study cannot distinguish the unfolding and translocation events and assumes that unfolding is the rate-limiting step.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors have devised an elegant stopped flow fluorescence approach to probe the mechanism of action of the Hsp100 protein unfoldase ClpB on an unfolded substrate (RepA) coupled to 1-3 repeats of a folded titin domain. They provide useful new insight into the kinetics of ClpB action. The results support their conclusions for the model setup used.

      Strengths:

      The stopped flow fluorescence method with a variable delay after mixing the reactants is informative, as is the use of variable numbers of folded domains to probe the unfolding steps.

      Weaknesses:

      The setup does not reflect the physiological setting for ClpB action. A mixture of ATP and ATPgammaS is used to activate ClpB without the need for its co-chaperones: Hsp70, Hsp40, and an Hsp70 nucleotide exchange factor. This nucleotide strategy was discovered by Doyle et al (2007) but the mechanism of action is not fully understood. Other authors have used different approaches. As mentioned by the authors, Weibezahn et al used a construct coupled to the ClpA protease to demonstrate translocation. Avellaneda et al used a mutant (Y503D) in the coiled coil regulatory domain to bypass the Hsp70 system. These differences complicate comparisons of rates and step sizes with previous work. It is unclear which results, if any, reflect the in vivo action of ClpB on disassembly of aggregates.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this study, the authors used a stopped-flow method to investigate the kinetics of substrate translocation through the channel in hexameric ClpB, an ATP-dependent bacterial protein disaggregase. They engineered a series of polypeptides with the N-terminal RepA ClpB-targeting sequence followed by a variable number of folded titin domains. The authors detected translocation of the substrate polypeptides by observing the enhancement of fluorescence from a probe located at the substrate's C-terminus. The total time of the substrates' translocation correlated with their lengths, which allowed the authors to determine the number of residues translocated by ClpB per unit time.

      Strengths:

      This study confirms a previously proposed model of processive translocation of polypeptides through the channel in ClpB. The novelty of this work is in the clever design of a series of kinetic experiments with an engineered substrate that includes stably folded domains. This approach produced a quantitative description of the reaction rates and kinetic step sizes. Another valuable aspect is that the method can be used for other translocases from the AAA+ family to characterize their mechanism of substrate processing.

      Weaknesses:

      The main limitation of the study is in using a single non-physiological substrate of ClpB, which does not replicate the physical properties of the aggregated cellular proteins and includes a non-physiological ClpB-targeting sequence. Another limitation is in the use of ATPgammaS to stimulate the substrate processing. It is not clear how relevant the results are to the ClpB function in living cells with ATP as the source of energy, a multitude of various aggregated substrates without targeting sequences that need ClpB's assistance, and in the presence of the co-chaperones.

      Indeed, we agree that our RepA-Titinx substrates are not aggregates but are model, soluble substrates used to reveal information about enzyme-catalyzed protein unfolding and translocation. Our substrates are similar to RepA-GFP and GFP-SsrA used by multiple labs including Wickner, Horwich, Sauer, Baker, Shorter, and Bukau, to name only a few. The fact that “this is what everyone does” does not make the substrates physiological or the most ideal. However, this is the technology we currently have until we and others develop something better. In the meantime, we contend that the results presented here do advance our knowledge of enzyme-catalyzed protein unfolding.

      Part of what this manuscript seeks to accomplish is presenting the development of a single-turnover experiment that reports on processive protein unfolding by AAA+ molecular motors, in this case, ClpB. Importantly, we are treating translocation on an unfolded polypeptide chain and protein unfolding of stably folded proteins as two distinct reactions catalyzed by ClpB. Whether these functions are used to disrupt protein aggregates in vivo remains to be seen.

      We contend that processive ClpB catalyzed protein unfolding has not been rigorously demonstrated prior to our results presented here.  Avellaneda et al mechanically unfolded their substrate before loading ClpB (Avellaneda, Franke, Sunderlikova et al. 2020).  Thus, their experiment represents valuable observations reflecting polypeptide translocation on a pre-unfolded protein.  Our previous work using single-turnover stopped-flow experiments employed unstructured synthetic polypeptides and therefore reflects polypeptide translocation and not protein unfolding (Li, Weaver, Lin et al. 2015).  Weibezahn et al used unstructured substrates in their study with ClpB (BAP/ClpP), and thus their results represent translocation of a pre-unfolded polypeptide and not enzyme catalyzed protein unfolding (Weibezahn, Tessarz, Schlieker et al. 2004). 

      Many studies have reported the use of GFP with tags or RepA-GFP and used the loss of GFP fluorescence to conclude protein unfolding. However, such results do not reveal if ClpB processively and fully translocates the substrate through its axial channel. One cannot rule out, even when trapping with “GroEL trap”, the possibility that ClpB only needs to disrupt some of the fold in GFP before cooperative unfolding occurs, leading to loss of fluorescence. Once the cooperative collapse of the structure occurs and fluorescence is lost, it has not been shown whether ClpB will continue to translocate on the newly unfolded chain or dissociate. In fact, the Bukau group showed that folded YFP remained intact after luciferase was unfolded (Haslberger, Zdanowicz, Brand et al. 2008). Our approach, reported here, yields signal upon arrival of the motor at the C-terminus or within the PIFE distance; thus we can be certain that the motor does arrive at the C-terminus after unfolding up to three tandem repeats of the Titin I27 domain.

      ATPgS is a non-physiological nucleotide analog.  However, ClpB has been shown to exhibit curious behavior in its presence that we and others, as the reviewer acknowledges, do not fully understand (Doyle, Shorter, Zolkiewski et al. 2007).  Some of the experiments reported here are seeking to better understand that fact.  Here we have shown that ATPgS alone will support processive protein unfolding. With this assay in hand, we are now seeking to go forward and address many of the points raised by this reviewer. 

      The authors do not attempt to correlate the kinetic step sizes detected during substrate translocation and unfolding with the substrate's structure, which should be possible, given how extensively the stability and unfolding of the titin I27 domain were studied before. Also, since the substrate contains up to three I27 domains separated with unstructured linkers, it is not clear why all the translocation steps are assumed to occur with the same rate constant.

      We assume that all protein unfolding steps occur with the same rate constant, ku.  We conclude that we are not detecting the translocation rate constant, kt, as our results support a model where kt is much faster than ku.  We do think it makes sense that the same slow step occurs between each cycle of protein unfolding.

      We have added a discussion relating our observations to mechanical unfolding of tandem repeats of Titin I27 from AFM experiments (Oberhauser, Hansma, Carrion-Vazquez and Fernandez 2001). Most interestingly, they report unfolding of Titin I27 in 22 nm steps. Using 0.34 nm per amino acid, this yields ~65 amino acids per unfolding step, which is comparable to our kinetic step-size of 57 to 58 amino acids per step.
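
      The unit conversion behind this comparison can be checked with a short sketch. The 22 nm step, the 0.34 nm per residue spacing, and the 57 to 58 residue kinetic step-size are taken from the response above; the function and variable names are illustrative only and are not part of the authors' analysis.

```python
# Sketch: convert an AFM unfolding step (nm) into residues per step and
# compare it with the kinetic step-size quoted in the response.
# All names here are hypothetical; the three numbers come from the text.

STEP_LENGTH_NM = 22.0    # AFM unfolding step reported for Titin I27
NM_PER_RESIDUE = 0.34    # contour length per amino acid used in the response
KINETIC_STEP_SIZES = (57, 58)  # residues per kinetic step from the response


def residues_per_step(step_nm: float, nm_per_residue: float) -> float:
    """Number of residues spanned by one unfolding step."""
    return step_nm / nm_per_residue


afm_step = residues_per_step(STEP_LENGTH_NM, NM_PER_RESIDUE)
print(f"AFM-derived step: {afm_step:.1f} residues")  # prints 64.7

for kinetic_step in KINETIC_STEP_SIZES:
    # Ratio below 1 means the kinetic step is shorter than the AFM step.
    ratio = kinetic_step / afm_step
    print(f"kinetic step {kinetic_step} aa is {ratio:.2f} of the AFM step")
```

      The two estimates differ by a little over 10%, which is the sense in which the response calls them comparable.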

      Some conclusions presented in the manuscript are speculative:

      The notion that the emission from Alexa Fluor 555 is enhanced when ClpB approaches the substrate's C-terminus needs to be supported experimentally. Also, evidence that ATPgammaS without ATP can provide sufficient energy for substrate translocation and unfolding is missing in the paper.

      In our previous work we used fluorescently labeled 50 amino acid peptides as substrates to examine ClpB binding (Li, Lin and Lucius 2015, Li, Weaver, Lin et al. 2015). In that work we used fluorescein, which exhibits quenching upon ClpB binding. We have added a control experiment in which we attached Alexa Fluor 555 to the 50 amino acid substrate so that we can be assured that ClpB binds close to the fluorophore. As seen in supplemental Fig. 1 A, upon titration with ClpB in the presence of ATPγS, we observe an increase in fluorescence from AF555, consistent with PIFE. Supplemental Fig. 1 B shows that the relative fluorescence enhancement at the peak maximum increases up to ~0.2, i.e. a 20 % increase in fluorescence due to PIFE upon ClpB binding.

      Further, peak time is our hypothesized measure of ClpB’s arrival at the dye. Our results indicate that the peak time linearly increases as a function of an increase in the number of folded TitinI27 repeats in the substrates which also supports the PIFE hypothesis. Finally, others have shown that AF555 exhibits PIFE and we have added those references.

      The evidence that ATPγS alone can support translocation is shown in Fig. 2 and supplemental Figure 1. Fig. 2 and supplemental Figure 1 are two different mixing strategies where we use only ATPγS and no ATP at all. In both cases the time courses are consistent with processive protein unfolding by ClpB with only ATPγS.

      Reviewer #2 (Public Review):

      Summary:

      The current work by Banwait et al. reports a fluorescence-based single-turnover method based on protein-induced fluorescence enhancement (PIFE) to show that ClpB is a processive motor. The paper is a crucial finding as there has been ambiguity on whether ClpB is a processive or non-processive motor. Optical tweezers-based single-molecule studies have shown that ClpB is a processive motor, whereas previous studies from the same group hypothesized it to be a non-processive motor. As co-chaperones are needed for the motor activity of ClpB, to isolate the activity of ClpB, they have used a 1:1 ratio of ATP and ATPγS, where the enzyme is active even in the absence of its co-chaperones, as previously observed. A sequential mixing stopped-flow protocol was developed, and the unfolding and translocation of RepA-TitinX (X = 1, 2, 3 repeats) was monitored by measuring the fluorescence intensity over time of Alexa Fluor 555, which was attached at the C-terminal cysteine. The observations were a lag time, followed by a gradual increase in fluorescence due to PIFE, and then a decrease in fluorescence, plausibly due to dissociation from the substrate allowing it to refold. The authors observed that the peak time depends on the substrate length, indicating the processive nature of ClpB. In addition, the lag and peak times depend on the pre-incubation time with ATPγS, indicating that the enzyme translocates on the substrates even with just ATPγS without the addition of ATP, which is plausible due to the slow hydrolysis of ATPγS. From the plot of substrate length vs peak time, the authors calculated the rate of unfolding and translocation to be ~0.1 aa s⁻¹ in the presence of ~1 mM ATPγS, increasing to 1 aa s⁻¹ in the presence of 1:1 ATP and ATPγS.
      The authors have further performed experiments at 3:1 ATP and ATPγS concentrations and observed a ~5-fold increase in the translocation rates, as expected due to faster hydrolysis of ATP by ClpB, reconfirming that processivity is largely ATP-driven. Further, the authors model their results as multiple sequential unfolding steps, determining the rate of unfolding and the number of amino acids unfolded during each step. Overall, the study uses a novel method to reconfirm the processive nature of ClpB.

      Strengths:

      (1) Previous studies on understanding the processivity of ClpB have primarily focused on unfolded or disordered proteins; this study paves new insights into our understanding of the processing of folded proteins by ClpB. They have cleverly used RepA as a recognition sequence to understand the unfolding of titin-I27 folded domains.

      (2) The method developed can be applied to many disaggregating enzymes and has broader significance.

      (3) The data from various experiments are consistent with each other, indicating the reproducibility of the data. For example, the rates of translocation in the presence of ATPγS, ~0.1 aa s⁻¹, from the single mixing experiment and the double mixing experiment are very similar.

      (4) The study convincingly shows that ClpB is a processive motor, which has long been debated, describing its activity in the presence of only ATPgS and a mixture of ATP and ATPgS.

      (5) The discussion part has been written in a way that describes many previous experiments from various groups supporting the processive nature of the enzyme and supports their current study.

      Weaknesses:

      (1) The authors model that the enzyme unfolds the protein sequentially, around 60 aa each time, through multiple steps and translocates rapidly. This contradicts our knowledge of protein unfolding, which is generally cooperative, particularly for titin I27, which is reported to unfold cooperatively or at most through one intermediate during enzymatic unfolding by ClpX and ClpA.

      We do not think this represents a contradiction. In fact, our observations are in good agreement with mechanical unfolding of tandem repeats of Titin I27 using AFM experiments (Oberhauser, Hansma, Carrion-Vazquez and Fernandez 2001). They showed that tandem repeats of Titin I27 unfolded in steps of ~22 nm. Dividing 22 nm by 0.34 nm per amino acid gives ~65 amino acids per unfolding event. This implies that, under force, ~65 amino acids of folded structure unfold in a single step. This number is comparable to our kinetic step-size of 57 – 58 amino acids per step.

      Importantly, the experiments cited by the reviewer on ClpA and ClpX are actually with ClpAP and ClpXP.  We assert that this is an important distinction as we have shown that ClpA employs a different mechanism than ClpAP (Rajendar and Lucius 2010, Miller, Lin, Li and Lucius 2013, Miller and Lucius 2014).  Thus, ClpA and ClpAP should be treated as different enzymes but, without question, ClpB and ClpA are different enzymes.

      (2) It is also important to note that the unfolding of titin I27 from the N-terminus (as done in this study) has been reported to be very fast and cannot be the rate-limiting step, as reported earlier (Olivares et al, PNAS, 2017). This contradicts the current model where unfolding is the rate-limiting step and translocation is assumed to be many orders of magnitude faster than unfolding.

      Most importantly, the Olivares paper is examining ClpXP and ClpAP catalyzed protein unfolding and translocation and not ClpB.  These are different enzymes.  Additionally, we have shown that ClpAP and ClpA translocate unfolded polypeptides with different rates, rate constants, and kinetic step-sizes indicating that ClpP allosterically impacts the mechanism employed by ClpA to the extent that even ClpA and ClpAP should be considered different enzymes (Rajendar and Lucius 2010, Miller, Lin, Li and Lucius 2013).  We would further assert that there is no reason to assume ClpAP and ClpXP would catalyze protein unfolding using the same mechanism as ClpB as we do not think it should be assumed ClpA and ClpX use the same mechanism as ClpAP and ClpXP, respectively. 

      The Olivares et al paper reports a dwell time preceding protein unfolding of ~0.9 and ~0.8 s for ClpXP and ClpAP, respectively. The inverse of this can be taken as the rate constant for protein unfolding and would yield a rate constant of ~1.2 s⁻¹, which is in good agreement with our observed rate constant of 0.9 – 4.3 s⁻¹ depending on the ATP:ATPγS mixing ratio. For ClpB, we propose that the slow unfolding is then followed by rapid translocation on the unfolded chain, where translocation by ClpB must be much faster than for ClpAP and ClpXP. We think this is a reasonable interpretation of our results and not a contradiction of the results in Olivares et al. Moreover, this is completely consistent with the mechanistic differences that we have reported, using the same single-turnover stopped-flow approach on the same unfolded polypeptide chains with ClpB, ClpA, and ClpAP (Rajendar and Lucius 2010, Miller, Lin, Li and Lucius 2013, Miller and Lucius 2014, Li, Weaver, Lin et al. 2015).
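
      The dwell-time-to-rate-constant conversion used above can be sketched as follows. The dwell times and the 0.9 – 4.3 s⁻¹ range are the values quoted in the response; treating the inverse of a mean dwell time as a first-order rate constant assumes a single rate-limiting step, and the variable names are ours.

```python
# Dwell times preceding unfolding, as quoted from Olivares et al. (seconds).
dwell_s = {"ClpXP": 0.9, "ClpAP": 0.8}

# For a single rate-limiting step, k is approximately 1 / (mean dwell time).
rate_per_s = {enzyme: 1.0 / t for enzyme, t in dwell_s.items()}
mean_rate = sum(rate_per_s.values()) / len(rate_per_s)

# Range of ClpB unfolding rate constants reported in the response (s^-1).
CLPB_RANGE = (0.9, 4.3)
within_range = CLPB_RANGE[0] <= mean_rate <= CLPB_RANGE[1]

for enzyme, k in rate_per_s.items():
    print(f"{enzyme}: k = {k:.2f} per second")
print(f"mean k = {mean_rate:.2f} per second; inside ClpB range: {within_range}")
```

      Both inverse dwell times, and their mean of roughly 1.2 s⁻¹, fall inside the reported ClpB range.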

      (3) The model assumes the same time constant for all the unfolding steps irrespective of the secondary structural interactions.

      Yes, we contend that this is a good assumption because it represents repetition of protein unfolding catalyzed by ClpB upon encountering the same repeating structural elements, i.e. β-sheets.

      (4) Unlike other single-molecule optical tweezer-based assays, the study cannot distinguish the unfolding and translocation events and assumes that unfolding is the rate-limiting step.

      Although we cannot directly distinguish between protein unfolding and translocation, we have logically concluded that protein unfolding is likely rate limiting. This is because the large kinetic step-size represents the collapse of ~60 amino acids of structure between two rate-limiting steps, which we interpret to represent cooperative protein unfolding induced by ClpB. It is not an assumption; it is our current best interpretation of the observations, which we are now seeking to test further.

      Reviewer #3 (Public Review):

      Summary:

      The authors have devised an elegant stopped-flow fluorescence approach to probe the mechanism of action of the Hsp100 protein unfoldase ClpB on an unfolded substrate (RepA) coupled to 1-3 repeats of a folded titin domain. They provide useful new insight into the kinetics of ClpB action. The results support their conclusions for the model setup used.

      Strengths:

      The stopped-flow fluorescence method with a variable delay after mixing the reactants is informative, as is the use of variable numbers of folded domains to probe the unfolding steps.

      Weaknesses:

      The setup does not reflect the physiological setting for ClpB action. A mixture of ATP and ATPgammaS is used to activate ClpB without the need for its co-chaperones: Hsp70, Hsp40, and an Hsp70 nucleotide exchange factor. This nucleotide strategy was discovered by Doyle et al (2007) but the mechanism of action is not fully understood. Other authors have used different approaches. As mentioned by the authors, Weibezahn et al used a construct coupled to the ClpP protease to demonstrate translocation. Avellaneda et al used a mutant (Y503D) in the coiled-coil regulatory domain to bypass the Hsp70 system. These differences complicate comparisons of rates and step sizes with previous work. It is unclear which results, if any, reflect the in vivo action of ClpB on the disassembly of aggregates.

      We agree with the reviewer: several strategies have been employed to bypass the need for Hsp70/40 or KJE to simplify in vitro experiments. Here we have developed a first-of-its-kind transient state kinetics approach that can be used to examine processive protein unfolding. We now seek to go forward with examining the mechanisms of hyperactive mutants, like Y503D, and to add the co-chaperones so that we can address the limitations articulated by the reviewer. In fact, we have already begun adding DnaK to the reaction and found that DnaK induced ClpB to release the polypeptide chain (Durie, Duran and Lucius 2018). However, the sequential mixing strategy developed here was needed to go forward with examining the impact of co-chaperones.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Line 1: I recommend changing the title of the paper to remove the terms that are not clearly defined in the text: "robust" and "processive". What are the Authors' criteria for describing a molecular machine as "robust" vs. "not robust"? A definition of processivity is given in equation 2, but its value for ClpB is not reported in the text, and the criteria for classifying a machine as "processive" vs. "non-processive" are not included. Besides, the Authors have previously reported that ClpB is non-processive (Biochem. J., 2015), so it is now clear that a more nuanced terminology should be applied to this protein. Also, Escherichia coli should be fully spelled out in the title.

      The title has been changed. We have removed “robust” as we agree with the reviewer that there is no way to quantify “robust”. However, we have kept “processive” and have added to the discussion a calculation of processivity, since we can quantify processivity. Importantly, the unstructured substrates used in our previous studies represent translocation and not protein unfolding. Here, on folded substrates, we detect rate-limiting protein unfolding followed by rapid translocation. Thus, we report a lower bound on protein unfolding processivity of 362 amino acids.

      Line 20: The comment about mitochondrial SKD3 should be removed. SKD3, like ClpB, belongs to the AAA+ family, and it is simply a coincidence that the original study that discovered SKD3 termed it an Hsp100 homolog. The similarity between SKD3 and ClpB is limited to the AAA+ module, so there are many other metazoan ATPases, besides SKD3, that could be called homologs of ClpB, including mitochondrial ClpX, ER-localized torsins, p97, etc.

      Removed.

      Lines 133-139. Contrary to what the authors state, it is not clear that the "lag-phase" becomes significantly shorter for subsequent mixing experiments (Figure 1E) perhaps except for the last one (2070 s). It is clear, however, that the emission enhancement becomes stronger for later mixes. This effect should be discussed and explained, as it suggests that the pre-equilibrations shorter than ~2000 sec do not produce saturation of ClpB binding to the substrate.

      We have added supplemental figure 2, which represents a zoom into the lag region. This better illustrates what we were seeing but did not clearly show to the reader. In addition, we address all three changes in the time courses, i.e. the extent of the lag, the change in peak position, and the change in peak height.

      Line 175. The hydrolysis rate of ATPgammaS in the presence of ClpB should be measured and compared to the hydrolysis rate with ATP/ATPgammaS to check if the ratio of those rates agrees with the ratio of the translocation rates. These experiments should be performed with and without the RepA-titin substrate, which could reveal an important linkage between the ATPase engine and substrate translocation. These experiments are essential to support the claim of substrate translocation and unfolding with ATPgammaS as the sole energy source.

      The time courses shown in figure 2 and supplemental Figure 1 are collected with only ATPγS and no ATP. The time courses show a clear increase in lag and the appearance of a peak with an increasing number of tandem repeats of titin domains. We do not see an alternative explanation for this observation other than that ATPγS supports ClpB-catalyzed protein unfolding and translocation. What is the reviewer's alternative explanation for these observations?

      We agree with the reviewer that the linkage of ATP hydrolysis to protein unfolding and translocation is essential and we are seeking to acquire this knowledge.  However, a simple comparison of the ratio of rates is not adequate. We contend that a complete mechanistic study of ATP turnover by ClpB is required to properly address this linkage and such a study is too substantial to be included here but is currently underway. 

      All that said, the statement on line 175 was removed since we do not report any ATPase measurements in this paper.

      Line 199: It is an over-simplification to state that "1:1 mix of ATP to ATPgammaS replaces the need for co-chaperones". This sentence should be corrected or removed. The ClpB co-chaperones (DnaK, DnaJ, GrpE) play a major role in targeting ClpB to its aggregated substrates in cells and in regulating the ClpB activity through interactions with its middle domain. ATPgammaS does not replace the co-chaperones; it is a chemical probe that modifies the mechanism of ClpB in a way that is not entirely understood.

      We agree with the reviewer.  The sentence has been modified to point out that the mix of ATP and ATPγS activates ClpB.

      Figure 3B, Supplementary Figure 5A. The solid lines from the model fit cannot be distinguished from the data points. Please modify the figures' format to clearly show the fits and the data points.

      Done.

      Lines 326, 329. It is not clear why the authors mention a lack of covalent modification of substrates by ClpB. AAA+ ATPases do not produce covalent modifications of their substrates.

      The issue of covalent modification was presented in the introduction lines 55 – 60 pointing out that much of what we have learned about protein unfolding and translocation catalyzed by ClpA and ClpX is from the observations of proteolytic degradation catalyzed by the associated protease ClpP.  However, this approach is not possible for ClpB/Hsp104 as these motors do not associate with a protease unless they have been artificially engineered to do so. 

      Lines 396-399. I am puzzled why the authors try to correlate the size of the detected kinetic step with the length of the ClpB channel instead of the size characteristics of the substrate.

      We are attempting to discuss/rationalize the observed large kinetic step-size which, in part, is defined by the structural properties of the enzyme as well as the size characteristics of the substrate.  We have attempted to clarify this and better discuss the properties of the substrate as well as ClpB.

      As I mentioned in the Public Review, it is essential to demonstrate that the emission increase used as the only readout of the ClpB position along the substrate is indeed caused by the proximity of ClpB to the fluorophore. One way to accomplish that would be to place the fluorophore upstream from the first I27 domain and determine if the "lag phase" in the emission enhancement disappears.

      Alexa Fluor 555 is well established to exhibit PIFE.  However, as in the response to the public review, we have included an appropriate control showing this in supplemental Fig. 1.

      Finally, the authors repetitively place their results in opposition to the study of Weibezahn et al. published in 2004 which first demonstrated substrate translocation by engineering a peptidase-associated variant of ClpB. It should be noted that the field of protein disaggregases has moved since the time of that publication from the initial "from-start-to-end" translocation model to a more nuanced picture of partial translocation of polypeptide loops with possible substrate slipping through the ClpB channel and a dynamic assembly of ClpB hexamers with possible subunit exchange, all of which may affect the kinetics in a complex way. However, the present study confirmed the "start-to-end" translocation model, albeit for a non-physiological ClpB substrate, and that is the take-home message, which should be included in the text.

      It is not clear to us that the field has “moved on” since Weibezahn et al 2004.  Their engineered construct that they term “BAP” with ClpP is still used in the field despite us reporting that proteolytic degradation is observed in the absence of ATP with that system  (Li, Weaver, Lin et al. 2015) and should, therefore, not be used to conclude processive energy driven translocation. The “partial translocation” by ClpB is also grounded in observations of partial degradation catalyzed by ClpP with BAP from the same group (Haslberger, Zdanowicz, Brand et al. 2008). It is not clear to us that the idea of subunit exchange leading to the possibility of assembly around internal sequences is being considered.  We do agree that this is an important mechanistic possibility that needs further interrogation. We agree with the reviewer, all these factors are confounding and lead to a more nuanced view of the mechanism.

      All that said, we have removed some of the opposition in the discussion.

      Reviewer #2 (Recommendations For The Authors):

      (1) It is assumed that the lag phase will be much longer than the phase in which we see a gradual increase in fluorescence, as the effect of PIFE is significant only when the enzyme is very close to the fluorophore. Particularly for RepA-titin3, the enzyme has to translocate many tens of nm before it is closer to the C-terminus fluorophore. However, in all cases, the lag time is lower or similar to the gradual increase phase (for example, Figure 3B). Could the authors explain this?

      The extent of the lag, or the time from zero until the signal starts to increase, is interpreted to indicate the time the motor takes to move from its initial binding site until it gets close enough to the fluorophore that PIFE starts to occur. In our analysis we apply signal change to the last intermediate and to dissociation or release of unfolded RepA-TitinX. The increase in PIFE is not “all or nothing”. Rather, it increases gradually. Further, because these are ensemble measurements and each molecule will exhibit variability in rate, there is increased breadth of the peak due to ensemble averaging.

      (2) Although the reason for differences in the peak position (for example, Figure 1E, 2B) is apparent, the reason for variations in the relative intensities has to be given or speculated.

      We have addressed the reason for the different peak heights in the revised manuscript. It is a consequence of each substrate having a slightly different fluorescent labeling efficiency. Each sample therefore contains a mix of labeled and unlabeled substrates, both of which bind ClpB; the unlabeled ClpB-bound substrates do not contribute to the fluorescence signal but act as binding competitors. Thus, at low labeling efficiency there is a lower concentration of ClpB bound to fluorescent RepA-TitinX, and at higher labeling efficiency there is a higher concentration of ClpB bound to fluorescent RepA-TitinX, leading to an increased peak height. RepA-Titin2 has the highest labeling efficiency and thus the largest peak height.

      Reviewer #3 (Recommendations For The Authors):

      The authors should make it clear that they and previous authors have used different constructs or conditions to bypass the physiological regulation of ClpB action by Hsp70 and its co-factors as mentioned above. In particular, the construct used by Avellaneda et al should be explained when they challenge the findings of those authors.

      Minor points:

      The lines fitting the experimental points are difficult or impossible to see in Figures 2B, 3B, and s5B.

      Fixed

      Typo bottom of p6 - "averge"

      Fixed

      Avellaneda, M. J., K. B. Franke, V. Sunderlikova, B. Bukau, A. Mogk and S. J. Tans (2020). "Processive extrusion of polypeptide loops by a Hsp100 disaggregase." Nature.

      Doyle, S. M., J. Shorter, M. Zolkiewski, J. R. Hoskins, S. Lindquist and S. Wickner (2007). "Asymmetric deceleration of ClpB or Hsp104 ATPase activity unleashes protein-remodeling activity." Nat Struct Mol Biol 14(2): 114-122.

      Durie, C. L., E. C. Duran and A. L. Lucius (2018). "Escherichia coli DnaK Allosterically Modulates ClpB between High- and Low-Peptide Affinity States." Biochemistry 57(26): 3665-3675.

      Haslberger, T., A. Zdanowicz, I. Brand, J. Kirstein, K. Turgay, A. Mogk and B. Bukau (2008). "Protein disaggregation by the AAA+ chaperone ClpB involves partial threading of looped polypeptide segments." Nat Struct Mol Biol 15(6): 641-650.

      Li, T., J. Lin and A. L. Lucius (2015). "Examination of polypeptide substrate specificity for Escherichia coli ClpB." Proteins 83(1): 117-134.

      Li, T., C. L. Weaver, J. Lin, E. C. Duran, J. M. Miller and A. L. Lucius (2015). "Escherichia coli ClpB is a non-processive polypeptide translocase." Biochem J 470(1): 39-52.

      Miller, J. M., J. Lin, T. Li and A. L. Lucius (2013). "E. coli ClpA Catalyzed Polypeptide Translocation is Allosterically Controlled by the Protease ClpP." Journal of Molecular Biology 425(15): 2795-2812.

      Miller, J. M. and A. L. Lucius (2014). "ATP-gamma-S Competes with ATP for Binding at Domain 1 but not Domain 2 during ClpA Catalyzed Polypeptide Translocation." Biophys Chem 185: 58-69.

      Oberhauser, A. F., P. K. Hansma, M. Carrion-Vazquez and J. M. Fernandez (2001). "Stepwise unfolding of titin under force-clamp atomic force microscopy." Proc Natl Acad Sci U S A 98(2): 468-472.

      Rajendar, B. and A. L. Lucius (2010). "Molecular mechanism of polypeptide translocation catalyzed by the Escherichia coli ClpA protein translocase." J Mol Biol 399(5): 665-679.

      Weibezahn, J., P. Tessarz, C. Schlieker, R. Zahn, Z. Maglica, S. Lee, H. Zentgraf, E. U. Weber-Ban, D. A. Dougan, F. T. Tsai, A. Mogk and B. Bukau (2004). "Thermotolerance requires refolding of aggregated proteins by substrate translocation through the central pore of ClpB." Cell 119(5): 653-665.

    1. eLife assessment

      This useful study describes a role for acetylation in controlling the stability of acetyl-CoA synthetase 2, which converts acetate to acetyl-CoA for de novo lipid synthesis. While many aspects of the study are solid, some evidence supporting these findings is incomplete. Including direct demonstration of target deacetylation by sirtuin 2, revisiting statistical analyses, and confirming generalizability to adipocyte cell lines would further strengthen the study. This work will be of interest to researchers studying lipid metabolism and related diseases.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors delineate the crucial role of the SIRT2-ACSS2 axis in ACSS2 degradation. They demonstrate that SIRT2 acts as an ACSS2 deacetylase specifically under nutrient stress conditions, notably during amino acid deficiency. The SIRT2-mediated deacetylation of ACSS2 at K271 consequently triggers its proteasomal degradation. Additionally, they illustrate that acetylation of ACSS2 at K271 enhances ACSS2 protein levels, thereby promoting de novo lipogenesis.

      Strengths:

      The findings presented in this manuscript are clearly interesting.

      Weaknesses:

      Further support is required for the model put forward by the authors.

    3. Reviewer #2 (Public Review):

      Summary:

      Karim et al investigated the regulation of ACSS2 by SIRT2. The authors identified a previously undescribed acetylation that they then show is important for the regulation and stability of ACSS2 in cells. The authors show that ACSS2 ubiquitination and degradation by the proteasome is regulated by SIRT2-mediated deacetylation of ACSS2 and that stabilizing ACSS2 by blocking SIRT2 can alter lipid accumulation in adipocytes.

      Strengths:

      Identification of a novel acetylation site on ACSS2 that regulates its protein stability and that has consequences on its activity in adipocytes. Multiple standard approaches were used to manipulate the expression and function of SIRT2 and ACSS2 (i.e., overexpression, knockdown, inhibitors).

      Weaknesses:

      Throughout the manuscript, normalizing the data to 1 and then comparing the fold-change using a t-test is not the best statistical approach in that situation since every normalized value for control is 1 with zero standard deviation. The authors should consider an alternative statistical approach.
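
      One commonly used alternative, sketched here under our own assumptions, is a one-sample t-test on log-transformed fold changes against zero, which avoids comparing against a control group with no variance. The fold-change values below are hypothetical placeholders, not data from the manuscript, and the helper name `one_sample_t` is ours.

```python
import math
import statistics

def one_sample_t(values, mu=0.0):
    """Return (t statistic, degrees of freedom) for H0: mean(values) == mu."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)        # sample standard deviation (n - 1)
    t = (mean - mu) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical treated/control ratios from three independent replicates.
fold_changes = [1.8, 2.1, 1.6]

# Log-transform so that "no change" corresponds to 0 and ratios are symmetric.
log_ratios = [math.log2(x) for x in fold_changes]

t_stat, dof = one_sample_t(log_ratios, mu=0.0)
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

      The t statistic would then be compared against the t distribution with the stated degrees of freedom; a paired or ratio t-test on the raw, un-normalized measurements is another option.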

      Though not necessary, using 13C-acetate or D3-acetate tracing would be better for understanding the impact of acetylation on the activity of ACSS2 and its impact on lipogenesis.

    4. Reviewer #3 (Public Review):

      Summary:

      The manuscript shows that SIRT2 can regulate acetylation of ACSS2 at residue 271, and that acetylation of K271 protects ACSS2 from proteasomal degradation in a SIRT2-dependent manner. Lastly, the authors show that ACSS2 acetylation at K271 promotes lipid accumulation.

      Strengths:

      The authors provide solid data showing that ACSS2 acetylation can be regulated by targeting SIRT2 and that SIRT2 regulates ACSS2 ubiquitination. They identify K271 as a site of acetylation and show that mutating this site alters SIRT2-mediated ubiquitination.

      Weaknesses:

      However, the data for this manuscript seem preliminary: nearly all experiments are performed in one cell line, some of the conclusions are not well supported by the data, and the overall role of ACSS2 K271 acetylation is not well characterized.

    5. Author response:

      The following is the authors’ response to the original reviews.

      We would like to thank the reviewers and editor for their helpful comments. We have addressed their concerns as detailed below.

      It would have been nice to have included a bona-fide SIRT2 target as a control throughout the study.

      We agree that including a bona-fide SIRT2 target as a control is important for validating our results. Previous data from our work has shown that SIRT2 demyristoylates ARF6. Thus, we have included a blot in Figure S15 demonstrating that SIRT2 knockdown results in increased myristoylation of ARF6. This serves as a control to confirm the activity and role of SIRT2 in our study.

      Did the authors also consider investigating SIRT1 in their assays? SIRT1 activates ACSS2 while SIRT2 leads to degradation of ACSS2. They should at least discuss these seemingly opposing roles of SIRT1 and SIRT2 in the regulation of ACSS2 and acetate metabolism in more depth particularly as it concerns situations (i.e., diseases, pathologies) where either SIRT1, SIRT2, or both sirtuins, are active. This would enhance the significance of the findings to the broader research community.

      The study by Hallows et al. showed that SIRT1 deacetylates K661 of ACSS2 and increases its catalytic activity. Subsequently, a follow-up investigation unveiled the role of the circadian clock in modulating intracellular acetyl-CoA levels through SIRT1-catalyzed K661 deacetylation of ACSS2. Conversely, our research elucidates a contrasting mechanism wherein SIRT2 inhibits ACSS2 by deacetylating K271 under conditions of nutrient stress. The dual regulation of ACSS2 by SIRT1 through the circadian clock and by SIRT2 under nutrient stress underscores the intricate and multifaceted nature of regulatory mechanisms involved in lipid metabolism. These findings underscore the versatility of lysine acetylation in modulating cellular metabolic pathways.

      Collectively, these studies contribute to a better understanding of how SIRT1 and SIRT2 regulate ACSS2 activity in various metabolic contexts, thereby enhancing our knowledge of acetate metabolism and its implications in health and disease.

      We have included this discussion in the manuscript.

      In Figure 3, the authors should consider immunoblotting for endogenous ACSS2 throughout the differentiation and lipogenesis study since the total ACSS2 levels is the crucial aspect to affecting acetate-dependent promotion of lipogenesis in adipocytes, and to confirm TM-dependent stabilization of ACSS2 in that assay.

      We have updated Figure 3 to include immunoblotting for endogenous ACSS2 levels. Additionally, we have confirmed the TM-dependent stabilization of ACSS2, which is now shown in Figure S12.

      Do the authors have any data proving the K271 mutants of ACSS2 are still functional? Or that K271 ACSS2 protein is folded correctly?

      To assess the functionality of the mutants, we isolated Flag-tagged wildtype, K271R, and K271Q ACSS2 proteins from SIRT2 knockdown HEK293T cells. Subsequently, we examined acetyl-CoA formation from acetate and CoA using high-performance liquid chromatography (HPLC). Our findings indicate that, while the wildtype ACSS2 exhibits slightly higher activity compared to the K271R and K271Q mutants, all variants remain functional (Figure S13).

      Nearly all experiments are performed in a single cell line. Authors should test whether SIRT2 regulates ACSS2 acetylation in at least 1 or 2 more cell lines. Does SIRT2 regulate ACSS2 acetylation in 3T3-L1 preadipocytes?

      Experiments showing that endogenous ACSS2 levels change in EBSS and nutrient-deprived media were repeated in A549 cells (Figure S5). However, due to the poor transfection efficiency of A549 cells, we were unable to obtain acetylation data. Similarly, conducting acetylation experiments in 3T3-L1 preadipocytes is challenging due to poor transfection efficiency.

      The article does not explicitly address whether the absence of amino acids impacts the acetylation and subsequent degradation of ACSS2 by activating SIRT2. If so, one would expect the level of ACSS2 acetylation or ACSS2 expression under amino acid deprivation to be lower than that under normal conditions, as depicted in Fig. 1C and Fig. S3.

      The experiments shown in Fig. 1C and Fig. S3 used overexpressed Flag-tagged ACSS2, and we adjusted the amount of DNA used to obtain similar Flag-ACSS2 levels.

      To address the comment raised by the reviewer, we added Figure S14, which shows that endogenous ACSS2 acetylation is decreased under amino acid deprivation in SIRT2 control KD cells, indicating that the absence of amino acids impacts ACSS2 acetylation. The decreased expression of ACSS2 under amino acid deprivation is also addressed in Figure S6.

      Several reviewers noted discrepancies between what is occurring to basal levels of ACSS2 vs in SIRT2 KD conditions. Fig. 2H shows a higher basal level of acetylated ACSS2 in the K271R mutant compared to wildtype (input may be an issue). If Fig. 2H is a critical piece of data, authors are recommended to show this using Flag-IP and then Ac-K.

      The increased stability of the K271R mutant compared to the wildtype (WT) results in higher protein levels, which in turn results in the different input levels. However, this does not affect the conclusion that K271 is the acetylation site, as the quantification shows that the K271R mutant has a lower acetylation level and is not regulated by SIRT2 (Figure S16).

      Regarding the basal levels of ACSS2 in control and SIRT2 KD conditions, the experiments in question used overexpressed Flag-tagged ACSS2, and we adjusted the amount of DNA used to obtain similar Flag-ACSS2 levels. To address the concern, we monitored endogenous ACSS2 protein and acetylation levels, and the results are shown in Figure S14.

      Also, in Fig 2I there is no difference in basal ubiquitination between WT and K271R mutant. Related, based on model you would expect that overexpression of ACSS2-K271R mutant compared to wildtype would be at higher levels. In many figures authors do not see this (Fig. 2I, 3A, 3B). This needs to be explained.

      This is related to some previous comments. In these experiments, we adjusted the amount of DNA used in the transfection to obtain equal protein levels so that we could quantify other readouts (acetylation or ubiquitination levels). As stated in the manuscript regarding Figures 3A and 3B, "To ensure comparable expression levels at the beginning, we adjusted the amount of transfected DNA for both wild-type and the K271R mutant ACSS2." This approach allowed us to accurately compare the ubiquitination status between the wildtype and K271R mutant ACSS2 variants.

      Data showing role of ACSS2-K271 mutant in lipid accumulation requires clarification. Based on model overexpression of ACSS2-K271 mutant should by itself cause increased lipid accumulation compared to wildtype.

      This is indeed the case and we have added this in the revised manuscript “Consistent with our above observation that ACSS2 K271R mutant is more stable than the WT, expressing the K271R mutant lead to more lipid droplets than expressing the WT ACSS2 (Figure S12).”

      Loading controls are notably absent at certain instances, such as IPs in Fig. 1A, 1C, and the IP in Fig. 2H. Such controls are required to interpret potential changes in acetylation.

      For this experiment, we employed an approach where we overexpressed Flag-tagged wild-type (WT) and mutant forms of ACSS2. We conducted an immunoprecipitation (IP) targeting acetyl-lysine residues to enrich lysine-acetylated proteins, followed by immunoblotting for the Flag tag to specifically detect ACSS2 acetylation levels. To ensure the reliability of our results, we included a Flag blot to confirm equal expression levels of ectopically expressed ACSS2 across our samples before IP. Given the nature of our experimental design and the specific aim of investigating ACSS2 acetylation, we believe that additional loading controls beyond the input Flag blot are not required for the interpretation of our results. The inclusion of the input Flag blot serves as a control for protein expression levels, which is crucial for accurate assessment of ACSS2 acetylation status.

      While CHX treatment is known to inhibit protein synthesis, it appears contradictory that CHX treatment in Fig. 2C seemingly leads to ACSS2 accumulation in SIRT2 knockdown HEK293T cells. This discrepancy requires clarification.

      We conducted quantitative analysis of the immunoblot with replicates to ensure the reliability of our findings. Our analysis indicates that the protein level of ACSS2 remains relatively stable over the time course of CHX treatment. The observed slight increase at the 8-hour time point can be attributed to inherent experimental variability, as evidenced by the large error bars in the graph. We have included a graph in Figure S7 to show that there is no significant change in the level of ACSS2 in the SIRT2 knockdown HEK293T cells.

      In Fig. 2F-H, the authors argue that SIRT2 deacetylates ACSS2 to facilitate its ubiquitination and subsequent proteasomal degradation. However, these results are depicted under normal conditions, whereas findings in Fig. 1 suggest that SIRT2 deacetylates ACSS2 exclusively under nutrient stress. An explanation for this inconsistency is warranted.

      These experiments were done in amino acid deprived (EBSS) media. We have corrected this in the manuscript.

      Line 160 authors conclude "amino acid limitation..deacetylates K271"..but this was not directly demonstrated. Authors should add this data or change conclusion.

      Addressed in response to some of the comments above.

      Figures 1A and 1B, acetylation quantification, not clear if it is relative to the Flag tag or actin.

      Acetylation quantification is relative to Flag tag. This is clarified in the figure legend.

      Methods section lacking details & not well referenced (how did authors express wildtype & mutant in 3T3-L1 cells?) 

      ACSS2 wildtype and K271R mutant Flag-tagged expression plasmids were transfected into ACSS2 knockdown 3T3-L1 cells using PEI transfection reagent following the manufacturer’s protocol. The pCMV-Tag4a empty vector was used as the negative control. Differentiation of 3T3-L1 cells was performed according to the manufacturer’s protocol (DIF001-1KT, Sigma Aldrich) 24 hours after transfection. This has been included in the methods.

      In Figure 3A, is the actin blot from the same immunoblots above it? Reviewers recommend the authors upload original immunoblot.

      This experiment was repeated, and the blot has been replaced.

    1. eLife assessment

      This study uses ex vivo live imaging of uteri post-mating to test the role of the house mouse sperm hook in sperm movement, a question that would be interesting to evolutionary biologists. The significance of the work is useful, as live imaging can reveal information not seen in fixed images. The strength of evidence is incomplete, as the authors cannot directly test the role of the sperm hook in facilitating movement along the uterine wall.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors want to determine the role of the sperm hook of the house mouse sperm in movement through the uterus. They use transgenic lines with fluorescent labels to sperm proteins, and they cross these males to C57BL/6 females in pathogen-free conditions. They use 2-photon microscopy on ex vivo uteri within 3 hours of mating and the appearance of a copulation plug. There are a total of 10 post-mating uteri that were imaged with 3 different males. They provide 10 supplementary movies that form the basis for some of the quantitative analysis in the main body figures. Their data suggest that the role of the sperm hook is to facilitate movement along the uterine wall.

      Strengths:

      Ex vivo live imaging of fluorescently labeled sperm with 2-photon microscopy is a powerful tool for studying the behavior of sperm.

      Weaknesses:

      The paper is descriptive and the data are correlations.

      The authors cannot directly test their proposed function of the sperm hook in sliding and preventing backward slipping.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Thank you for your time and consideration on our submission. We also thank the reviewers for their consideration and helpful comments.  We have revised the introduction, results, and discussion sections of the revised manuscript in accordance with the reviewers’ suggestions, which have enhanced the clarity of our work. Specifically, we have clarified that the aim of the study is to report newly discovered sperm behaviours inside the uterus via high resolution deep tissue live imaging, and to stimulate further studies and discussion in the field of postcopulatory sexual selection in mice based on our observations. To the best of our knowledge, many of the specific sperm behaviours described in our manuscript are being reported for the first time, proven through direct observation inside the living reproductive tract.

      We have also restructured our manuscript and moved our hypothetical interpretations based on our experimental observations to the discussion section. We hope that these revisions have clarified our claims and that our revised manuscript effectively communicates the importance of our findings and their value in prompting new questions and insights that encourage further studies. We believe that our work clearly demonstrates the importance of sperm/reproductive tract interaction, which cannot be adequately studied in artificial environments, and may become an important guideline for designing future experiments and studies.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      The authors want to determine the role of the sperm hook of the house mouse sperm in movement through the uterus. The authors are trying to distinguish between two hypotheses put forward by others on the role of the sperm hook: (1) the sperm cooperation hypothesis (the sperm hook helps to form sperm trains) vs (2) the migration hypothesis (that the sperm hook is needed for sperm movement through the uterus). They use transgenic lines with fluorescent labels to sperm proteins, and they cross these males to C57BL/6 females in pathogen-free conditions. They use 2-photon microscopy on ex vivo uteri within 3 hours of mating and the appearance of a copulation plug. There are a total of 10 post-mating uteri that were imaged with 3 different males. They provide 10 supplementary movies that form the basis for some of the quantitative analysis in the main body figures. Their data suggest that the role of the sperm hook is to facilitate movement along the uterine wall. 

      We thank the reviewer for summarizing our work and the critical review of our paper. As summarized, the sperm hook has been primarily associated with the sperm cooperation (sperm hook) hypothesis and the migration hypothesis. However, we would like to emphasize that the aim of our work is not to cross check between the two hypotheses. Our aim was not to disprove either hypothesis, but rather to develop an experimental platform that enables detailed observation of sperm migration dynamics within the live reproductive tract. 

      Through live imaging, we observed both the formation of sperm trains and interaction between the sperm and the female reproductive tract epithelium. However, in our observations, we could not find an advantage in terms of faster movement for the rarely observed sperm trains. While these events were infrequent in our experiments, we are not asserting that the sperm train hypothesis is invalid but rather reporting our observations as they are. 

      The main findings of our work lie in the newly observed dynamic behaviours of mouse sperm interacting with the female reproductive tract epithelium. Specifically, these behaviours include tapping and associated guided movement along the uterus wall, anchoring and related resistance to internal fluid flow and migration through the utero-tubal junction, and self-organized behaviour while clinging onto the colliculus tubarius. We have extensively revised the manuscript structure to clarify our findings.

      Strengths: 

      Ex vivo live imaging of fluorescently labeled sperm with 2-photon microscopy is a powerful tool for studying the behavior of sperm. 

      Weaknesses: 

      The paper is descriptive and the data are correlations. 

      The data are not properly described in the figure legends. 

      When statistical analyses are performed, the authors do not comment on the trend that sperm from the three males behave differently from each other. This weakens confidence in the results. For example, in Figure 1 the sperm from male 3613 (blue squares) look different from male 838 (red circles), but all of these data are considered together. The authors should comment on why sperm across males are considered together when the individual data points appear to be different across males. 

      Thank you for your comments and suggestions. We have revisited all figure legends and made the necessary amendments (shown in the red-lined manuscript). Please note that, for a better flow of the paper, the previous Figure 1 has been changed to Figure 2 in the revised manuscript.

      Regarding the analysis using different males, we would like to explain the statistics used. We used generalized linear mixed models to test the effect of the Angle and Distance to the wall on the migration kinetic parameters. The advantage of the generalized linear mixed models is that they consider individual variations in the data as an error term, thereby controlling such individual variations. 

      There are two main factors contributing to individual variations. One is, as you pointed out, the difference in sperm from different males. However, we used genetically similar mice, so genetic variation should be minimal. Nonetheless, there are likely to be individual differences, including age, stress level, and body condition, that cause variation. As these factors cannot be controlled, we used the mixed model approach, where individual variations are grouped within the individual. This approach enabled us to test the effect of each explanatory variable (Angle and Distance) within an individual. 

      The second factor that could cause variations is the female oestrous status. To avoid artifacts that could influence sperm behaviour, we did not use any invasive methods, such as hormone injections, to control or induce female oestrus. We controlled for this possible effect by including the mating date as a random effect. Since each female was used only once, the mating date reflects the variation caused by each female.

      To provide further verification that the variation between individual males does not affect our results, we conducted the analysis per individual male and per mating date (i.e., per female). As clearly shown, sperm data points from individual males or females also show consistent, clear correlations with the distance from the uterus wall. As pointed out, while the mean sperm speed could differ between individuals, this is not the topic we are interested in here. Our interest here is the effect of the distance between sperm and the uterine wall. Additionally, the variation between males is not always larger than the effect of the mating day (female), which together suggests that integrating male variation is not essential. We have added this information to a Supplementary Figure (Fig. S3) of the revised supplementary materials.

      Moving forward, we can also consider the same analysis for the effects of the distance from the wall on sperm SWR and LIN (linearity of forward progression), where no statistical significance was found. As seen in the following figures, no statistically significant effect of the distance to the wall on SWR or LIN is seen in the regression lines drawn for each male and mating date.

      In summary, the statistical approach we used here has successfully reflected variations in sperm kinetics from different males as well as the variance from different females. We hope that our explanations and additional analysis answer your concerns. 

      Movies S8-S10 are single data points and no statistical analyses are performed. Therefore, it is unclear how penetrant the sperm movements are. 

      With respect to Movie S8, Figure 4A and B (Figure 5A and B in the current revised manuscript) depict the trajectories of accumulated spermatozoa (sperm trains) in the female uterus, as shown in Movie S8. We have added this information to the revised figure legend (L 293) for clarity. We could not observe sperm trains that moved faster than single spermatozoa during over 100 hours of observation and collection of over 10 TB of images. The three sperm trains presented in Fig. 5B were the sperm trains that moved in the head-forward direction. Most other identifiable trains, or clusters, did not move or could not move forward as their heads were entangled randomly. Although we of course agree that a statistical test for Movie S8 (also Fig. 5B) would be great, we could not perform meaningful statistical tests due to the small number of sperm trains we found. Instead, we provided all data in the box plots in Fig. 5C so that readers can evaluate and understand our points. We believe that this is a more neutral way of presenting our data than providing statistical significance.

      Regarding Movies S9 and S10, we are not entirely sure whether we understood your comments clearly. It would be very helpful if you could point more specifically to the manuscript with line numbers, as we would like to address your concerns and suggestions, and we believe that your input will improve our manuscript. We did not describe the penetration of sperm in these movies. Movies S9 and S10 show newly found sperm behaviours inside the UTJ and isthmus. We observed that sperm beating is influenced by the width of the luminal space as well as internal flow, as seen in Movies S9 and S10. As our animal model only expresses red fluorescence in the midpiece, accurate beating frequency measurement cannot be performed. However, we can clearly observe that beating is not continuous and almost comes to a halt depending on variations in the reproductive tract. We revised our description of the findings on beating speed changes in the revised manuscript (LL 305-335).

      Movies S1B - did the authors also track the movement of sperm located in the middle of the uterus (not close to the wall)? Without this measurement, they can't be certain that sperm close to the uterus wall travels faster. 

      We revised the new Movie S1B to include videos that were used for the sperm migration kinetics analysis in Figure 2 (previously Figure 1). As you can see in the movies, the graph, and the statistical analysis, there is a clear trend showing that spermatozoa migration is slower as a function of distance from the uterus wall. Regarding your comment with respect to the middle of the uterus (not close to the wall), we have added another movie (Movie S1C) that was acquired at different depths from the wall (going towards the centre of the uterus). As clearly seen in Movie S1C, when imaging deeper into the uterus, there are an increasing number of inactive or slow-moving spermatozoa. Since the diameter of the uterus is easily over 2 mm, we currently do not have optical access to exactly the centre of the uterus, but for all depths that are observable, spermatozoa near the wall were clearly faster.

      Movie S5A is of lower magnification (200 µm scale bar), while the others have 50 and 20 µm scale bars. Individual sperm movement can be observed at the 20 µm scale (Movie S5C). If the authors want to prove that there is no upsucking movement of sperm by the uterine contractions, they need to provide a high-magnification image. 

      The main focus of Movie S5A is the intramural UTJ, where spermatozoa are located in rows within the narrow luminal space (see Author response image 1). If there were up-suck-like passive sperm carriage, there would have to be sperm movement from the uterus to the intramural UTJ, as in Author response image 1 (left). However, no such sperm movement could be seen in our observations, as shown in Movie S5A. Importantly, as you can see in Movie S5A, indicated by an arrow from 5 sec to 6 sec, some spermatozoa are moving downward (see also Author response image 1, right). This is the opposite direction of movement with respect to possible up-suck-like sperm carriage. 

      Genetic evidence also supports the view that up-suck-like passive sperm carriage does not account for sperm migration from the uterus to the UTJ. If environmental up-suck-like passive transfer played an important role, it would be unlikely that genetically modified spermatozoa would be unable to pass the entrance of the intramural UTJ (Nakanishi et al., 2004, Biol. Reprod.; Li et al., 2013, J. Mol. Cell Biol.; Larasati et al., 2020, Biol. Reprod.; Qu et al., 2021, Protein Cell). 

      Author response image 1.

      The left image represents what is expected when up-suck like passive sperm carriage occurs. The right image represents what is actually experimentally observed in the intramural UTJ (see Movie S5A). The direction of the arrowheads indicates the direction of sperm movement.

      Movie S8 - if the authors want to make the case that clustered sperm do not move faster than unclustered sperm, then they need to show Movie S8 at higher magnification. They also need to quantify these data. 

      We understand your concern. As shown in Figure 5B, we included all sperm kinetics data of each sperm train and of the unlinked spermatozoa around the trains as individual dots. The only analysis we did not conduct was a statistical test on the data, as it could be erroneous due to the large sample size difference (3 trains vs 181 unlinked spermatozoa). As the medians of the four sperm kinetic parameters are similar except for SWR, we concluded that sperm trains are not necessarily faster than unlinked single spermatozoa. Since there is no known advantage in sperm competition for spermatozoa (including sperm trains) with intermediate moving speeds – for example, in IVF the fertilization success rate is high when faster and active spermatozoa with normal shape are selected (Vaughan & Sakkas, 2019, Biol. Reprod.) – it is questionable whether there can be an advantage to the formation of sperm trains whose speed is not faster than that of unlinked spermatozoa in our data.

      However, we do not agree with your comment regarding the need for higher magnification. Measurement of the sperm migration speeds (kinetic parameters) does not require measurement of exact tail movements in this study. Only sperm heads were tracked to measure their trajectory, and such tracking was better done at low magnification. For example, measuring the speed of a car does not require higher magnification to visualize the rotation of the wheels. Additionally, including the effect of observation magnification on the sperm kinetic parameters in all 4 GLMM models for Figure 2 (Table S3) does not change the result, which shows that magnification is not a factor that influences our analysis. 
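
      The point that head positions alone suffice for speed measurement can be illustrated with a minimal sketch. It assumes the common CASA-style definitions (curvilinear velocity VCL as path length over time, straight-line velocity VSL as net displacement over time, and LIN = VSL / VCL); the exact definitions used in the manuscript may differ, and the function name `track_kinetics` and the sample coordinates are hypothetical.

```python
import math


def track_kinetics(points, frame_interval_s):
    """Compute VCL, VSL and LIN from tracked head positions.

    points: list of (x, y) head positions in micrometres, one per frame.
    frame_interval_s: time between consecutive frames, in seconds.
    """
    if len(points) < 2:
        raise ValueError("need at least two tracked positions")
    duration = frame_interval_s * (len(points) - 1)
    # Curvilinear path: sum of frame-to-frame displacements.
    path_length = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    # Straight-line path: first position to last position.
    net_displacement = math.dist(points[0], points[-1])
    vcl = path_length / duration
    vsl = net_displacement / duration
    lin = vsl / vcl if vcl > 0 else 0.0
    return {"VCL": vcl, "VSL": vsl, "LIN": lin}


# Made-up three-frame track, not data from the study.
toy_track = [(0.0, 0.0), (3.0, 4.0), (6.0, 0.0)]
kin = track_kinetics(toy_track, frame_interval_s=0.5)
```

      Nothing in this calculation depends on resolving the flagellum, only on locating the head in each frame.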

      Movie S9C - what is the evidence that these sperm are dead or damaged? 

      Thank you for your valid comment. We tracked sperm movements for at least 10 minutes, and such entangled spermatozoa in the UTJ never became active again. As you can see in the new Movie S9b, entangled spermatozoa were also acrosome-reacted (the green acrosome head is gone), while active spermatozoa in the same video respond to peristaltic movement. However, as you pointed out, we did not measure their viability with appropriate dyes. Although we also considered extracting these spermatozoa and performing viability tests, we could not come up with a way to specifically extract the exact spermatozoa that were imaged. Considering your comments, we changed the term damaged or dead to inactive in the revised manuscript (LL 313-316, Legend Figure 6D. LL 380-384).

      Movie S10 - both slow- and fast-moving sperm are seen throughout the course of the movie, which does not support the authors' conclusion that sperm tails beat faster over time. 

      There must have been a misunderstanding. We did not indicate that sperm beating got faster over time anywhere in the main manuscript, including the figure legend and related movie captions. As correctly pointed out, the sperm beating speed changes over time (not getting faster over time) and shows a correlation with internal fluid flow and width of luminal space (LL 320-332). Please let us know if you meant something else. 

      Reviewer #2 (Public Review): 

      Summary: 

      The specific objective of this study was to determine the role of the large apical hook on the head of mouse sperm (Mus musculus) in sperm migration through the female reproductive tract. The authors used a custom-built two-photon microscope system to obtain digital videos of sperm moving within the female reproductive tract. They used sperm from genetically modified male mice that produce fluorescence in the sperm head and flagellar midpiece to enable visualization of sperm moving within the tract. Based on various observations, the authors concluded that the hook serves to facilitate sperm migration by hooking sperm onto the lining of the female reproductive tract, rather than by hooking sperm together to form a sperm train that would move them more quickly through the tract. The images and videos are excellent and inspirational to researchers in the field of mammalian sperm migration, but interpretations of the behaviors are highly speculative and not supported by controlled experimentation. 

      Thank you for your critical review and valuable comments on our manuscript. As pointed out, some of our findings and suggestions were largely observation based. However, to the best of our knowledge, many of our observations are novel, particularly in the context of live imaging inside the female uterus and reproductive tract. We believe these observations open doors to many questions and follow up studies that can be envisioned based on our findings, which is what drives science forward. 

      That being said, we entirely agree that many follow up experiments need to be designed and performed, especially to validate the exact molecular mechanisms of the observed dynamics. We acknowledge that it is unfortunate we currently lack the proper molecular experimental toolsets to perform further tests. We have removed much of the hypothetical discussions from the results section and moved them to the discussion section. We hope that our revision more clearly defines the observed experimental data and our interpretations.

      Strengths: 

      The microscope system developed by the authors could be of interest to others investigating sperm migration. 

      The new behaviors shown in the images and videos could be of interest to others in the field, in terms of stimulating the development of new hypotheses to investigate. 

      Weaknesses: 

      The authors stated several hypotheses about the functions of the sperm behaviors they saw, but the hypotheses were not clearly stated or tested experimentally. 

      The hypothesis statements were weakened by the use of hedge words, such as "may". 

      We appreciate your helpful comments and have revised our hypotheses and suggestions accordingly. We have removed instances of “may” or revised it to be more direct. We have also moved most of our interpretations and hypotheses from the results to the discussion section. 

      It is important to note that experimental approaches to test what we suggested from our findings in the current ex-vivo observation platform are not trivial and require extensive investigation of several unknown factors of the female reproductive tract. For instance, obtaining detailed information on the chemical characteristics and fluid dynamics in the female reproductive tract is essential to build a microfluidic channel that accurately resembles the uterus and oviduct, replicating what we found in an extracted living entire organ. This poses a significant challenge and requires collaborative expertise from many labs, which we hope to build in the near future. 

      Furthermore, our biggest concern is that, even if we were to construct the appropriate microfluidic channel to test sperm migration, it is very likely that the sperm behaviours that we observed under natural conditions may not be replicated in artificial environments. This raises questions about whether in-silico or in-vitro findings can truly resemble what we reported here using the ex-vivo observation inside a living organ.

      To share our experience related to this difficulty, at the initial stage of our study, we attempted sperm injection combined with fluorescent beads to visualize the fluid flow, as well as dyeing the female reproductive tract and spermatozoa after mating. However, none of these attempts yielded meaningful results. Another potential approach to perform similar research regarding our claims is using genetic engineering to indirectly confirm the influence of the sperm hook morphology on sperm behaviour. However, such an approach lacks a mechanical demonstration of how the sperm hook interacts with the female reproductive tract. 

      It is unfortunate that the sperm behaviours that we found and reported here are considered as highly speculative. The main findings of our work lie in the newly observed dynamic behaviours of mouse sperm interacting with the female reproductive tract epithelium. Specifically, these behaviours include tapping and associated guided movement along the uterus wall, anchoring and related resistance to internal fluid flow and migration through the utero-tubal junction, and self-organized behaviour while clinging onto the colliculus tubarius. 

      We have extensively revised the manuscript structure to clarify our findings and integrated our points in the introduction. Although we understand that our hypotheses may be considered speculative and that the causative relationship between the sperm hook and its role in sperm migration requires further experimental approaches, we believe that the image-based observations of the dynamic behaviours of spermatozoa are solid. We believe our findings will facilitate further studies and discussion in the field of postcopulatory sexual selection in rodents.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      The manuscript is written for an expert in a fairly small field. I recommend that the authors rewrite the manuscript to make it more accessible to people outside of the field. These suggestions include 

      (1) Provide a diagram of the female reproductive tract in Figure 1. 

      a. Indicate where sperm enter the tract and the location of the oocyte they are trying to reach. 

      b. Label all areas of the uterus that are mentioned in this study and be consistent about the label. 

      (2) All movies should have a diagram of the location of the uterus that is being imaged. 

      Thank you for the great suggestion. We have added a diagram of the female reproductive tract in the revised Figure 1A. In response to your comments 1a and b, we have indicated such information by including eggs in the ampulla and arrows that indicate sperm migration direction. We have also labelled the name of the specific areas that were studied in the manuscript.

      We are unsure how to integrate the diagram in all movies without reframing the videos, which could cause serious corruption of the files. More importantly, we think that adding the same diagram to all movies may complicate the visuals and obscure the annotations and subjects in the movies. Instead, we have referred to the common diagram (Figure 1A) in each movie caption, specifying where the video was taken. Thank you for the suggestion. With this information, we hope readers can now more easily understand where we made the observations.

      (3) The major questions in the field need to be better described in the introduction. 

      Thank you for your valuable suggestions and specific comments which have greatly helped improve our manuscript. We have revised our introduction and discussion sections by adding more literature reviews and integrating studies across a wider range of the postcopulatory sexual selection, as per your suggestion (LL 34-57, LL 385-398).

      (4) The major question that the authors are trying to address should be described in the introduction. 

      Thank you for the helpful suggestion. We have clarified in the introduction that our aim was to contribute to the field of postcopulatory sexual selection in rodents by advancing methodological progress and to stimulate discussion and future research on the function of the sperm hook in murine rodents (LL 76-94) based on our observations.

      (5) A discussion of the sperm hook should be provided. How many species have this structure (or similar structure)? 

      We have integrated your point into the revised discussion section. Essentially, most murine rodent species have sperm hooks (while their exact shapes differ). However, as there are over 500 species and not all of them have been tested, we do not know exactly how many of them have this structure. Therefore, we included references that examined species variations in sperm hook characteristics and their possible correlation with sperm competition (LL 385-417) in the discussion. Additionally, we included papers by Breed (2004) and by Roldan et al (1992) that investigated murine rodents with a sperm hook in the introduction section as well (LL 58-61).

      (6) The figure legends must describe everything in the figure or movie. 

      Thank you for the helpful suggestion. We previously thought that our figure legends may be too long. We have included further information in the figure legends and movie captions. We have also revised the movies by adding some clips following our revision (Movie S1).

      Reviewer #2 (Recommendations For The Authors): 

      Here are some specific concerns I had about the clarity of approach to experiments and interpretations of results. 

      In the Introduction, the authors stated that the study was intended to determine the function of the hooks on the mouse sperm heads. However, in the Results section, the authors did not explain the rationale for the first set of experiments with respect to the overall objective of the study. In this experiment, the authors measured the velocities of sperm swimming in the uterus and found that the sperm moved faster when closer to the uterine wall (VCL, VSL). They concluded that migration along the uterine wall "may" be an efficient strategy for reaching the entrance to the uterotubal junction (UTJ) and did not explain how this related to the function of the hooks. 

      Thank you for your critical comment and guidance. We have changed the order of Figure 1 and Figure 2 and revised the result section to integrate your points. At the initial stage of the study, we expected to find evidence of the function of sperm trains in aiding sperm migration in the female uterus (which has not been observed in the live uterus; previous works were done in vitro with sperm extracted from the epididymis or uterus after mating). However, what we found was something unexpected: dynamic sperm hook-related movements facilitating sperm migration inside the female uterus by playing a mechanical role in sperm interaction with the uterine wall. These results, which were presented in the previous Figure 2, have been reorganized as the new Figure 1.

      Based on this observation, our research later moved to clarify whether such sperm-epithelium interaction indeed helps sperm migration. This led us to measure sperm kinetics in relation to their distance and angle to the uterine wall. We have revised our introduction and result parts by integrating these points. We hope that our revision will answer your questions. We have also reduced the use of ‘may’ or ‘can’ in the results section. In the revised manuscript, we have moved such hypotheses to the discussion section and focused on what we observed in the results section.
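
      For readers outside the field, the kinetic parameters named in the reviewer's comment (VCL, VSL) can be illustrated with a minimal sketch. It uses the standard definitions from computer-assisted sperm analysis: curvilinear velocity is the total path length divided by the elapsed time, and straight-line velocity is the start-to-end displacement divided by the elapsed time. This is not the authors' analysis pipeline; the function name, the track format (a list of x, y positions in micrometres) and the frame interval are assumptions made for illustration only.

```python
import math


def sperm_kinematics(track, dt):
    """Return (VCL, VSL, LIN) for one tracked sperm head.

    track: list of (x, y) positions in micrometres, one per frame (hypothetical format).
    dt:    time between consecutive frames in seconds (assumed constant).
    """
    if len(track) < 2 or dt <= 0:
        raise ValueError("need at least two positions and a positive frame interval")

    duration = dt * (len(track) - 1)

    # Curvilinear path: sum of the distances between consecutive positions.
    path_length = sum(math.dist(p, q) for p, q in zip(track, track[1:]))

    # Straight-line path: distance between the first and last positions.
    displacement = math.dist(track[0], track[-1])

    vcl = path_length / duration   # curvilinear velocity, um/s
    vsl = displacement / duration  # straight-line velocity, um/s
    lin = vsl / vcl if vcl > 0 else 0.0  # linearity, dimensionless
    return vcl, vsl, lin
```

      A track that zigzags has a high VCL but a low VSL, so the ratio of the two (linearity) gives a simple measure of how directed the movement is.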

      The authors proposed that the sperm hook "may" play a crucial role in determining the direction of migration. When sperm encountered a uterine wall, significantly more changed migration direction toward the pro-hook direction than toward the anti-hook direction. In Figure 2B, sperm behavior is not visually understandable nor clearly explained. 

      Thank you for the helpful comments. We have removed “may” and “might” to make our claim clearer and more concise. We have also revised the previous Figure 2B by combining it with the previous Figure 2C (they have been combined into Figure 1C now). We have also revised Figure 1B by increasing the line thickness of the sperm trajectory of the pro-wall-hook direction and added the anti-wall-hook trajectory. We hope that these revisions make the figure easier to understand.

      In Figure 2E, are the authors showing that the tip of the hook is caught between two epithelial cells? Please clarify the meaning of this figure. 

      Please clarify the difference between "tapping" and "anchoring". 

      Thank you for the detailed comments. As you pointed out, we currently have no evidence whether sperm can be caught in epithelia inter-cellular gaps. We have revised this source of confusion by removing the gap in the revised figure (Figure 1E). We have also included the definition of anchoring (LL 142-143) and tapping (LL 128-130). Anchoring facilitates the attachment of sperm to the uterine epithelia. Such anchoring also involves the catching of the sperm head in the inter-mucosal fold or gap, particularly at the entrance of the intramural UTJ at the end of the uterus. Tapping is the interaction between the head hook and epithelia in which the sperm hook is tapping (or patting) on the surface. Sperm tapping can be a byproduct that results from flagella beating when spermatozoa migrate toward the pro-wall-hook direction along the uterine wall (epithelia) or can play some role in sperm migration. As we currently cannot draw a conclusion, we did not integrate the possible function of the tapping in the manuscript.

      The authors proposed that opposite sliding of neighboring mucosal folds lining the UTJ would cause small openings to form, through which only perhaps one sperm at a time could enter and pass through the UTJ into the uterus. This hypothesis was not actually tested. 

      Imaging inside deep tissue is challenging due to light scattering as it penetrates through biological tissue. While this is also true for the uterus, the intramural UTJ is especially difficult to image because the UTJ consists of several thick muscle and cell layers (see Movie S5A). Another challenge is that the peristaltic movement of the UTJ results in constant movement, making continuous tracking of single sperm while passing through the entirety of the UTJ impossible in our current experiments. We have moved this hypothesis to the discussion section and restated that this is a purely hypothetical model (LL 399-406). We hope that our model encourages the community to design or establish an improved ex-vivo observation system that may be able to test this hypothetical model in the near future.

      Next, the authors hypothesized that sperm that encounter the small openings in the UTJ may then be guided onward and the hooks could prevent backward slipping. This was also not tested. 

      As you’ve noted, the function of the sperm hook that aids in sliding and preventing backward slipping could not be tested directly in our ex-vivo observation platform that relies on natural movement of the living organ. However, we believe that these limitations also highlight the importance of continued research and the development of more advanced methodologies in this field.

      We would also like to note that we provide direct observations of spermatozoa resisting internal flow due to reproductive tract contractions in Movie S3A, B as well as Movie S5B. We referred to these movies and pointed out the role of anchoring (sperm attachment) in preventing sperm from being squeezed out (LL 140-149, LL 224-241). Unfortunately, we cannot conceive of how this behaviour can be tested additionally in any uterus-resembling microfluidic device or ex-vivo systems. In line with your suggestion, we have rewritten the related result section and moved our related discussions in the result part to the discussion section (LL 224-241, LL 399-417).

      The authors observed that large numbers of uterine sperm are attached to the entrance of the UTJ. Some sperm clustered and synchronized their flagellar beating. The authors speculated that this behavior served to push sperm in clusters onward through the UTJ. 

      We would like to note that we did not speculate that sperm clustering and their synchronization could serve to push spermatozoa in a cluster to move onward through the UTJ. We only pointed out our observation in recorded videos, that generative flow from the clustered spermatozoa pushed away other spermatozoa as seen in Movie S7 (LL 261-264). Although such sperm cooperation is possible (blocking passage of later sperm), we cannot draw that conclusion from our observation. The possibility you pointed out (pushing sperm onward through the UTJ) was suggested by Qu et al in 2021 [Cooperation-based sperm clusters mediate sperm oviduct entry and fertilization, Protein & Cell] based on their observations on cleared dead reproductive tracts.

      The authors found only a few sperm trains in the uterus, UTJ, and oviduct, so they could not measure sufficient numbers of samples to test whether sperm trains swim faster than single sperm. Without sufficient data, they concluded that the "sperm trains did not move faster than unlinked single spermatozoa." 

      We would like to take this opportunity to clarify our claims. We do not claim that our current experiments can give the final verdict on whether the sperm train hypothesis for faster swimming is correct or not. The phrase “sperm trains did not move faster” was not intended to mean that the sperm train hypothesis is invalid.  We did not draw a conclusion but dryly described the experimental data that we observed (LL 279-286).  We would once again like to emphasize that the main claim of our manuscript is not to rule out the sperm train hypothesis, but to present the various dynamic interactions of the sperm head with the female reproductive tract. To make the statement more balanced, we revised the sentence as “observed sperm trains did not move faster or slower than unlinked single spermatozoa” (LL 281-282).

      The authors hypothesized that the dense sperm clusters at the entrance into the UTJ could prevent the rival's sperm from entering the UTJ (due to plugging entrance and/or creating an outward flow to sweep back the rival's sperm), but they did not test it. 

      We agree that we were not able to test this possible function of the sperm cluster at the UTJ entrance. Following your concerns, we revised the result part (LL 256-264) by removing most of our discussions related to the observed phenomena. We also moved some interpretation to the discussion section instead (LL 421-437) and suggested that future works using appropriate microfluidic channel designs or sequential double mating experiments may be performed for additional tests (LL 443-447). However, we would like to point out that Movie S7C clearly shows surrounding spermatozoa being swept away from the sperm clusters. Since the sperm density is high, this is almost equivalent to a particle image velocimetry experiment, and we can clearly see the effect of the outward flow generated by the sperm clusters.

    1. eLife assessment

      This valuable study combines multidisciplinary approaches to examine the role of insulin-like growth factor 2 mRNA-binding protein 2 (IGF2BP2) as a potential novel host dependency factor for Zika virus. The main claims are supported by the data but remain incomplete. The evidence would be strengthened by improving the western blot analyses and adjusting the toning of their claims in relation to the role of IGF2BP2 for viral replication. With the experimental evidence strengthened, this work will be of interest to virologists working on flaviviruses.

    2. Reviewer #1 (Public Review):

      Summary:

      This study investigated the co-option of IGF2BP2, an RNA binding protein, by ZIKV proteins. Designed experiments evaluated if IGF2BP2 co-localized to sites of viral RNA replication, interacted with ZIKV proteins and how ZIKV infection changed the IGF2BP2 interactome.

      Strengths:

      The authors have used multiple interdisciplinary techniques to address several questions regarding the interaction of ZIKV proteins and IGF2BP2.

      The findings could be exciting if concerns are addressed, specifically regarding how ZIKV infection alters the interactome of IGF2BP2.

      Comments on the revised version:

      Following response to reviews, the authors have addressed a majority of the concerns with the exception of the western blots:

      As requested in the previous review, the authors did quantify the western blot data for half of the blot in 2A, but did not quantify blots in D and E. Please quantify ALL blots. Also, the first two lanes of 2A. The same goes for 4A: only infected is quantified; please quantify Mock as well. In the quantification of 4C, all lanes should be quantified, not only the NS5 from C. Also, it is unclear which lanes were quantified (H/PF/2013 or MR766). Finally, quantification should generally be shown as a graph and not included on top of the western blot.

    3. Reviewer #2 (Public Review):

      Clément Mazeaud et al. identified the insulin-like growth factor 2 mRNA-binding protein 2 (IGF2BP2) as a proviral cellular protein that regulates Zika virus (ZIKV) RNA replication by modulating the biogenesis of virus-induced replication organelles. Based on their findings and previously published data, the authors propose a model outlining the role of IGF2BP2 in the ZIKV infectious cycle. This model details the changes in IGF2BP2 interactions with both cellular and viral proteins and RNAs during viral infection.

      Strengths:

      This revised manuscript presents an interesting and convincing mechanism by which a cellular RNA-binding protein alters its protein and RNA interactome during viral infection. Using various molecular biology methods, proteomic analysis and a newly described replication-independent vesicle packets induction system, the authors describe the relevance of IGF2BP2 protein during Zika virus infection.

      Weaknesses:

      In the proposed model, the IGF2BP2 protein specifically binds to the 3' nontranslated region (NTR) of the ZIKV genome, while excluding binding to the 5' NTR. However, the authors cannot rule out the possibility that this host protein associates with other regions of the viral genome, a topic which is discussed in the manuscript.

      In this study, the physiological cellular consequences of altering the interaction of IGF2BP2 with its endogenous mRNA ligands due to ZIKV infection remain unexplored. This aspect would be of interest for future studies.

    4. Reviewer #3 (Public Review):

      Summary:

      The manuscript by Mazeaud and colleagues pursued a small scale screen of a targeted RNAi library to identify novel players involved in Zika (ZIKV) and dengue (DENV) virus replication. Loss-of-function of IGF2BP2 resulted in reduced titers for ZIKV of the Asian and African lineages in hepatic Huh7.5 cells, but not for either of the four DENV serotypes nor for West Nile virus (WNV). The phenotype was further confirmed in two additional cell lines and using a ZIKV reporter virus. In addition, using immunoprecipitation assays the interaction between IGF2BP2 and ZIKV NS5 protein and RNA genome was detected. The work addressed the role of IGF2BP2 in the infected cell combining confocal microscopy imaging, and proteomic analysis. The approach indicated an altered distribution of IGF2BP2 in infected cells and changes in the protein interactome including disrupted association with partner mRNAs and modulation of the abundance of a specific set of protein partners in IGF2BP2 immunoprecipitated ribonucleoprotein (RNP) complexes. Finally, based on the changes in IGF2BP2 interactome and specifically the increment in the abundance of Atlastin 2, biogenesis of ZIKV replication organelles (vRO) is investigated using a genetic system that allows virus replication-independent assembly of vRO. Electron microscopy showed that knock down of IGF2BP2 expression reduced the number of cells with vRO.

      Strengths:

      The role of IGF2BP2 as a proviral factor for ZIKV replication is novel

      The study follows a logical flow of experiments that altogether support the assembly of a specialized RNP complex containing IGF2BP2 and ZIKV NS5 and RNA genome

      Weaknesses:

      The specificity for the direct interaction between IGF2BP2 and ZIKV RNA genome remains elusive in particular regarding the regions in the virus genome that drive interaction.

    5. Author response:

      The following is the authors’ response to the original reviews.

      This valuable study combines multidisciplinary approaches to examine the role of insulin-like growth factor 2 mRNA-binding protein 2 (IGF2BP2) as a potential novel host dependency factor for Zika virus. The main claims are partially supported by the data, but remain incomplete. The evidence would be strengthened by improving the immunofluorescence analyses, addressing the role of IGF2BP2 in "milder" infections, and elucidating the role of IGF2BP2 in the biogenesis of the viral replication organelle. With the experimental evidence strengthened, this work will be of interest to virologists working on flaviviruses.

      We thank the reviewers for their feedback and constructive suggestions. In this revised version of the manuscript, we have addressed the reviewer’s comments to the best of our ability as detailed below. We believe that the newly incorporated data strengthens our study and conclusions. We hope that this revised manuscript will satisfy the reviewers and will be of high interest to flavivirologists.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This study investigated the co-option of IGF2BP2, an RNA-binding protein, by ZIKV proteins. Designed experiments evaluated if IGF2BP2 co-localized to sites of viral RNA replication, interacted with ZIKV proteins, and how ZIKV infection changed the IGF2BP2 interactome.

      Strengths:

      The authors have used multiple interdisciplinary techniques to address several questions regarding the interaction of ZIKV proteins and IGF2BP2.

      The findings could be exciting, specifically regarding how ZIKV infection alters the interactome of IGF2BP2.

      We thank the reviewer for acknowledging the multidisciplinary approach of our study and its exciting potential.

      Weaknesses:

      Significant concerns regarding the current state of the figures, descriptions in the figure legends, and the quality of the immunofluorescence and electron microscopy exist.

      In this new version of the manuscript, we have improved the quality of the microscopy data and included the requested information in the figure legends as described below in the Recommendations section.

      Reviewer #2 (Public Review):

      Clément Mazeaud et al. identified the insulin-like growth factor 2 mRNA-binding protein 2 (IGF2BP2) as a proviral cellular protein that regulates Zika virus RNA replication by modulating the biogenesis of virus-induced replication organelles.

      The absence of IGF2BP2 specifically dampens ZIKV replication without having a major impact on DENV replication. The authors show that ZIKV infection changes IGF2BP2 cellular distribution, which relocates to the perinuclear viral replication compartment. These assays were conducted by infecting cells with an MOI of 10 for 48 hours. Considering the ZIKV life cycle, it is noteworthy that at this time there may be a cytopathic effect. One point of concern arises regarding how the authors can ascertain that the observed change in localization is a consequence of the infection rather than of the cytopathic effect. To address this concern, shorter infection periods (e.g., 24 hours post-infection) or additional controls, such as assessing cellular proteins that do not change their localization or infecting with another flavivirus lacking the IGF2BP2 effect, could be incorporated into their experiments.

      We thank the reviewer for these relevant comments regarding the specificity of IGF2BP2 relocalization to the ZIKV replication compartment.

      It is noteworthy that we chose the 2-day post-infection time point for our analyses because it corresponds to the peak of replication, with much higher titers produced compared to those at 24 hours post-infection (generally ~10^6 PFU/mL vs. ~10^4 PFU/mL). Consistently, the abundance of viral replication factories is more obvious at this time point. An MOI of 5-10 was chosen to maximize the % of infected cells. That said, as suggested by the reviewer, we have analyzed the distribution of IGF2BP2 in ZIKV-infected cells at one day post-infection, and we provide evidence in Figure S1 that IGF2BP2 relocalizes to the dsRNA-containing compartment at this time point.

      Importantly, we now show in Figure S5 that in contrast to IGF2BP2, other host RNA-binding proteins such as LARP1 and DDX5 do not accumulate in the ZIKV replication compartment at 2 days post-infection. LARP1 actually seems to be excluded from it while DDX5 remains nuclear. Of note, consistent with the ZIKV-induced decrease in expression observed in western blots (Fig 4A), the intensity of the DDX5 signal decreases in infected cells. Altogether, this demonstrates that the IGF2BP2 relocalization phenotype is specific and is not due to ZIKV-induced cell death.

      By performing co-immunoprecipitation assays on mock and infected cells that express HA-tagged IGF2BP2, the authors propose that the observed change in IGF2BP2 localization results from its recruitment to the replication compartment by the viral NS5 polymerase and associated with the viral RNA. Given that both IGF2BP2 and NS5 are RNA-binding proteins, it is plausible that their interaction is mediated indirectly through the RNA molecule. Notably, the authors do not address the treatment of lysates with RNase before the IP assay, leaving open the possibility of this indirect interaction between IGF2BP2 and NS5.

      We agree with the hypothesis of the reviewer. As suggested, we have performed co-immunoprecipitation assays following RNase A treatment of the cell lysates. As shown in new Fig S6, the abundance of ZIKV NS5 co-immunoprecipitating with IGF2BP2-HA is drastically decreased upon RNase A treatment compared to the untreated condition. This demonstrates that the IGF2BP2/NS5 interaction is mostly RNA-dependent, which is not surprising as RNA is often a structural component of ribonucleoprotein complexes. Of note, the same is observed with ATL2. This new set of data allows us to refine our model of Figure 11 and the discussion as they strongly suggest that the direct binding of IGF2BP2 to viral RNA (evidenced in vitro; Fig 5D) is required for subsequent association with NS5 and ER-shaping protein ATL2. This is in line with the fact that viral RNA is a co-factor in the biogenesis of ER-derived ZIKV vesicle packets (PMID: 32640225). However, we cannot exclude a contribution of cellular RNA in these processes as discussed.

      In in vitro binding assays, the authors demonstrate that the RNA-recognition motifs of the IGF2BP2 protein specifically bind to the 3' nontranslated region (NTR) of the ZIKV genome, excluding binding to the 5' NTR. However, they cannot rule out the possibility of this host protein associating with other regions of the viral genome. Using a reporter ZIKV subgenomic replicon system in IGF2BP2 knock-down cells, they additionally demonstrate that IGF2BP2 enhances viral genome replication. Despite its proviral function, the authors note that the "overexpression of IGF2BP2 had no impact on total vRNA levels." However, the authors do not delve into a discussion of this latter statement.

      We agree with the reviewer’s comments. We now mention in the discussion that we cannot exclude the possibility that IGF2BP2 associates with RNA motifs within the coding region of the viral genomic RNA, especially considering that it contains N6A-methylated sequences (PMID: 27773535; 27773536; 29373715). Moreover, we discuss the observation that IGF2BP2 overexpression has no impact on vRNA levels (as well as titers). We believe that this is because endogenous IGF2BP2 is highly expressed in cancer cells such as the Huh7.5 and JEG-3 cells used here and is presumably not limiting for viral replication in our system (PMID: 38320625; 35111811; 34309973; 35023719; 37088822; 33224879; 35915142).

      In this study, the authors extend their findings by illustrating that ZIKV infection triggers a remodeling of IGF2BP2 ribonucleoprotein complex. They initially evaluate the impact of ZIKV infection on IGF2BP2's interaction with its endogenous mRNA ligands. Their results reveal that viral infection alters the binding of specific mRNA ligands, yet the physiological consequences of this loss of binding in the cell remain unexplored. 

      We acknowledge that it would be of interest to further study the physiological relevance of the modulation of the IGF2BP2 ribo-interactome. Since we have focused here on the role of IGF2BP2 in viral replication, we feel that this will be the focus of future studies, notably involving a larger omic-centered approach to identify the most impacted IGF2BP2 mRNA ligands. Of note, Gokhale and colleagues have already reported that CIRBP, TNRC6A and PUM2 proteins regulate the replication of Flaviviridae (PMID: 31810760).

      Additionally, the authors demonstrate that ZIKV infection modifies the IGF2BP2 interactome. Through proteomic assays, they identified 62 altered partners of IGF2BP2 following ZIKV infection, with proteins associated with mRNA splicing and ribosome biogenesis being the most represented. In particular, the authors focused their research on the heightened interaction between IGF2BP2 and Atlastin 2, an ER-shaping protein reported to be involved in flavivirus vesicle packet formation. The validation of this interaction by Western blot assays prompted an analysis of the effect of ZIKV on organelle biogenesis using a newly described replication-independent vesicle packet induction system. Consequently, the authors demonstrate that IGF2BP2 plays a regulatory role in the biogenesis of ZIKV replication organelles.

      Based on these findings and previously published data, the authors propose a model outlining the role of IGF2BP2 in ZIKV infectious cycle, detailing the changes in IGF2BP2 interactions with both cellular and viral proteins and RNAs that occur during viral infection.

      The conclusions drawn in this paper are generally well substantiated by the data.

      We thank the reviewers for these encouraging general comments on our study.

      However, it is worth noting that the majority of infections were conducted at a high MOI for 48 hours, spanning more than one infectious cycle. To enhance the robustness of their findings and mitigate potential cell stress, it would be valuable to observe these effects at shorter time intervals, such as 24 hours post-infection.

      As explained above, IGF2BP2 relocalization to the (dsRNA-enriched) replication compartment was also observed in ZIKV infected cells at one day post-infection.

      Furthermore, the assertion regarding the association of IGF2BP2 with NS5 could be strengthened through additional immunoprecipitation (IP) assays. These assays, performed in the presence of RNAse treatment, would help exclude the possibility of an indirect interaction between IGF2BP2 and NS5 (both RNA-binding proteins) through viral RNA, thus providing more confidence in the observed association.

      See above for our answer and the description of the new data of Fig. S7.

      Reviewer #3 (Public Review):

      Summary:

      The manuscript by Mazeaud and colleagues pursued a small-scale screen of a targeted RNAi library to identify novel players involved in Zika (ZIKV) and dengue (DENV) virus replication. Loss-of-function of IGF2BP2 resulted in reduced titers for ZIKV of the Asian and African lineages in hepatic Huh7.5 cells, but not for either of the four DENV serotypes nor West Nile virus (WNV). The phenotype was further confirmed in two additional cell lines and using a ZIKV reporter virus. In addition, using immunoprecipitation assays the interaction between IGF2BP2 and ZIKV NS5 protein and RNA genome was detected. The work addressed the role of IGF2BP2 in the infected cell combining confocal microscopy imaging, and proteomic analysis. The approach indicated an altered distribution of IGF2BP2 in infected cells and changes in the protein interactome including disrupted association with partner mRNAs and modulation of the abundance of a specific set of protein partners in IGF2BP2 immunoprecipitated ribonucleoprotein (RNP) complexes. Finally, based on the changes in IGF2BP2 interactome and specifically the increment in the abundance of Atlastin 2, the biogenesis of ZIKV replication organelles (vRO) is investigated using a genetic system that allows virus replication-independent assembly of vRO. Electron microscopy showed that knockdown of IGF2BP2 expression reduced the number of cells with vRO.

      Strengths:

      The role of IGF2BP2 as a proviral factor for ZIKV replication is novel. The study follows a logical flow of experiments that altogether support the assembly of a specialized RNP complex containing IGF2BP2 and ZIKV NS5 and RNA genome.

      We thank the reviewer for their positive feedback on our study and its novelty.

      Weaknesses:

      The statistical analysis should clearly indicate the number of biological replicates of experiments to support statistical significance.

      This information has been included in all figure legends.

      The claim that IGF2BP2 knockdown impairs de novo viral organelle biogenesis and viral RNA synthesis is built upon data that show a reduction in RNA synthesis <0.5-fold as assessed using a reporter replicon, thus suggesting a limited impact of the knockdown on RNA replication.

      We agree that a 50% decrease in the replication of our reporter replicon might be considered mild. However, we want to point out that in an infectious set-up, the phenotypes were stronger, as demonstrated by an 80% decrease in viral particle production even though IGF2BP2 levels were never depleted by more than 80% compared to endogenous levels. Moreover, our findings were validated through the analysis of de novo vRO biogenesis by electron microscopy in a replication-independent set-up. Together, these experiments provide compelling evidence for a role for IGF2BP2 in the early stages of viral genome replication.

      Validation of IGF2BP2 partners that are modulated upon ZIKV infection (i.e. virus yield in knocked down cells) can be relevant especially for partners such as Atlastin 2, as the hypothesis of a role for IGF2BP2 RNP in vRO biogenesis is based on the observed increase in the abundance of Atlastin 2 in the RNP complex precipitated from infected cells.

      First, we would like to emphasize that the proviral role of ATL2 in flavivirus replication, including links to vRO biogenesis, was already reported in two independent studies, notably by one of the co-authors (PMID: 31636417; 31534046). Therefore, we have chosen to discuss these previous studies in the manuscript rather than repeating published experiments. Second, we agree that it would be interesting to further interrogate the role of modulated IGF2BP2 protein partners in ZIKV replication. However, these experiments would constitute a new project in themselves, involving laborious RNAi-based phenotypic screening and subsequent functional characterization of the identified hits. Therefore, this will be the focus of follow-up studies.

      Recommendations for the Authors:

      Reviewer #1 (Recommendations For The Authors):

      In all IFAs claimed to show co-localization, the co-localization is minimal; this needs to be addressed.

      We have performed colocalization analyses for relevant images in the revised manuscript (see below and Figs. 4B, 5A, S4A-C and S5A-D). Although this quantification increases confidence in our analysis, we were still cautious in our conclusions, stating that colocalization was partial and that IGF2BP2 accumulates in the replication compartment.

      Western blots and IPs need to be quantified.

      As requested, we have included WB quantification in Figs. 2A, 4A, 4D, 8B-D, S6C and S7D.

      Figure 1: What is the strain background for the ZIKV reporter virus?

      As indicated in the legend of Figure 1E of the primary submission, the Rluc-expressing ZIKV reporter virus (ZIKV-R2A) was based on the FSS13025 isolate (Asian lineage)(PMID: 27198478). To clarify this, we have also indicated the strain background in the main text of the Results and Material & Methods sections.

      Figure 2A: If shIGF2BP2 reduces viral titer, the NS3 should show a reduction in 2A, but it doesn't.

      We agree with the reviewer. Although NS3 does not appear to be decreased upon IGF2BP2 knockdown in the experiment initially shown in Figure 2A, it should be noted that our homemade rat anti-NS3 antibody is highly sensitive, leading to signal saturation that makes it challenging to distinguish changes in NS3 expression without substantially diluting the lysate sample before SDS-PAGE. The initial reason for including Fig. 2A was not to make a statement about viral protein expression but to validate IGF2BP2 knockdown efficiency. Conclusions about NS3 levels in the initial figure are further complicated by the high MOI of ZIKV used in Huh7.5 cells, conditions which are not quantitative for viral replication measurements. To address this issue, we assessed the impact of IGF2BP2 knockdown on viral protein abundance (as a read-out of overall viral replication) with a lower MOI of ZIKV. The results of the repeat experiment (seen in the new Fig. 2A) show that IGF2BP2 knockdown leads to a decrease in the abundance of NS4A, NS5 and NS3, which is consistent with the titer decrease phenotypes.

      Figure S3: The re-localization claimed is minimal and does not show overlap with NS3. The dsRNA is difficult to see here. Suggest improving the immunofluorescence images and reducing the claim for "strong" co-option of RNP complexes.

      In addition to replication complexes, NS3 labels convoluted membranes which are devoid of dsRNA and IGF2BP2 and surround the cage-like replication compartment as large puncta (PMID: 27545046; 33432690; 28249158). The signal overlap is more obvious between IGF2BP2 and NS3/dsRNA-containing areas, which is reflected by the Manders' coefficients that have been included in the revised version (Fig. S5C-D). We have also adjusted the text to conclude that the colocalization was partial and that IGF2BP2 accumulated in the replication compartment. We acknowledge that the dsRNA signal is weak, and we have updated the images (and others, when relevant) to better visualize this viral component. Moreover, we have rephrased the sentence to remove the word “strongly”.

      Figure 4A: Western blot needs quantification.

      This is now included in the figure.

      Figure 4B: As in many of the IFAs, the co-localization is only partial. Additionally, the dsRNA is not visible. So the images need to be improved. The colocalization should be quantified across the cell diameter.

      We changed the color and intensity of the dsRNA staining to make it more visible. Manders' colocalization coefficients have been determined and included in Figures 4B and S5C-D.
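
      For readers unfamiliar with the metric, the following is a minimal sketch of how Manders' colocalization coefficients can be computed from two fluorescence channels. It is an illustration only, not the pipeline used for the figures: the function name, the threshold parameters, and the choice to normalize by the total channel intensity are all assumptions made for this example.

```python
import numpy as np


def manders_coefficients(ch1, ch2, thr1=0.0, thr2=0.0):
    """Return (M1, M2) for two same-shaped intensity images.

    M1 is the fraction of channel-1 intensity found in pixels where
    channel 2 exceeds thr2; M2 is the converse. Thresholds and the
    normalization by total intensity are assumptions of this sketch.
    """
    ch1 = np.asarray(ch1, dtype=float)
    ch2 = np.asarray(ch2, dtype=float)
    if ch1.shape != ch2.shape:
        raise ValueError("both channels must have the same shape")

    total1 = ch1.sum()
    total2 = ch2.sum()
    # Sum the intensity of each channel only where the other channel is present.
    m1 = ch1[ch2 > thr2].sum() / total1 if total1 > 0 else float("nan")
    m2 = ch2[ch1 > thr1].sum() / total2 if total2 > 0 else float("nan")
    return m1, m2
```

      The two coefficients are deliberately asymmetric: a protein can lie almost entirely within a compartment (high M1) while occupying only a small part of it (low M2), which is one way to describe partial colocalization quantitatively.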

      Figure 4C: It is difficult to understand what the +/- is on the blots for the cell extracts and the anti-HA IP samples. It is not described in the figure legend or the text.

      As already indicated on the right of the panel, the +/- indicates whether or not IGF2BP2-HA was overexpressed in the cells. In the revised version, this is clarified in the figure legend.

      Figure 5A: Once again similar to other IFAs, the co-localization is only minimal and thus difficult to claim as "co-localization" is actually happening. It would be good to either improve the images or discuss this observation in the text and reduce the claim of colocalization. Specifically, since the two proteins might be co-localizing in specific regions which would make it a very interesting observation. Also, quantification of co-localizing regions would be beneficial.

      We have included the requested colocalization analysis. We have been cautious to indicate that colocalization was only partial. It is noteworthy that, despite many efforts in the optimization of the cell permeabilization procedure, we noticed that the FISH probes were not very efficient in accessing the perinuclear area of the infected cells, where replication complexes accumulate. In that respect, it is likely that this imaging approach misses some of the IGF2BP2/vRNA complexes and that the determined colocalization factor is underestimated. This explains why the confirmation of the vRNA/IGF2BP2 complex with a biochemical approach (Fig. 5B) was very relevant.

      Figure 5D: It is unclear what the blue squares represent. Clearer figure legends and text would be beneficial.

      As stated in the initial figure, the blue squares indicate values obtained with the ZIKV 5’ UTR probe while the green circles involve a 3’ UTR probe. We have further emphasized this information in the figure legend to make it clearer.

      Figure 6B. The graph is missing the data and X-axis label for shIGF2BP2.

      We had initially omitted the values of the conditions with shIGF2BP2 and the replication-dead GAA replicon, since this viral system does not allow accumulation of viral genomes or proteins and was not relevant at the 48h time point. We thought that the inclusion of the shNT/GAA condition was sufficient as an internal negative control of viral replication since values for shIGF2BP2/GAA did not exceed background. Nevertheless, we have now included this condition in the revised figure.

      Figure 7D: It is unclear what the -/+ signs are in the cell extracts and the IP blots. Specifically, since there is an NS5 signal in the (-) lanes.

      As explained above, the +/- indicates whether IGF2BP2-HA was overexpressed. The meaning of these symbols is now further clarified in the figure legend.

      Figure 8C: The circles with the different colors are not clearly described. What does it mean?

      As indicated in the figure (left part), the red and green circles identify the partners of the STRING network whose association with IGF2BP2 is decreased and increased during infection, respectively. We have included this information in the figure legend.

      Figure 9: The electron microscopy to quantify vesicles should be carried out using whole-cell tomography in order to get the most accurate quantification of the vesicles following different treatments. This is because if you only look at one cell profile (slice), the number of vesicles might be less in that profile and more in another below or above it. It is unclear how many cell profiles were used for the quantification and how the calculations were carried out.

      We agree with the reviewer that ideally, one should perform 3D electron tomography to precisely assess the morphology of VPs. Regardless of the fact that we do not possess the imaging infrastructure to perform that type of analysis, such an approach would represent a tremendous amount of work if one would like to process at least 200-400 vesicles from > 50 cells and their whole cytoplasm (as we did). Despite not having 3D images, this number of data points is sufficient to see general changes in viral replication vesicle morphology, especially considering that Huh7-Lunet cells are relatively flat cells (PMID: 32640225; 36700643; 34696522; 31636417). Furthermore, since IGF2BP2 knockdown decreases the abundance of VPs and does not impact their diameter, we believe that the addition of sophisticated 3D analysis would not bring any new and relevant information and that the TEM data stand by themselves for the conclusion we made. A more refined morphological analysis to determine how IGF2BP2 is structurally involved in virus-mediated membrane reorganization could be the focus of a future study.

      We feel that we have already provided sufficient information about the quantification in the Material & Methods section of the first version of the manuscript: “Quantification was performed by systematically surveying cells and evaluating the presence of VPs. Only cells with >2 VPs were considered as positive. For each condition, >50 cells were surveyed over 4 biological replicas. All observed VPs were imaged, and VP diameters were determined using ImageJ by measuring the distance across two axes and averaging”.
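
      The quantification rule quoted above can be written out as a short sketch. This is only an illustration of the stated criteria (a cell is positive when it contains more than 2 VPs; a VP diameter is the mean of two axis measurements); the function names and the example values in the usage note are hypothetical and are not taken from the study.

```python
def vp_diameter(axis_a, axis_b):
    """Diameter of one vesicle packet as the mean of two axis measurements."""
    return (axis_a + axis_b) / 2.0


def fraction_vp_positive(vp_counts_per_cell, min_vps=2):
    """Fraction of surveyed cells containing more than `min_vps` VPs."""
    if not vp_counts_per_cell:
        raise ValueError("at least one cell must be surveyed")
    # Strict inequality: a cell with exactly `min_vps` VPs is not positive.
    positive = sum(1 for n in vp_counts_per_cell if n > min_vps)
    return positive / len(vp_counts_per_cell)
```

      For instance, with hypothetical counts of 0, 2, 3 and 5 VPs in four surveyed cells, only the last two cells pass the ">2 VPs" criterion.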

      Reviewer #2 (Recommendations For The Authors):

      The inclusion of a control in the knock-down and infection assays with the reporter virus could enhance the validity of the findings. Introducing STAT2 knockdown, a recognized antiviral protein for ZIKV, as a control would provide a valuable benchmark to evaluate the extent of viral enhancement in the experiments. This additional control not only supports the proposed function of LARP1 in virus assembly/release but also strengthens the overall interpretation of the results.

      We agree that adding a positive control could have been relevant for assessing the extent of replication modulation, especially for increases such as that observed with shLARP1. However, finding such control proteins in our system was a challenge. Indeed, STAT2 would not have been a good control for these experiments since we used Huh7.5 cells for the RNAi mini-screening, which do not express a functional RIG-I protein, and generally do not produce type I and III interferons. Thus, STAT2 knockdown is not expected to result in an increase in replication. That said, we feel that it was unnecessary to include a control for replication inhibition here given that only a few statistically reliable candidates were obtained. Instead, we have opted for an extensive secondary validation approach by assessing the proviral role of IGF2BP2 for multiple viruses (DENV1-2-3-4, WNV and SARS-CoV-2, and 3 ZIKV strains) in three relevant cell types.

      Additionally, in Figure S4, the authors employ an antibody against NS5 that specifically recognizes ZIKV NS5 but not DENV NS5. Given the objective of highlighting distinctions between these two viruses, it is advisable to use an antibody that detects DENV NS5 as well. This approach would contribute to a more comprehensive comparison, ensuring a balanced representation of both viruses in the experimental analysis.

      We thank the reviewer for this relevant suggestion. We have repeated the co-immunoprecipitation assays using antibodies specific to DENV NS5 (Author response image 1). While we specifically pulled down ZIKV NS5 with IGF2BP2-HA as expected, this was not the case for DENV NS5 when using extracts from DENV-infected cells, despite our multiple attempts. Indeed, the amount of DENV NS5 pulled down with IGF2BP2-HA was always comparable to that in the negative control (“empty” pWPI lentivirus-transduced cells, “-” condition), which corresponds to non-specific binding to the HA-resin. Thus, while the antibody was very efficient at detecting DENV NS5 in the cell extracts, no specific binding between DENV NS5 and IGF2BP2-HA could be evidenced. Consistent with our different replication phenotypes between DENV and ZIKV, this strongly supports that the NS5/IGF2BP2 interaction is specific to ZIKV. The specificity of the IGF2BP2 interaction with ZIKV NS5 compared to DENV NS5 is discussed in the updated manuscript.

      Author response image 1.

      DENV NS5 is not specifically co-immunoprecipitated with IGF2BP2-HA in contrast to ZIKV NS5. Huh7.5 cells stably expressing IGF2BP2-HA (+) and control cells (-) were infected with ZIKV H/PF/2013 at a MOI of 10 or left uninfected. Two days later, cell extracts were prepared and subjected to RNase A treatment (+) or not (-) before anti-HA immunoprecipitations. The resulting complexes were analyzed by western blotting for their abundance in the indicated proteins.

      Reviewer #3 (Recommendations For The Authors):

      (1) Statistical analysis. Please clearly indicate what columns and error bars represent for bar graphs such as those presented in Figures 1A-D and F, Figures 2B-C, and bottom panels in DE, Figure 3, Figure 5B, Figure 6B-C, and Figures 9B-D and F. For instance, the mean of n independent experiments and standard deviation.

      Information about the number of replicates, error bars, and statistical tests has been added for all figures in the legends. 

      (2) What is the scale in the Y-axis of Figure 2C? As shown, it is difficult to know what is the virus titer in knocked-down cells. Please use a linear scale or a log scale.

      This is a linear scale of viral titers, which we have modified to make it clearer for the reader.

      (3) Throughout the manuscript (e.g. Figures 1, 2, and 3) the fold reduction in titer is presented instead of the actual virus titers. I suggest showing the titer as it may be much more informative for the reader.

      We prefer showing the data as fold reduction as they better reflect the IGF2BP2 knockdown-induced phenotypes across the independent biological replicates. Indeed, from one experiment to another, the reference titers in the control condition sometimes vary because of the cell passage or the lentiviral transduction efficiency for instance, especially when low multiplicities of infection are used. However, the reduction phenotype in fold change observed upon IGF2BP2 knockdown was always consistent regardless of the titer value. Of note, all considered experiments had reference titers above 10^5 PFU/mL.
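
      The reasoning behind this choice can be illustrated with a small sketch: normalizing each knockdown titer to the control titer of the same replicate removes the experiment-to-experiment variation in the reference titer. The function names are hypothetical, and the titers in the usage note below are made-up values chosen only to show the arithmetic; they are not data from the study.

```python
from statistics import mean, stdev


def fraction_of_control(control_titers, knockdown_titers):
    """Per-replicate ratio of knockdown titer to the matched control titer."""
    if len(control_titers) != len(knockdown_titers):
        raise ValueError("each replicate needs one control and one knockdown titer")
    return [kd / ctrl for ctrl, kd in zip(control_titers, knockdown_titers)]


def summarize(ratios):
    """Mean and sample standard deviation of the per-replicate ratios."""
    spread = stdev(ratios) if len(ratios) > 1 else 0.0
    return mean(ratios), spread
```

      With hypothetical control titers of 2e5, 8e5 and 4e5 PFU/mL and matched knockdown titers of 4e4, 1.6e5 and 8e4 PFU/mL, the absolute titers differ fourfold between replicates, yet every replicate gives the same ratio of 0.2, that is, a fivefold reduction.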

      (4) Is it possible to perform a colocalization analysis of confocal images showing overlapping signals?

      This has been done and the results of these analyses are included in the updated figures 4B, 5A, S4 and S5.

      (5) Assessing the effect of Atlastin 2 knockdown on virus yield and showing co-immunoprecipitation of Atlastin 2 with NS5 can add relevant information.

      As mentioned in the discussion and above, ATL2 was already reported to be required for DENV and ZIKV replication in two independent studies (including one by one of the co-authors) (PMID: 31636417; 31534046). We have not tested whether ATL2 associates with NS5. However, the new Fig. S7 of the revised manuscript shows that the IGF2BP2/ATL2 association is RNA-dependent. This suggests that, as initially depicted in our model, IGF2BP2 associates with the ER (and thus, ATL2) after its binding to the viral RNA. Further interrogation into the role of atlastins in the flavivirus replication cycle is the focus of another ongoing IGF2BP2-unrelated study from one of the co-authors, which will be reported elsewhere.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The manuscript reports useful findings by resolving the crystal structure of Sedoheptulose-1,7-Bisphosphatase (SBPase) from the green algae Chlamydomonas reinhardtii, which is involved in the Calvin cycle. The data presented are solid based on validated methodologies, which help in understanding the structure and function of this enzyme.

      We thank the editors for this positive assessment.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this study, Le Moigne and coworkers shed light on the structural details of the Sedoheptulose-1,7-Bisphosphatase (SBPase) from the green algae Chlamydomonas reinhardtii. The SBPase is part of the Calvin cycle and catalyzes the dephosphorylation of sedoheptulose-1,7-bisphosphate (SBP), which is a crucial step in the regeneration of ribulose-1,5-bisphosphate (RuBP), the substrate for Rubisco. The authors determine the crystal structure of the CrSBPase in an oxidized state. Based on this structure, potential active site residues and sites of post-translational modifications are identified. Furthermore, the authors determine the CrSBPase structure in a reduced state revealing the disruption of a disulfide bond in close proximity to the dimer interface. The authors then use molecular dynamics (MD) to gain insights into the redox-controlled dynamics of the CrSBPase and investigate the oligomerization of the protein using small-angle X-ray scattering (SAXS) and size-exclusion chromatography. Despite the difference in oligomerization, disruption of this disulfide bond did not impact the activity of CrSBPase, suggesting additional thiol-dependent regulatory mechanisms modulating the activity of the CrSBPase.

      We thank reviewer 1 for his/her careful reading of our manuscript.

      The authors provide interesting new findings on a redox-mechanism that modulates the oligomeric behavior of the SBPase, however without investigating this potential mechanism in more detail. The conclusions of this manuscript are mostly supported by the data, but they should be more carefully evaluated in respect to what is known from other systems as e.g. the moss Physcomitrella patens. This is especially of interest, as SBPase was previously reported to be dimeric, whereas for FBPase a dimer/tetramer equilibrium has been observed.

      We thank reviewer 1 for his/her comments on the novel or confirmatory character of our structure-function analysis on CrSBPase. We address the questions of oligomeric states later in this response.

      (1) Given that PpSBPase has been already characterized in detail, the authors should provide a more rigorous comparison to the existing data on SBPases. This includes a more conclusive structural comparison but also the enzymatic assays should be compared to the findings from P. patens. Do the authors observe differences between the moss and the chlorophyte systems, maybe even in regard to the oligomerization of the SBPase?

      Indeed, a previous study conducted by one of the authors of the current manuscript (Stéphane D. Lemaire) and collaborators determined the structure and regulatory properties of SBPase from the moss Physcomitrella patens (Gütle et al. 2018 https://doi.org/10.1073/pnas.1606241113). We added a clearer reference to this earlier work. The differences that we observed regarding the oligomeric states of SBPase from Chlamydomonas reinhardtii principally stem from our in vitro analytical method, size-exclusion chromatography, in comparison with the crystal packing analysis in the reference study. We detailed the PpSBPase/CrSBPase oligomeric state comparison in the paragraph 'Oligomeric states of CrSBPase'. Besides, the asymmetric unit of our CrSBPase crystal structure is also a homodimer, similarly to PpSBPase, and we suggest that PpSBPase is also likely to adopt several oligomeric states in vitro. If this were confirmed by experiments, SBPase in several organisms would behave analogously to FBPase regarding the dimer/tetramer equilibrium.

      In paragraph 'Crystal structure of CrSBPase' we added a comparison by alignment of our CrSBPase crystal structure to the previously reported PpSBPase crystal structure, stating that with RMSD = 0.478 Å the proteins are essentially identical.
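
      As background for the RMSD values quoted in this response, the sketch below shows one common way to compute an RMSD after optimal rigid-body superposition (the Kabsch algorithm). It is an assumption-laden illustration: it presumes two already matched sets of equivalent atoms (for example, paired Cα coordinates), and it is not necessarily the procedure implemented by the alignment software used for the manuscript. The function name is hypothetical.

```python
import numpy as np


def kabsch_rmsd(coords_a, coords_b):
    """RMSD between two (N, 3) coordinate sets after optimal superposition.

    Assumes row i of coords_a corresponds to row i of coords_b.
    """
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape or a.ndim != 2 or a.shape[1] != 3:
        raise ValueError("inputs must both have shape (N, 3)")

    # Remove translation by centering both sets on their centroids.
    a_c = a - a.mean(axis=0)
    b_c = b - b.mean(axis=0)

    # Covariance matrix and its singular value decomposition.
    h = a_c.T @ b_c
    u, _, vt = np.linalg.svd(h)

    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Rotate set A onto set B and measure the residual deviation.
    diff = a_c @ rot.T - b_c
    return float(np.sqrt((diff ** 2).sum() / len(a)))
```

      Because the superposition minimizes the deviation over all matched atoms, a sub-ångström RMSD indicates that the backbones are nearly superimposable, while local differences in loops or side chains can still exist.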

      In paragraph 'CrSBPase enzymatic activity' we compared the value we obtained for enzyme specific activity to those previously published for other SBPases from Chlamydomonas or the land plant Spinacia oleracea, highlighting the similarity of results across three different systems and teams (Seuter et al. 2002 https://doi.org/10.1023/A:1019297521424 and Tamoi et al. 2005 DOI: 10.1271/bbb.69.848).

      (2) The authors should include the control experiments (untreated SBPase) and the assays performed with mutant versions of the SBPase, which are currently only mentioned in the text or not shown at all.

      We added supplementary figure 14 in order to illustrate that, since SBPase C115S or C120S mutants are still activated by reducing agent, the disulfide bridge between cysteines 115 and 120 is not the single control over SBPase activity but rather a control over the oligomeric exchange of the enzyme, indirectly contributing to redox activation of the active site.

      (3) The representation of the structure in figures (especially Figures 1 and 3) should be adjusted to match the author's statements. In Figure 1, the angle from which the structure is displayed changes over the entire figure making it difficult to follow especially as a non-structural biologist. Furthermore, important aspects of the structure mentioned in the text are not labeled and should be highlighted, by e.g. a close-up. Same holds true for Figure 3 that currently mostly shows redundant information.

      We thank reviewer 1 for his/her advice on how to improve Figure 1. We drew new images for the complete figure, hopefully providing more consistent and clearer visual support to our text. For simplicity, the protein is now always represented centered around its active site in the same orientation. We represent co-crystallized water in all projections as a guide to the eye.

      Figure 3 and supplementary figure 3 were switched in order to better represent the experimental evidence provided by the resolution of SBPase structure under reducing conditions, i.e., the increase in local disorder around C115-C120 pair of cysteines in the 113-130 stretch forming a redox-conditionally dynamic loop and β-hairpin motif.

      (4) The authors state that mutation of C115 and C120 to serine destabilize the dimer formation, while more tetramer and monomer is formed. As the tetramer is essentially a dimer of dimers, the authors should elaborate how this might work mechanistically. In my opinion, dimer formation is a prerequisite for tetramer formation and the two mutations rather stabilize the tetramer instead of destabilizing the dimer.

      The time-dependent dynamics of SBPase oligomer exchange are not resolved by the current study because we essentially combined size-exclusion chromatography (SEC) and X-ray crystallography to define quaternary structures at equilibrium. Overall, the homodimer is the dominant state of wild-type SBPase, both by abundance in the purified recombinant form and by forming the constitutive asymmetric unit in all crystal packings. The dimer is indeed present in the tetramer state, a dimer of dimers, as pertinently stated by reviewer 1.

      This being recognized, we tried to explain the systematic co-elution of the principal dimeric form with an additional species of smaller size on SEC (supplementary figure 1, right-side shoulder of the peak), at the apparent mass of a monomer. When solving the crystal structures of SBPase we realized that the dimer interface is contributed by residues 113-130 forming a loop and β-hairpin motif. Notably, in this loop cysteine 115 (C115) lies within bonding distance (3.9 Å) of the side chain of arginine 220 (R220) from the partner subunit of the dimer. In loop 113-120, the cysteine pair C115 and C120 is subject to redox switching between disulfide (closed) and dithiol (open) conformations, as shown in our structures 7B2O and 7ZUV, respectively. Given that the reduction of the C115-C120 disulfide bridge correlates with a higher flexibility of this motif that contributes to the dimer interface (figure S3), we hypothesized that reduction of SBPase would destabilize the dimer state to the benefit of a transitory monomer state, and indeed point mutagenesis of C115S or C120S caused a large modification of the oligomer equilibrium in favour of the monomer (figure S1C).

      Mechanistically, we suggest two scenarios for the tetramer formation: either monomers first interact as in the crystallographic dimer before pairing such dimers into tetramers (as proposed by reviewer 1), or monomers start tetramerization by favoring the alternative subunit interface (figure 5B, between cyan and magenta chains) before stabilizing the crystallographic homodimer interface. In this latter case, monomerization would be necessary to efficiently re-arrange SBPase dimers into tetramers.

      In physiological conditions, the re-arrangement switch would be controlled by C115-C120 reduction through the ferredoxin-thioredoxin redox cascade. Structural studies in dynamic conditions, such as native mass spectrometry or mass photometry, would be necessary to settle this question unambiguously, although at this stage of our investigation there seems little doubt to us that C115-C120 disulfide-dithiol exchange is essential to control the dimer/monomer balance in the first instance.

      Reviewer #2 (Public Review):

      The central theme of the manuscript is to report on the structure of SBPase - an enzyme central to the photosynthetic Calvin-Benson-Bassham cycle. The authors claim that the structure is first of its kind from a chlorophyte Chlamydomonas reinhardtii, a model unicellular green microalga. The authors use a number of methods like protein expression, purification, enzymatic assays, SAXS, molecular dynamics simulations and X-ray crystallography to resolve a 3.09 Å crystal structure of the oxidized and partially reduced state. The results are supported by the claims made in the manuscript. One of the main weaknesses of the work is the lack of wider discussion presented in the manuscript. While the structure is the first from a chlorophyte, it is not unique. Several structures of SBPase are available. As the manuscript currently reads, the wider context of SBPase structures available and comparisons between them is missing from the manuscript. Another important point is that the reported structure of CrSBPase is 0.453 Å away from the AlphaFold model. Though fleetingly mentioned in the methods section, it should be discussed to place it in the wider context.

      We thank reviewer 2 for his/her assessment of our manuscript. In response to his/her suggestion to better compare our SBPase structure from the model microalga Chlamydomonas reinhardtii to that of the ortholog from Physcomitrium patens previously reported by an author of this manuscript (Stéphane D. Lemaire) and collaborators (Gütle et al. 2018), we wish to point out that paragraph 3 of the introduction was dedicated to this reference, along with a mention of the related Thermosynechococcus elongatus dual-function fructose-1,6-bisphosphatase/sedoheptulose-1,7-bisphosphatase (F/SBPase). We nevertheless follow his/her suggestion to better detail the comparison between chloroplastic SBPase structures in the first result section 'Crystal structure of CrSBPase', consistent with response 1 to reviewer 1 (see above).

      Regarding the integration of AlphaFold (AF) computational models in a general discussion about SBPase molecular structure, we wish to point out that our initial 7B2O crystallographic model of CrSBPase was deposited in PDB on 2020-11-27 before AlphaFold2 was available for the scientific community (Jumper et al. publication date is 15 July 2021).

      AF2 entry AF-P46284-F1-model_v4 from the AlphaFold Protein Structure Database aligns with our crystal structure 7B2O chain E with RMSD = 0.434 Å, showing excellent agreement between experiment and prediction at the level of the protein main chain. It must still be pointed out that it is the AF2 model which lies 0.434 Å away from the experiment, and not the opposite. The exceptions to this agreement are local differences in several loop conformations and in the length of secondary structure elements. Many amino acid side chains adopt distinct orientations between the computational model and the experimental structure.

      AF3 was recently communicated (Abramson et al. 2024) along with its online prediction server hosted at https://golgi.sandbox.google.com. The CrSBPase model from AF3 aligns to our crystal structure 7B2O chain A with RMSD = 0.489 Å, again showing their strong similarity, with a smaller discrepancy between AF2 and AF3 of RMSD = 0.216 Å. The only significant deviations between 7B2O and AF3 are in the orientation of several side chains and notably in the conformation of region 114-131, which contains the redox sensor motif.

      We added the last two paragraphs to the revised version of the manuscript, after the results section presenting our crystallographic work.

      Recommendations for the authors:

      We made all recommended modifications as detailed below.

      Reviewer #1 (Recommendations For The Authors):

      I have outlined a number of minor points below.

      We addressed all minor points listed.

      Line 220: The asymmetric unit only contains three dimers. The dimer of dimer or tetramer can only be reconstituted by displaying the symmetry mates.

      We corrected our sentence to 'The asymmetric unit is composed of six polypeptide chains packing as three dimers'.

      I also suggest that the authors separate the description of the asymmetric unit content from the modeled water molecules and rephrase e.g. „..and four water molecules could be modeled."

      We rephrased as suggested.

      I appreciate that the authors uploaded the structure in advance of this article, which allowed to evaluate the quality of the structure. Although this does not add valuable information, I have identified several unmodeled blobs, which possibly also account for waters.

      Unmodeled blobs were tentatively assigned to water but had to be removed during later refinements. We used Coot Validate tools 'Unmodelled blobs' and 'Check/Delete water' to progress towards the current optimal refinement statistics. We admit that the resolution of the crystallographic dataset (3.09 Å) is limiting to reliably model mobile or less resolved elements like water molecules. Overall, we estimate that the functional elements of the structure are modeled to the best of our knowledge and with minimal subjectivity.

      Line 222: Please write 309 instead of spelling the number.

      We now write 309 instead of spelling out the number.

      Line 223: The structure representation in Figure 1A/B has to be improved. The authors might consider labeling the two domains and coloring them in two colors instead of the rainbow color coding. Furthermore, the 90° rotation does not add much information. Here, turning the model in a different direction that allows the central β-sheet of domain 2 to be seen might be better suited. Furthermore, instead of describing β-strands first, followed by α-helices, I suggest describing which secondary structure elements form the two domains.

      We improved Figure 1A as suggested while keeping Figure 1B with 90° rotation as a rainbow color gradient in order to display with clarity the secondary structure content and connectivity. The orientation was tilted to better display the central β-sheet. This new version of Figure 1A/B should facilitate the text description of SBPase architecture that we amended as suggested.

      Line 229: The information on A113-120 should be depicted in a closeup in Figure 1A.

      We added a close-up view of sequence 113-120 as new figure panels 1C-D and modified the rest of the figure and legend accordingly.

      Line 234: Please provide an r.m.s.d here.

      We now provide r.m.s.d. for all structural alignments.

      Line 242: Please introduce the domain labeling in Fig 1C to make it easier to track the exact region within SBP here. Is the residue numbering according to SBP or the human FBP?

      The modified version of figure 1 now shows SBPase in the same orientation for panels A, E, F, G, H for simplicity. Domain labeling is indicated in panel A with distinct NTD/CTD colors as suggested. We made the position of W401 explicit on all panels as a guide to the eye. We indicated in the figure legend that residue numbering is according to Chlamydomonas SBPase Uniprot entry P46284.

      Line 244: Is Figure 1D in the same orientation as C? I suggest making the surface transparent and showing the cartoon below, which will make it easier to see the solvent accessibility of the residues. Also, clearly label W401 (although it's the only water shown/modeled in this region).

      We modified figure 1 to show all equivalent panels (i.e. A-E-F-G-H) with the same orientation. In this new form we think that solvent accessibility and the relative position of significant residues are easier to interpret for the reader. W401 is consistently labeled throughout figure 1 panels.

      Line 263: Please provide a close-up of the C222 and C231 including measured distance. It's clearly not visible from this view. It might even be helpful to provide close-ups of all cysteine residues that are mentioned in the text.

      In the modified version of figure 1 we estimate that C222 and C231 are more easily visible. We added a close-up view of the C222-C231 environment in a new supplementary figure 2. Since we do not explore further the functional relevance of this redox pair, we chose not to include the C222-C231 close-up view in main figure 1. We added legends and modified supplementary figure numbering accordingly.

      Line 276: As already mentioned earlier, none of the panels in Figure 1 provide a close-up of this loop. This should be added.

      This loop is now displayed as a close-up view in panels C and D of main figure 1.

      Line 284: It is difficult to follow the relative positions of the potential modification sites if the model is always depicted from a different angle in Figure 1. The authors might want to change this across Figure 1 or show the rotation angle.

      This problem was addressed in the revised figure 1, panels A-E-F-G-H are in the same orientation now. Panel B was kept at a rotation of 90° with corresponding annotation.

      Line 290: Please label W401. Also stick to one nomenclature (W or H2O).

      We labeled W401 and kept nomenclature consistent throughout the manuscript.

      For comparative reasons, a full kinetic measurement (determination of Km and kcat) of the SBPase would also be helpful here.

      We resolved to avoid a full kinetic measurement of CrSBPase because we could neither identify a reliable chemical provider nor synthesize ourselves the physiological substrate sedoheptulose-1,7-bisphosphate (SBP) and only characterized the reaction with fructose-1,6-bisphosphate. However, in the revised form of the manuscript we added in main text paragraph 'CrSBPase enzymatic activity' the kinetic constants from the previous reference study conducted on spinach SBPase (Cadet and Meunier, Biochem. J. 1988) with KM(SBP) = 0.05 mM and kcat(SBP) = 81 sec⁻¹ of fully active enzyme with SBP as a substrate. For comparison, the authors of this study report that activity of SBPase on FBP is in the same range but lower, with KM(FBP) = 0.38 mM and kcat(FBP) = 21 sec⁻¹. We also added a comparison of specific activities of our CrSBPase and spinach SBPase in the main text, showing that our enzyme behaves like the previously reported ortholog from a land plant.
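
      As a worked illustration of the spinach constants quoted above, the sketch below compares the two substrates through the specificity constant kcat/KM and evaluates a simple Michaelis-Menten rate. The function names are hypothetical, and simple hyperbolic kinetics is an assumption made only for this example.

```python
def catalytic_efficiency(kcat_per_s, km_mM):
    """Specificity constant kcat/KM in mM^-1 s^-1."""
    return kcat_per_s / km_mM

def michaelis_menten_rate(kcat_per_s, km_mM, substrate_mM):
    """Turnover per active site (s^-1) under simple Michaelis-Menten kinetics."""
    return kcat_per_s * substrate_mM / (km_mM + substrate_mM)

# Constants for spinach SBPase as quoted in the response (Cadet and Meunier 1988).
sbp_efficiency = catalytic_efficiency(81.0, 0.05)   # substrate SBP
fbp_efficiency = catalytic_efficiency(21.0, 0.38)   # substrate FBP
ratio = sbp_efficiency / fbp_efficiency

# At a substrate concentration equal to KM, the rate is half of kcat.
half_max_rate = michaelis_menten_rate(81.0, 0.05, 0.05)

print(f"kcat/KM (SBP) = {sbp_efficiency:.0f} mM^-1 s^-1")
print(f"kcat/KM (FBP) = {fbp_efficiency:.1f} mM^-1 s^-1")
print(f"SBP/FBP efficiency ratio = {ratio:.1f}")
```

      With these values the enzyme is about 29-fold more efficient on SBP than on FBP in terms of kcat/KM.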

      Line 303: How much MgSO4 was used for the experiment shown in Figure 2A?

      10 mM of MgSO4 was used for the experiment shown in Figure 2A. We added this information in the figure legend. We also added in the legend that 10 mM DTT is present in the experiment of Figure 2B and that 10 mM of MgSO4 and 1 mM of DTT are present in the experiment of Figure 2C.

      Line 321: In my opinion it is not necessary to show the regions of all molecules here. I was rather expecting a superposition of the two structures (oxidized and reduced) with a close-up of the respective disulfide in the two states.

      We agree that the initial version of Figure 3, with panels showing side-by-side all conformational variants of the redox motif, appears redundant. We moved the initial Figure 3 to supplementary data and replaced it with the crystallographic b-factor mapping of the redox motif, in the variable conditions resolved by the crystals. We would like to stress that all these conformations were experimentally determined through X-ray crystallography, whether of the crystal of pure inactive enzyme that proved to be oxidized on the redox motif, or of the equivalent crystals submitted to activating treatment by the chemical reductant TCEP. In an attempt at clarification, we added visual boxes to better appreciate this reduction-induced conformational plasticity that we interpreted as a local conditional disorder.

      Line 331: Could the authors provide movies of the MD simulation? Otherwise, interpretation of the MD simulation results might be difficult for non-experts.

      We added two movies of 20-µsec MD simulations as supplementary data to help non-expert readers.

      Line 343: It might be helpful to label the structure elements in Figure 4 accordingly (e.g. residues, etc.)

      We added secondary structure labeling in Figure 4.

      Line 381: Should be changed to Figure 5A.

      We changed the reference to figure 6, which is a renumbering of figure 5 that includes changes from the suggestions below. Figure 6 now includes chromatograms of recombinant SBPase in panel A and chromatogram and western blot analysis of Chlamydomonas extracts in panel B.

      Line 383: See above, figure 5B. Which structure is shown in the figure? 7zuv or 7b2o? Maybe include both structures in the figure in a side-by-side view. The authors might also want to include the SEC chromatograms in the main figure. Especially the purification from Chlamydomonas is helpful to estimate whether post-translational modifications have an impact on the oligomerization. This should also be mentioned in the text.

      7b2o and 7zuv are illustrated side-by-side in panels A and B of figure 5. This was indicated in the figure legend, and we have now added the information on the figure. As suggested above, we included chromatograms initially presented as supplementary material in a new main figure 6, panel A for recombinant proteins and panel B for proteins extracted from Chlamydomonas. Initial figures 5D-E, showing surface conservation of the dimeric SBPase, are moved to supplementary figure 5.

      Line 385: I don't find the cultivation of Chlamydomonas in the method section. It should be added.

      We added a methods paragraph dedicated to « Cultivation of Chlamydomonas for native SBPase analysis ».

      Line 390-392: This information is not really helpful. Concentrated purified proteins might precipitate after a week storage without physiologically relevant effects being the reason.

      We agree that the observation of a precipitate building up in vitro after a week of storage bears no particular physiological implications. We rather intended to report that an aggregated form of purified protein can be turned to droplets under the redox conditions that activate the enzyme. We reformulated these lines for clarification.

      Line 397: I would appreciate having the SEC-chromatograms of the mutants also in the main figure.

      Size-exclusion chromatograms that were initially in supplementary figures are now shown in main text figure 6 panel A, with the profiles WT and mutants aligned.

      Line 402: Where are these data shown? They should be included in Figure 5.

      We added a figure to present these data, not shown in the initial version of the manuscript. We preferred to place it as supplementary material because C115S and C120S mutant catalytic activity is essentially the same as WT and does not reveal a direct mechanistic effect of C115-C120 reduction over the catalytic pocket.

      Line 427: Did the authors look into a possible cooperativity of their SBPase?

      We did not observe direct positive cooperativity that could be ascribed to allostery in our enzymatic assays. It was previously reported for spinach SBPase that SBP saturation functions were hyperbolic with no evidence of homotropic interactions in the enzyme oligomer (Cadet and Meunier Biochem J. 1988 253, 249-254). The authors of this kinetic study however present a clear sigmoid response of SBPase to Mg2+ concentration, suggestive of an activating cross-talk between active sites in the oligomer. We consider this hypothesis of interest and hope to further investigate allosteric conformational changes once the physiological substrate SBP becomes available.
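
      The distinction between a hyperbolic and a sigmoid response can be illustrated with the Hill equation. This is a minimal sketch with arbitrary parameters, not values fitted to any SBPase data. A Hill coefficient n = 1 gives a hyperbolic curve, while n > 1 gives the sigmoid shape usually read as positive cooperativity. The function name is hypothetical.

```python
def hill_fraction(ligand, k_half, n):
    """Fractional response at a given ligand concentration (Hill equation)."""
    return ligand ** n / (k_half ** n + ligand ** n)

# Arbitrary illustrative parameters: half-saturation at 1.0 (same units as ligand).
k_half = 1.0
for ligand in (0.25, 0.5, 1.0, 2.0, 4.0):
    hyperbolic = hill_fraction(ligand, k_half, 1)
    sigmoid = hill_fraction(ligand, k_half, 3)
    print(f"[L] = {ligand:4.2f}  n=1: {hyperbolic:.3f}  n=3: {sigmoid:.3f}")
```

      Below the half-saturation point the sigmoid curve lies under the hyperbolic one, and above it the order reverses, which is why a sigmoid Mg2+ response suggests communication between active sites.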

      Line 428-434: I don't really understand how the proteome mapping fits in here. Do the authors speculate that SBPase is recruited by some of the identified enzymes or directly interacts with them or that rather the spatial distribution optimizes the reaction kinetics?

      We indeed want to correlate our in vitro observations of CrSBPase conditions of activity to those recently published by the group of Dr. Martin Jonikas in a physiological, in vivo setup of Chlamydomonas reinhardtii (Wang, Patena et al. Cell 2023 186, 3499–3518). We have no experimental evidence demonstrating the first suggestion that SBPase is recruited or directly interacts with partner enzymes, but we favor the second suggestion that local spatial distribution in the chloroplast stroma optimizes enzyme reaction kinetics thanks to the proximity of Calvin-Benson-Bassham enzymes. We rephrased these lines to clarify our hypothesis and express its speculative character.

      Reviewer #2 (Recommendations For The Authors):

      To make the manuscript stronger, the authors are recommended to do the following:

      We followed given recommendations.

      (1) include a wider discussion on the other SBPase structures that are available. A detailed comparison should be made between the oxidized and reduced structures present in the PDB with the structures that are being reported in the manuscript.

      Consistent with reviewer #1's suggestion, and as detailed in response to the public review above, we followed the recommendation to better report previous structural studies of SBPase in the results section. We also added comparisons with computational models from AlphaFold2 and AlphaFold3.

      (2) The authors mention co-operativity between the subunits. With excellent sampling from molecular dynamics simulations, the authors should demonstrate co-operativity between the subunits.

      Our molecular dynamics (MD) simulations span 20 µsec of SBPase in the dimeric state, starting from the experimental structures determined by XRC. In the considered time window, the only significant events that we observed are the local reorganization of the LBH motif that is a prerequisite for dimer rearrangement. We infer that local disorder contributes to a separation of the pair of subunits in order to later allow for the building of the active homotetramer, at longer time scales that are outside the capacities used in this work. Moreover, demonstrating cooperativity with MD simulations would require more than a single event to ensure that results are significant, and performing a series of 20-µsec MD simulations of SBPase is also outside the available capacities.

    2. eLife assessment

      The manuscript reports valuable findings by resolving the crystal structure of Sedoheptulose-1,7-Bisphosphatase (SBPase) from the green algae Chlamydomonas reinhardtii, which is involved in the Calvin cycle. The data presented are solid, based on validated methodologies, and they help in understanding the structure and function of this enzyme.

    3. Reviewer #1 (Public Review):

      In this study, Le Moigne and coworkers shed light on the structural details of the Sedoheptulose-1,7-Bisphosphatase (SBPase) from the green algae Chlamydomonas reinhardtii. The SBPase is part of the Calvin cycle and catalyzes the dephosphorylation of sedoheptulose-1,7-bisphosphate (SBP), which is a crucial step in the regeneration of ribulose-1,5-bisphosphate (RuBP), the substrate for Rubisco. The authors determine the crystal structure of the CrSBPase in an untreated, oxidized state. Based on this structure, potential active site residues and sites of post-translational modifications are identified. Furthermore, the authors determine the CrSBPase structure in a reduced state revealing the disruption of a disulfide bond in close proximity to the dimer interface. The authors then use molecular dynamics (MD) to gain insights into the redox-controlled dynamics of the CrSBPase and investigate the oligomerization of the protein using small-angle X-ray scattering (SAXS) and size-exclusion chromatography. Despite the difference in oligomerization, disruption of this disulfide bond did not impact the activity of CrSBPase, suggesting additional thiol-dependent regulatory mechanisms modulating the activity of the CrSBPase.

      The authors provide interesting new findings on a redox-mechanism that modulates the oligomeric behavior of the SBPase. Comparisons of the Chlamydomonas structure to the previously determined SBPase structure from the moss Physcomitrium patens confirm a high structural similarity between the two proteins suggesting that this mechanism might be evolutionary conserved. Future research will have to address this question experimentally, also considering potential cooperativity between the subunits to confirm the link between oligomerization and SBPase activity.

    4. Reviewer #2 (Public Review):

      The central theme of the manuscript is to report on the structure of SBPase - an enzyme central to the photosynthetic Calvin-Benson-Bassham cycle. The authors claim that the structure is the first of its kind from a chlorophyte Chlamydomonas reinhardtii, a model unicellular green microalga. The authors use a number of methods like protein expression, purification, enzymatic assays, SAXS, molecular dynamics simulations and x-ray crystallography to resolve a 3.09 Å crystal structure of the oxidized and partially reduced state. The claims made in the manuscript are supported by the results. While the structure is the first from a chlorophyte, it is not unique. Several structures of SBPase are available and a comparison has been made between the structure reported here and others that have been previously published.

    1. eLife assessment

      This study provides a valuable strategy for treating mouse cutaneous squamous cell carcinoma (mCSCC) with serum derived from mCSCC-exposed mice. The exploration of serum-derived antibodies as a potential therapy for curing cancer is particularly promising but the study provides incomplete evidence for specific effects of mCSCC-binding serum antibodies. This study will be of interest to scientists seeking a novel immunotherapeutic strategy in cancer therapy.

    2. Combined Public Reviews:

      Summary:

      This study presents an immunotherapeutic strategy for treating mouse cutaneous squamous cell carcinoma (mCSCC) using a passive immunity-like strategy. The researcher induced tumors in the skin of healthy mice, then isolated the tumor cells and injected them into other healthy mice to produce anti-tumor antibodies, and then administered these antibodies back into tumor-bearing mice. Results showed a reduction in tumor volume and altered expression of several cancer markers (p53, Bcl-xL, NF-κB, Bax). The analysis of results suggests a promising impact of antibody-rich serum in treating mouse cutaneous squamous cell carcinoma (mCSCC).

      Strengths:

      The approach does seem to have an effect on preventing tumor progression, judging from both the tumor size and the expression levels of cancer hallmarks.

      Weaknesses:

      Despite the strength of the study, there are a few drawbacks in the study design and statistical analysis:

      (1) Regarding the statistical analysis, the use of a paired t-test might be suboptimal for assessing the trend from weeks 15 to 17. It is recommended to consider alternative methods such as repeated measures ANOVA or linear regression to better capture and interpret the trend over this time period.

      (2) To affirm the antibodies' role in the observed immune response, isolating antibodies rather than employing whole serum could provide more conclusive evidence. Comparative analyses with antibody-free serum or serum from healthy, non-immunized mice would clarify antibodies' specific contributions versus other serum components. The control group does not account for the potential immunostimulatory effects of serum injection itself. A better control would be tumor-bearing mice receiving serum from healthy non-mCSCC-exposed mice.
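
      The trend analysis suggested in point (1) can be sketched as a per-animal least-squares slope over weeks 15 to 17. This is only an illustration of the idea, not a replacement for a repeated measures ANOVA or a mixed model. The tumor volumes below are hypothetical and are not data from the study, and the function name is an assumption.

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

weeks = [15, 16, 17]
# Hypothetical tumor volumes (mm^3), for illustration only.
treated = {
    "mouse_1": [100.0, 80.0, 60.0],
    "mouse_2": [120.0, 105.0, 90.0],
}
# One slope per animal summarizes the change in volume per week.
slopes = {mouse: ols_slope(weeks, volumes) for mouse, volumes in treated.items()}
mean_slope = sum(slopes.values()) / len(slopes)
print(slopes, mean_slope)
```

      Comparing per-animal slopes between groups uses all three time points, whereas a paired t-test compares only two of them.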

      Response to author's rebuttal:

      I acknowledge the value of evaluating serum therapy as a whole, considering the complex interactive networks and potential synergies involved. However, to scientifically understand and assess serum therapy, it remains essential to decompose the serum and identify the effective components. This decomposition would allow for a comparison of individual components with the overall effectiveness, thereby elucidating any synergistic effects.

      While I agree that identifying specific epitopes and paratopes is indeed challenging and may exceed the scope of academic research, the use of methods such as Protein A purification or other techniques to isolate antibodies and cytokines from the serum is both necessary and feasible. This approach would enable a more detailed analysis of the individual effects of these components. I understand that the authors might not have the necessary resources, and I acknowledge this limitation. Nonetheless, other than this aspect, I believe the authors have adequately addressed my other concerns.

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study provides a useful strategy for treating mouse cutaneous squamous cell carcinoma (mCSCC) with serum derived from mCSCC-exposed mice. The exploration of serum-derived antibodies as a potential therapy for curing cancer is particularly promising but the study provides inadequate evidence for specific effects of mCSCC-binding serum antibodies. This study will be of interest to scientists seeking a novel immunotherapeutic strategy in cancer therapy.

      Joint Public Review:

      Summary:

      This study presents an immunotherapeutic strategy for treating mouse cutaneous squamous cell carcinoma (mCSCC) using serum from mice inoculated with mCSCC. The author hypothesizes that antibodies in the generated serum could aid the immune system in tumor volume reduction. The study results showed a reduction in tumor volume and altered expression of several cancer markers (p53, Bcl-xL, NF-κB, Bax) suggesting the potential effectiveness of this approach.

      Strengths:

      The approach shows a potential effect on preventing tumor progression, judging from both the tumor size and the cancer biomarker expression levels, bringing attention to the potential role of antibodies and B cell responses in cancer therapy.

      We greatly appreciate your positive feedback on our study.

      Weaknesses:

      These are some of the specific things that the author could consider to strengthen the evidence supporting the claims in their study.

      (1) The study fails to provide evidence of the specific effect of mCSCC-antibodies on mCSCC. The study utilized serum which also contains many immune response factors like cytokines that could contribute to tumor reduction. There is no information on serum centrifugation conditions, which makes it unclear whether immune components like antigen-specific T cells, activated NK cells, or other immune cells were removed from the serum. The study does not provide evidence of neutralizing antibodies through isolation, analysis of B cell responses, or efficacy testing against specific cancer epitopes. To affirm the specific antibodies' role in the observed immune response, isolating antibodies rather than employing whole serum could provide more conclusive evidence. Purifying the serum to isolate mCSCC-binding antibodies, such as through protein A purification, and ELISA would have been more useful to quantify the immune response. It would be interesting to investigate the types of epitopes targeted following direct tumor cell injection. A more thorough characterization of the antibodies, including B cell isolation and/or hybridoma techniques, would strengthen the claim.

      I am deeply appreciative of the reviewer's highly professional comments. Tumor development involves the coexistence of cancer cells at different developmental stages, each harboring a variety of known and unknown mutated proteins. These mutated proteins expose multiple known and unknown epitopes, each capable of stimulating the production of corresponding antibodies in healthy mice. Identifying all these antibodies presents a significant challenge. Current research methodologies, such as ELISA, WB, and ChIP, can only identify known antibodies based on existing antigens. A prerequisite for using these techniques is that both antigens and antibodies are identified. At present, there is no technology available to identify antibodies produced by an unknown mutated protein and epitope. However, I find the reviewer's comments insightful. Perhaps we can initially identify some known mCSCC-antibodies on mCSCC. However, studying the specific effect of these known mCSCC-antibodies on mCSCC is uncertain because we believe that tumor shrinkage results from the combined action of both known and unknown antibodies.

      We concur with the reviewer's observations regarding the use of serum, which is rich in immune response factors such as cytokines that could potentially contribute to tumor reduction. In our future research, we plan to systematically analyze the individual roles of these antibodies and cytokines in tumor reduction. In 1973, Nature published a report indicating that serum demonstrated promising results in tumor treatment (Immunotherapy of Cancer with Antibody in Rats. Nature 243, 492 (1973). https://doi.org/10.1038/243492b0). Since then, there have been scarcely any reports on serum therapy for tumors. The primary focus of our study is to evaluate the efficacy of serum therapy in treating tumors. We hypothesize that antibodies and cytokines form a complex interactive network, working in synergy to reduce tumors. Consequently, we believe that studying these antibodies and cytokines in isolation may not yield effective results.

      In this study, the methodology section outlines the process of serum preparation. It is important to note that serum is devoid of blood cells. I hypothesized that whole blood might have superior therapeutic effects compared to serum. This is because antibodies could potentially synergize with immune cells (including T cells, B cells, and NK cells), thereby enhancing the effectiveness of the treatment. As previously discussed, these antibodies, cytokines, and immune cells form a complex interactive network aimed at tumor reduction. Consequently, there are numerous factors that could influence the experimental outcomes, which presents a challenge for analyzing the results. Furthermore, the implementation of whole blood transfusion therapy introduces additional considerations, such as potential side effects and reactions associated with blood transfusions.

      We thank the reviewers for their suggestion to purify the serum in order to isolate mCSCC-binding antibodies. As we previously mentioned, separating a large number of both known and unknown serum antibodies presents a significant technical challenge. We are eager to discuss and consider suggestions from the reviewers regarding methods to identify a large variety and number of unknown antibodies on cells. Perhaps, as the reviewer suggested, we could begin with known antibodies and employ Protein A purification technology to purify these antibodies and subsequently detect immune responses. We could also categorize the types of epitopes targeted following direct tumor cell injection, and study these epitope types in further work. The suggestion to study the response of B cells is valuable, and we plan to conduct comprehensive research on the response and status of B cells in our future studies.

      The purification of antibodies to enhance the specificity of their effectiveness against tumors is a critical aspect of our study. However, we would like to address some concerns raised. (1) The separation of all antibodies and cytokines presents a significant technical challenge. In particular, there is a risk of overlooking antibodies that are present in low concentrations but play crucial roles. (2) Our primary concern is that studying these components in isolation would lose the overall effectiveness and compromise the holistic understanding of the study. This is akin to current research on traditional medicine, where the separation and individual study of compounds often result in a loss of overall therapeutic efficacy. For instance, consider a scenario where 100 antibodies collectively work to shrink a tumor. These antibodies interact with 20 cytokines, forming a complex network that enhances the cytokines' activity against tumor cells. Furthermore, many important antibodies and cytokines are currently unknown. Studying these antibodies in isolation could potentially result in the loss of this therapeutic effect. Therefore, in the discussion section, we have emphasized that our study considers a tumor mass, including tumor cells at various stages of development, as a single entity. As a practicing clinician, my primary focus is on the therapeutic outcomes in tumor treatments, even though the mechanisms of serum therapy remain largely elusive, like a black box.

      (2) In the study design, the control group does not account for the potential immunostimulatory effects of serum injection itself. A better control would be tumor-bearing mice receiving serum from healthy non-mCSCC-exposed mice. Additionally, employing a completely random process for allocating the treatment groups would be preferable. Also, the study does not explain why intravenous injection of tumor cells would produce superior antibodies compared to those naturally generated in mCSCC-bearing mice.

      I concur with the reviewer's perspective that using serum from healthy, non-mCSCC exposed mice as a control could potentially improve our study. Initially, our primary concern was to minimize harm to the mice and avoid excessive blood reactions, which led us to exclude the use of serum from healthy, non-mCSCC exposed mice in our control group. The main objective of our study was to investigate tumor shrinkage through serum treatment, specifically serum-derived antibodies. We anticipated that tumor-bearing mice receiving serum from healthy, non-mCSCC exposed mice would exhibit a response to the injected serum, which would manifest as a blood reaction. However, we did not expect this to result in a tumor treatment effect. If it turns out that normal serum (from healthy, non-mCSCC-exposed mice) possesses tumor-reducing properties, it would indeed be a novel discovery. We appreciate the reviewer's insightful suggestion and will consider incorporating it into our future research.

      We concur with the reviewer's observations that the use of a completely random process for assigning treatment groups would be more desirable. Indeed, the complete randomization of the entire process further underscores the efficacy and universality of serum therapy. In this study, we utilized paired mice to mitigate the risk of cross-infection and adverse reactions associated with blood transfusions. We deeply value the reviewer's expert feedback.  

      Lastly, intravenously injected tumor cells elicit antibodies superior to those naturally generated in mCSCC-bearing mice for the following reasons. As tumor cells grow, they produce a variety of mutated proteins to adapt to the immune microenvironment and evade the immune system of mCSCC-bearing mice. However, these tumor cells with mutated proteins are exceptionally sensitive and recognizable to healthy mice. This recognition triggers an immune response in healthy mice, leading to the production of specific therapeutic antibodies. This simultaneous production of diverse and abundant antibodies is only achievable by living organisms.

      (3) In Figure 2B, it would be more helpful if the author could provide raw data/figures of the tumor than just the bar graph. Similarly in Figure 3, the author should show individual data points in addition to the error bar to visualize the actual distribution.

      Raw data (numerical values) have been incorporated into Figures 2B and 3, but the data is placed in the table below the graph. If placed above the error bar, it requires a small font and may not be clear.

      (4) The author mentioned that different stages of tumor cells have different surface biomarkers. Therefore, experimenting with injecting tumor cells at various stages could reveal the most immunogenic stage. Such an approach would allow for a comparative analysis of immune responses elicited by tumor cells at different stages of development.

      Yes, throughout the course of tumor development, tumor cells at various stages will exhibit distinct markers or possess different mutated proteins. The concept of segregating tumor cells from different stages and independently comparing their immune responses is indeed commendable. Future research could involve isolating cells that express identical biomarkers at each stage for a comparative analysis of the immune responses triggered by the tumor cells. However, this approach diverges from the original intent of this study.

      Most tumor cells exist within the same developmental stage. However, this does not imply that all tumor cells within the tumor mass are at the same stage. For instance, a stage III liver cancer tumor may contain both stage I and stage IV tumor cells. Moreover, due to the complexity of tumor development, not all tumor cell surface markers are identical, even for tumors at the same stage. For instance, 20 major proteins and 100 minor proteins are implicated in tumor formation. In fact, random mutations in just 5 of these major proteins and 10 minor proteins can instigate the development of tumors. This implies that the protein pattern (tumor cell surface markers) associated with each individual's tumor is unique. While studying tumor cells at different stages separately allows for the observation of the immune response of tumor cells at each stage, it lacks a comprehensive research and treatment effect. For this reason, the design of this study treats a tumor mass as a whole, encompassing both the primary stage tumor cells and those not in that stage. These tumor cells are then injected to produce corresponding therapeutic antibodies. Furthermore, if tumor cells from only one stage are isolated and specific antibodies are produced against these cells, it could lead to immune escape of tumor cells at other stages, preventing the tumor from shrinking. Therefore, our approach aims to address this issue by considering the tumor mass as a whole.

      (5) In the abstract the author mentioned that using mCSCC is a proof-of-concept for this potential cancer treatment strategy. The discussion session should extend to how this strategy might apply to other cancer types beyond carcinoma.

      We have incorporated an additional paragraph in the discussion section where we delve into the concepts and experimental principles underpinning this study. This, we believe, addresses the reviewer's query regarding the applicability of our study's methodology to other types of tumors. The process for other tumors also involves isolating cells from the tumor, stimulating therapeutic antibody production in healthy mice using these cells, and ultimately reintroducing these antibodies into mice with tumors to facilitate tumor elimination.

      Recommendations For The Authors:

      The author is encouraged to refine the study's design in future studies considering the weaknesses highlighted above, summarize the results more effectively, and seek opportunities to expand on this promising idea and enhance the research's impact and applicability.

      We greatly appreciate the valuable suggestions provided by the editor and reviewers. These insights will certainly be addressed in our future research endeavors.

      Suggestions for title modification:

      Following the scope of the study, the term 'specific homologous neutralizing-antibodies' may be misleading as neutralizing antibodies typically refer to antibodies preventing viral cell entry. In cancer therapy, 'neutralization' is not a relevant concept, as cancer cells do not infect host cells. Using whole tumor cells as immunogens diverges from the specificity of traditional vaccination approaches that utilize well-defined proteins or antigens. Furthermore, the term "homologous" suggests a precision in targeting that is not demonstrated by reintroducing serum without isolating its specific components. Therapeutic effects should not be attributed to "neutralizing antibodies" without isolating or characterizing the antibody response or verifying their efficacy against specific cancer epitopes. Additionally, it is suggested that you indicate the biological system that your study utilised in the title. More so, this approach is not entirely novel, as seen with the use of adjuvants in some flu vaccines, or in Moderna's cancer vaccine mRNA-4157, which encodes up to 34 patient-specific tumor neoantigens. You can consider the title below or a variant of the same.

      Suggested title: Generating serum-based antibodies from tumor-exposed mice: a potential strategy in cutaneous squamous cell carcinoma treatment

      I concur with your suggestion and have modified the title to "Generating serum-based antibodies from tumor-exposed mice: a new potential strategy for cutaneous squamous cell carcinoma treatment". I believe this research remains somewhat new, hence the addition of the word "new". Furthermore, the term "novel" in the paper has been either removed or substituted.

      Moreover, I propose that this study shares similarities with Moderna's cancer vaccine mRNA-4157, albeit with certain differences. Moderna's cancer vaccine mRNA-4157 encodes up to 34 recognized neoantigens to stimulate an immune response by eliciting specific T cell responses. This is similar to the strategy of some companies developing a protein set for diagnosing lung cancer, liver cancer, among others. Without a doubt, these methods have improved the effectiveness of tumor diagnosis and treatment. However, I think that these methods currently face challenges in completely eradicating tumors because they perceive tumors as a static process and cells that express certain mutated proteins in a fixed manner. I believe that small molecule antibodies, cytokines, and immune cells present in serum that are difficult to detect, have low concentrations, or are unknown are essential for maintaining the expression of important mutant proteins and the escape of tumor cells. This is also the primary reason why tumors are difficult to treat and prone to recurrence at present.

      From my perspective, different tumors, as well as different stages of the same tumor, express varying mutated proteins or surface markers. Targeting some may result in others escaping or even creating a more conducive growth environment for those that do escape. Our study adopts a comprehensive view of a tumor block, encompassing tumor cells at different stages and tumor cells at the same stage but expressing different biomarkers. This approach generates a multitude of known and unknown antibodies that work in concert with cytokines and immune cells. While our method may not be capable of generating all mutated proteins and epitope antibodies due to the weakness of some antigens (epitopes of mutated proteins), it can still be effective. As long as the number of tumor cells is reduced below a certain threshold following multiple rounds of treatment with various antibodies produced at different stages, these cancer cells can be eradicated by the body's immune system. This is a process that is real-time and dynamic. Undoubtedly, if it becomes evident that alterations in a set of proteins can bolster the immune system and eradicate tumor cells, then the implications are significant. The immunotherapy proteins, which have demonstrated positive therapeutic effects, developed by certain companies are also predicated on this very principle.

      Finally, I greatly appreciate your suggestions, which will be considered and gradually addressed in future research.

    1. Reviewer #1 (Public Review):

      Summary:

      Mao and colleagues re-analysed published spatial, bulk and single-cell transcriptomic datasets from primary colorectal cancers and colorectal cancer derived liver metastases. The analyses of paired cancer and non-cancer tissue samples showed that T cells are enriched in tumour tissue, accompanied by a reduction in the fraction of NK cells in the cancer tissue transcriptional datasets. Furthermore, the authors show that tumour tissue has a higher fraction of GZMK+ (resting) NK cells and suggested a correlation between the presence of these cells and poor prognosis for cancer patients. In contrast, the increased frequency of KIR2DL4+ (activated) NK cells correlates with improved survival of cancer patients.

      Strengths:

      Authors performed a comprehensive analysis of published datasets, integrating spatial and single-cell transcriptomic data, which allowed them to discover enrichment of GZMK+ NK cells in cancer tissues.

      Weakness:

      The authors provided insufficient experimental evidence to support their claim that GZMK+ NK cells contribute to worse prognosis for cancer patients or promote cancer progression. While one can visually observe an increased fraction of GZMK+ NK cells compared to KIR2DL4+ NK cells in cancer tissues, no quantification is shown. They did not present any preclinical (animal model) or clinical data suggesting a causal relationship between NK cells and tumour growth. Thus, while a correlation may exist between the presence of GZMK+ NK cells and poorer tumour prognosis, causation cannot be claimed based on the available evidence. Furthermore, the in vitro data provided is limited to a single NK cell line derived from a lymphoma patient, which does not fully represent the diversity and functionality of human NK cells.

    2. Reviewer #2 (Public Review):

      Summary:

      This manuscript investigates the role of the abundant NK cells that are observed in colon cancer liver metastasis using sequencing and spatial approaches in an effort to clarify the pro and anti-tumorigenic properties of NK cells. This descriptive study characterizes different categories of NK cells in tumor and tumor adjacent tissues and some correlations. An attempt has been made using pseudotime trajectory analysis but no models around how these NK cells might be regulated are provided.

      Strengths:

      This study integrates multiomics data to attempt to resolve correlates of protection that might be useful in understanding NK cell diversity and activation. The authors have strengthened the study in revision by demonstrating the very strong correlation between Granzyme+ NK cells and the poor prognosis, but the main claims are only partially supported.

      Weaknesses:

      While this work is interesting, the power of such studies is in taking the discovered information and applying this to other cohorts to determine the strength and predictive power of the genes identified. It is also clear that these 'snapshots' analysed poorly take account of the dynamic temporal changes that occur within a tumour. It would have been good to see a proposed model of NK cell regulation as it might occur in the tumour (accounting for turnover and recruitment) beyond the static data. Further evidence linking mechanistic causality to prognostic outcome would provide significant data for approaches forward.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer#1:

      Comment #1: It is unclear how the fraction of NK cell populations is quantified in the spatial-seq datasets. Figures display spatial data with expression scores, but the method for calculating the score and determining NK cell presence in tumor tissue is ambiguous. Clarification is needed on whether the identification relied solely on visual inspection or if quantitative analyses using other criteria were conducted.

      Thank you for your questions. We removed the background and made the corresponding modifications as requested. We used the AddModuleScore function in Seurat to quantify the main immune subpopulations in spatial-seq using the gene sets identified in single-cell-seq. Additionally, the tumor and non-tumor regions were identified by immunohistochemistry as well as by cell clusters in spatial-seq; this delineation is coarse, so we cannot precisely quantify NK cell presence in each region. Nevertheless, the difference in NK cell presence between tumor and non-tumor regions is observable by visual inspection. The methodology has been added to the revised manuscript (lines 190-193).
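
      The idea behind a module score can be sketched as follows. This is a deliberately simplified illustration and not Seurat's implementation: it scores each spot as the mean expression of the gene set minus the mean expression of randomly drawn control genes, and it omits the expression-binned control selection that AddModuleScore performs. The function name `module_score`, the toy expression table, and the background genes "A" and "B" are hypothetical; NKG7 and KLRD1 are the NK marker genes named in the response.

```python
import random
from statistics import mean

def module_score(expr, gene_set, n_control=5, seed=0):
    """Simplified per-spot module score.

    expr: dict mapping gene name -> list of expression values, one per
          spot (all lists the same length). Hypothetical input layout.
    Returns one score per spot: mean(gene set) - mean(control genes).
    """
    present = [g for g in gene_set if g in expr]
    if not present:
        raise ValueError("no gene-set genes found in the expression table")
    # Control genes are drawn from everything outside the gene set.
    background = sorted(g for g in expr if g not in gene_set)
    rng = random.Random(seed)
    controls = rng.sample(background, min(n_control, len(background)))
    n_spots = len(expr[present[0]])
    scores = []
    for i in range(n_spots):
        set_mean = mean(expr[g][i] for g in present)
        ctrl_mean = mean(expr[g][i] for g in controls) if controls else 0.0
        scores.append(set_mean - ctrl_mean)
    return scores

# Toy table: two spots, two NK markers, two background genes.
toy = {"NKG7": [5, 0], "KLRD1": [3, 0], "A": [1, 1], "B": [1, 1]}
nk_scores = module_score(toy, ["NKG7", "KLRD1"], n_control=2)
```

      A positive score marks a spot where the gene set is expressed above background. Such per-spot scores could in principle be summarized per region, which is the kind of quantification the reviewer asks about.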

      Comment #2: The authors do not provide a clear definition of "resting" NK cells. It remains unclear whether they refer to a senescent state or a non-matured NK cell population. Furthermore, the criteria used to define resting and activated cells based on the expression of KIR2DL4, GPR183, GRP171, CD69, IFNG, GZMK, TTC38, CD160, and PLEKNF1 in Figure 4 are not well-defined. The expression patterns of these genes in Figure 4D are not distinct, and it is unclear which combination of genes was used to classify the populations. Clarification is needed on whether the presence of GZMK alone defines resting NK cells, or if the presence of any of the described genes (GZMK, TTC38, or CD160) is sufficient. Additionally, the method used for this classification, whether visual or algorithm-based, should be described.

      Thank you for your question. The resting and activated NK cells were defined by the preferential expression of the described resting NK genes (AZU, BPI, CAMP, CD160, CD2, CDHR1, CEACAM8, DEFA4, ELANE, GFI1, GZMK, KLRC4, MGAM, MS4A3, NME8, PLEKHF1, TEP1, TRBC1, TTC38, ZNF135) and activated NK genes (APOBEC3G, APOL6, CCL4, CCND2, CD69, CDK6, CSF2, DPP4, FASLG, GPR171, GPR18, GRAP2, IFNG, KIR2DL4, KIR2DS4, LTA, LTB, NCR3, OSM, PTGER2, SOCS1, TNFSF14) in CIBERSORT. In fact, these marker genes were not specifically expressed in a single NK cell subset. On the other hand, combined with further verification by flow cytometry, the resting NK cells tend to resemble decidual-like NK cells and tumor-infiltrated NK cells, with higher expression of CD9, CD49a and PD-1.

      Comment #3: Criteria used to define high or low NK cell presence/infiltration in Figure 5 are not described in the main text or figure legend. Since, the claim that the presence of the resting or activated NK cells predicts cancer prognosis is based on this figure, this needs to be clearly described.

      Thank you for your questions. The activated and resting NK cell percentages in TCGA and GSE29623 were determined by CIBERSORT. Additionally, the infiltration of activated and resting NK cells was also determined by the AddModuleScore function using the gene sets of activated and resting NK cells identified in single-cell-seq; the differences in activated and resting NK cell presence between tumor and non-tumor regions were also determined by visual inspection. We have amended the main text and figure legend in the revised manuscript.

      Comment #4: The absence of FMO controls for KIR2DL4 or GZMK and the lack of increase in GZMK expression during co-culture with tumour lines raises concerns since GZMK was used as a defining feature of resting NK cells.

      Thank you for your questions. We performed a new batch of flow cytometry experiments, with FMO controls set up for all markers used in the experiments, to define the precise positive gate locations.

      Author response image 1.

      The positive gate locations of CD56, GZMK, KIR2DL4, CD9, CD49a, PD-1 defined according to the FMO control.

      Comment #5: All the co-cultures were performed with tumour cell line only and no healthy cells, such as human foreskin fibroblasts, were used as control. In the absence of a non-tumour cell line, it is very difficult to draw any conclusions. Furthermore, to claim that resting or activated NK cells are responsible for tumour migration or proliferation, it is important to at least isolate resting and activated NK cells ex vivo and culture with tumour lines, instead of NK cell lines.

      Thank you for your questions. Following your suggestion, NK cells were co-cultured with human foreskin fibroblasts (HFF), and the phenotype was identified by flow cytometry. When co-cultured with HFF in direct contact (CN group), NK cells also tended towards a tissue-infiltration state (high expression of CD9). However, this domestication effect was significantly reduced compared with co-culture with tumor cells. Additionally, whereas supernatant from the co-culture system in which NK and HCT were in direct contact (CNS group) significantly increased the migration of fresh HCT, supernatant from the co-culture system in which NK and HFF were in direct contact (CNS group) produced only a limited increase in the migration of fresh HCT (no statistical significance was found), and no increase was seen when NK cells had been cultured in cell supernatant (SNS group) or in fresh medium (MNS group). Finally, we tried to isolate resting and activated NK cells from fresh colon cancer surgical specimens. Unfortunately, the NK cells were too few to perform further functional experiments such as migration and proliferation assays.

      Author response image 2.

      Phenotype switch of NK cells in different co-cultured system and the corresponding NK cell-mediated effect on cell migration of fresh colon cancer cell (HCT-116).

      A-B: NK cells underwent a phenotype switch (high expression of CD9) when co-cultured with HCT and HFF; the phenotype switch was more obvious when co-cultured with HCT. CN: NK cells co-cultured with HCT/HFF; SN: NK cells co-cultured with supernatant of HCT/HFF; MN: NK cells cultured in fresh medium. C-E: Transwell assay showed that only tumor co-cultured NK cells mediated the inductive effect on cell migration of colon cancer cells (HCT-116). CNS: Colon cancer cells were cultured in the supernatant from the co-culture system in which NK and HCT/HFF were cultured in direct contact; SNS: Colon cancer cells were cultured in the supernatant from the co-culture system in which NK cells were co-cultured with supernatant of HCT/HFF; MNS: Colon cancer cells were cultured in fresh medium.

      Comment #6: It seems that flow cytometric analyses and GZMK and KIR2DL4 staining were performed without cell permeabilization. Could authors confirm if this is accurate, or if they performed intracellular staining instead?

      Thank you for your questions. For GZMK, which is known to be a secretory protein, flow cytometric analyses were performed both with (Fig.3) and without cell fixation and permeabilization, and no significant differences were found among the groups. The difference is that GZMK was nearly all negative without fixation and permeabilization, whereas it was all positive with fixation and permeabilization. The conditions of flow cytometric analysis for GZMK may need further optimization, or GZMK may not be a suitable flow cytometric marker for resting NK cells. On the other hand, for membrane proteins such as CD56, CD9, CD49a, KIR2DL4 and PD-1, staining was performed without cell permeabilization.

      Author response image 3.

      Phenotype switch (CD56+, GZMK+) of NK cells was analyzed by FACS after fixation and permeabilization in different co-cultured groups. CN: NK cells cocultured with colon cancer cells; SN: NK cells cocultured with supernatant of cancer cells; MN: NK cells cocultured in fresh medium.

      Comment #7: The identity of the published datasets used for analysis is not provided, and references are not cited in the results section.

      Thank you for your questions. We apologize for this oversight in our previous version. We have added the information to the revised manuscript (Materials and Methods section, lines 123-128).

      Comment #8: References are difficult to locate, as the main text follows APA style while the reference section is organized numerically with no clear order.

      Thank you for your questions. We have modified the format of the references in the revised manuscript.

      Comment #9: Figure 3 shows volcano plots showing DEG genes between tumor and healthy tissue NK cells are not described clearly, and authors did not discuss the significance of these genes, highlighted in the plot.

      Thank you for your questions. The volcano plots in Figure 3 show the DEGs between colon cancer with metastasis and without metastasis in the TCGA database. We focused on the genes enriched in the pathway "Natural killer cell mediated cytotoxicity" and found that nearly all the genes enriched in this pathway were down-regulated in colon cancer with metastasis. We have modified the description in the results section and added a description of the importance of these genes to the discussion section of the revised manuscript (lines 322-326).

      Comment #10: The meaning of "M0" and "M1" in Figures 5A and 5B is unclear and should be defined in the text.

      Thank you for your questions. "M0" and "M1" in Figures 5A and 5B mean "colon cancer without metastasis" and "colon cancer with metastasis", respectively. We have clarified this in the revised manuscript (lines 350-354).

      Comment #11: Terms such as "dynamic remodelling of NK cells" and "landscape of NK cells" are used without explanation, necessitating clarification of their meaning.

      Thank you for your questions. We have clarified these terms in the revised manuscript (lines 331-334).

      Comment #12: In vitro assays are described vaguely, making it difficult for readers to understand. More clarity is needed in describing these assays.

      Thank you for your questions. We have added clarification in the revised manuscript (lines 205-211).

      Reviewer #2:

      Comment #1: This manuscript investigates the role of the abundant NK cells that are observed in colon cancer liver metastasis using sequencing and spatial approaches in an effort to clarify the pro and anti-tumorigenic properties of NK cells. This descriptive study characterises different categories of NK cells in tumor and tumor-adjacent tissues and some correlations. An attempt has been made using pseudotime trajectory analysis but no models around how these NK cells might be regulated are provided.

      Thank you for your questions. The single-cell sequencing data used in this study comprise CD45-positive immune cells and do not include tumor cells, so cellular communication analysis between NK cells and tumor cells cannot be conducted. The process of change in NK cells can only be predicted through pseudotime trajectory analysis. Our hypothesis is that tumor cells domesticate NK cells into tumor-infiltrated NK cells through direct contact, and flow cytometry experiments have also confirmed that tumor cells can exert such domestication only through direct contact with NK cells (with prominently high expression of CD9). However, the detailed mechanism remains unclear.

      Comment #2: A small number of patients are analyzed in this study. The descriptive gene markers, while interesting, need to be further validated to understand how strong this analysis might be and its potential application.

      Thank you for your questions. The sample size included in this study is indeed somewhat small, which is a limitation of our study. However, this is the only large-sample single-cell sequencing dataset we could find that includes primary colon cancer tissues, paired paratumor normal colon tissues, paired liver metastatic cancer tissue, and paired paratumor normal liver tissues. We will expand the sample size to further verify the current conclusions in subsequent experiments. In addition, the marker genes of the different NK groups used in this study follow CIBERSORT's classification of activated NK cells and resting NK cells, which is a widely recognized indicator. We will verify the expression and clinical application value of the screened genes in tissues in subsequent studies.

      Comment #3: Figure 1C and other figures throughout the paper. It is not clear how marker genes were selected.

      Thank you for your questions. The marker genes displayed in the Figure.3C were the highly variable genes of each cell group as well as the marker genes of each immune cell type, such as T cells (CD3D, CD3E), NK cells (NKG7, KLRD1), monocytes (LYZ, S100A8, S100A9), B cells (CD79A), plasma cells (JCHAIN, IGHA1, IGHA2), and neutrophils (CXCL8, FCGR3B).

      Comment #4: Figure 1E. P and T have not been defined. Lines should not connect the datasets as they are independent assessments.

      Thank you for your questions. P and T denote paratumor normal tissues and tumor tissues, respectively; these definitions have been added to the caption of Figure 1E. Additionally, the single-cell sequencing samples included in the study were paired, comprising primary colon cancer tissues, paired normal tissues adjacent to colon cancer, paired liver metastatic cancer tissue, and paired normal liver tissues from 20 colon cancer patients with liver metastasis; a paired test was therefore performed.

      Comment #5: Figure 2C. It is unclear what ST-P1 means. This is not a particularly informative figure.

      Thank you for your questions. We apologize; this was an annotation error on our part. The label refers to the spatial transcriptome of the primary colon cancer tissue and liver metastasis tissue of four patients. We have made the modifications in the revised manuscript.

      Comment #6: Multiple figures - abbreviations are used but not provided in the legend. They occur in the text but are not directly related to the figures where they are used to label axes or groups.

      Thank you for your questions. We have rechecked and made corresponding modifications in the revised manuscript.

      Comment #6: Patients: it is not clear what other drugs patients have been exposed to or basic data (sex, age, underlying conditions etc)

      Thank you for your questions. The baseline data of the patients in the SC dataset and ST dataset are shown in Table 1 and Table 2 below, respectively. They were not presented previously because no analysis related to patient characteristics was performed in the current study.

      Author response table 1.

      The baseline data of patient from single cell sequencing database.

      Author response table 2.

      The baseline data of patient from spatial transcriptome database.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Public Review): 

      In the manuscript "Mechanistic target of rapamycin (mTOR) pathway in Sertoli cells regulates age-dependent changes in sperm DNA methylation", the authors proposed to test if the balance of mTOR complexes in Sertoli cells may play a significant role in age-dependent changes in the sperm epigenome. The paper could be of interest and has a good scientific aim but there are too many drawbacks that hamper the initial enthusiasm. All sections need extensive revision. The paper is mostly descriptive without a mechanistic-orientated explanation for the observed results. 

      Comments on revised version: 

      I am not sure that the authors have made an attempt to clearly answer the reviewers comments that aimed to improve the quality of the manuscript. It stands as mostly descriptive and with limited interest as it is. 

      We are thankful to the reviewer for agreeing to review our revised manuscript. Unfortunately, we completely disagree with the evaluation provided by the reviewer. Research on sperm DNA methylation experienced a significant rise of interest in the current century and by now more than 2000 papers have been published. Although it was demonstrated that the sperm DNA methylome may be affected by almost every factor analyzed, no study was published to identify molecular mechanisms that may link these factors with the sperm epigenome. Our study is the FIRST to identify such a mechanism (mTOR complexes balance in Sertoli cells). More so, we demonstrated experimentally that manipulations of this mechanism allow regulation of the rates of epigenetic aging of sperm in both directions (accelerate aging or rejuvenate). Thus, our study provides a mechanistic background for the development of therapeutic interventions that may target sperm epigenome.

      We acknowledge that our study does not provide the full cascade of events linking the balance of mTOR complexes in Sertoli cells with the sperm DNA methylome. It suggests, however, the most plausible event next in a cascade (BTB permeability changes). Our group is working on this question now and we hope to provide the answer soon in a separate study. Even after that, we will be far from understanding the complete chain of molecular events that link mTOR and sperm methylome. It may take many years and significant effort of many research groups to dissect the whole cascade. It is worth mentioning that understanding of a complete cascade involved in pathology is not needed to develop efficient therapies if the critical nodes are known. For many common drugs (e.g. metformin) we do not know the full chain of molecular mechanisms but use them successfully.

      Thus, we believe that our study is mechanistic as it identified a critical mechanism manipulation of which allows experimental aging and rejuvenation of the sperm methylome. Additionally, it generates new mechanistic questions and hypotheses to be answered in the future.

      Reviewer #3 (Public Review): 

      Summary and Strength: 

      The manuscript by Amir et al. describes that Sertoli-specific inactivation of the mTORC1 and mTORC2 complex by KO of either Raptor or Rictor, respectively, resulted in progressive changes in blood-testis-barrier (BTB) function, testis weight, and sperm parameters, including counts, morphology, mtDNA content and sperm DNA methylation. 

      The described studies are based on the hypothesis that a decline of BTB function with increasing chronological age of a male contributes to the DNA methylation changes that are known to occur in sperm DNA of old males when compared to sperm DNA from isogenic young males. In order to demonstrate the relevance of a functioning BTB for the maintenance of sperm methylation patterns, the authors generated mice with genetically disrupted mTORC2 complex or mTORC1 complex in Sertoli cells and determined sperm methylation patterns in comparison to isogenic wild-type males. In line with previously published scientific literature (e.g. Mok et al., 2013; Dong et al, 2015; and others), the manuscript corroborates that a Sertoli-cell specific deletion of mTORC2 caused a loss of BTB function and a progressive spermatogenic defect. The authors further show that sperm DNA is differentially methylated (DMRs) as a consequence of either a mTORC2 disruption (associated with a loss of BTB function) or following a mTORC1 disruption (BTB function either increased or not leaky) when compared to their isogenic age-matched wt controls. Those DMRs overlap partially with changes in sperm DNA methylation that were found when comparing sperm from 8-week males with sperm isolated from 22-week-old male mice. 

      The authors interpret the observed changes as representative of the sperm DNA methylation changes that occur during normal chronological aging of the male. For an aged control group, the authors use sperm DNA of 22-week-old wild-type mates from the mTORC1 and mTORC2 KO breeding and compare the sperm methylation patterns found in sperm from those 22-week males to 8-week young males, that are intended to represent an old and a young cohort, respectively. DNA methylation analysis indicates that a disruption of mTORC2 (& decrease of BTB function) results in increased DNA methylation of sperm DNA, while a disruption of mTORC1 (and proposed increase of BTB tightness, not shown in the manuscript, though) resulted in increased hypomethylation. 

      Weaknesses: 

      While the hypothesis and experimental system are interesting and the data demonstrating the relevance of the mTORC2 complex for BTB function is convincing, several open questions limit the evidence that supports the hypothesis that the sperm DNA methylation changes seen in old males are caused by BTB failure following an imbalance of mTOR signaling complexes. The major critique points are the lack of a chronologically old group and the choice of 8 weeks and 22 weeks of age: 

      - Data illustrating the degree of BTB decline and sperm DNA methylation changes from chronologically "old" male mice is missing. 22-week-old mice are not considered old but are of good and mature breeding age, equivalent to humans in their mid-late twenties. (In the manuscript, the 22-week-old wildtype mice show no evidence of BTB breakdown (Figure 3), so why are their sperm used to represent "aged" sperm?) 

      - Adding a group of "old" wild-type mice of 12-14 months of age (which is closer to the end of effective reproduction in mice, more equivalent to 45-59 year-old humans) could be used to illustrate that (a) aging causes a marked decrease in BTB function at this time in mouse life, and that this BTB breakdown chronologically aligns with the age-associated DNA hypermethylation seen in old sperm. Age-matched "old" mTORC1 KO, with a (supposedly) tighter BTB barrier, could then be expected to have a sperm DNA methylation profile closer to that of younger wild-type animals. Such data are currently missing. While the progressive testicular decline observed in the mTORC1 KO (Fig.5) could make it difficult to obtain the appropriately aged mTORC1 KO tissues, it is completely feasible to obtain data from chronologically old wild-type males. (The progressive testicular decline further raises the question of what additional defects the KO causes, and how such additional defects would influence the sperm DNA methylation profile.) The addition of data from an old group to the currently included groups could strengthen the interpretation that the observations in the BTB-defective mTORC2 KO mice are modelling an age-related testicular decline, provided that the DMRs seen in the chronologically old group significantly overlap with the BTB-defective changes. 

      - In the current form, the described differences in sperm DNA methylation are based on comparisons between pubertal mice (8 weeks) and mature but not old adult males (22 weeks), while a chronologically "old" group is missing from the data sets and comparisons. Thus, it appears that the described sperm methylation changes reflect developmental changes associated with normal maturation and not necessarily declining sperm quality due to aging. (Sperm obtained from 8-week-old mice likely were generated, at least in part, during the 1st wave of spermatogenesis, which is known to differ from the continuously proceeding spermatogenesis during the remainder of the mature life. During the 1st wave of spermatogenesis, Sertoli cells are known to undergo gene expression changes which could contribute to varying degrees of BTB function, and thus have effects on the sperm DNA methylation profiles of such 1st wave sperm.) 

      - It is unclear why the aging-related DMRs between the 8 and 22-week-old wild-type mice vary so dramatically between the two wild-type groups derived from the mTORC1 and the mTORC2 breeding (Fig. S4). If the main difference was due to mTORC1 or mTORC2 activity, both wildtype groups should behave very similarly. Changes seen in a truly "old" mouse (e.g. 20 weeks to 56 weeks), changes in "young mTORC1" and in "old mTORC2" are missing. How do those numbers and profiles compare to the shown samples?

      Comments on latest version: 

      The rebuttal letter and public response indicate the authors' reluctance to consider the limitations of their study, i.e. having chosen chronologically young animals to demonstrate a sperm aging effect and indicate that they are not willing to include adequate controls. 

      Since there is no evidence that mice at this young age have a deteriorating blood-testis-barrier (indeed, normal intact BTB is clearly visible in the figures included in this study from animals of the relevant age group), the whole central hypothesis that the study is built upon (i.e. that increasing age causes deteriorating BTB integrity which in turn causes age-related changes in sperm DNA methylation), appears irrelevant or invalid. 

      The authors' claim that age-related DNA methylation changes in sperm occur in a linear fashion and that the changes are somewhat proportional to chronological age is in stark contrast to the claim that a decline of the BTB in old animals is causative for age-related sperm epigenetic changes, putting the relevance of the whole study in question.

      We are thankful to the reviewer for agreeing to review our revised manuscript. We disagree with the evaluation provided by the reviewer, however.

      First, the reviewer misinterpreted the hypothesis of the study, although it is formulated in the last sentence of the Introduction:  “ … we hypothesized that the balance of mTOR complexes in Sertoli cells may also play a significant role in age-dependent changes in the sperm epigenome.” Instead, the reviewer assigned a different hypothesis to our study (that BTB integrity changes are responsible for age-dependent changes in sperm DNA methylation) and criticized us for not providing clear testing of this hypothesis.

      To clarify, we believe that our study provides high-quality testing of OUR hypothesis, as we demonstrated experimentally that manipulating the balance of mTOR complexes in Sertoli cells allows acceleration and deceleration of epigenetic aging of sperm. Additionally, our study generated a hypothesis that BTB permeability may mediate the effects of the mTOR pathway on the sperm methylome. This second hypothesis is to be tested in future research.

      We also disagree with the reviewer's interpretation of the aging process as an abrupt transition from a young, healthy, and undamaged state to an old, moribund, and damaged state. The whole body of biogerontological knowledge suggests instead a steady accumulation of damage over long periods of time. For example, this understanding of steady change at the molecular level allowed the development and successful use of epigenetic clock and other molecular clock models, including several variants of sperm epigenetic clocks. These models clearly demonstrate linear or semi-linear accumulation of DNA methylation changes in various tissues and biological species across the whole lifespan. It is reasonable to assume that BTB integrity decreases steadily with age as well and that in younger animals this decrease may not be easily detected by the existing analytical methods. To our knowledge, experimental data showing the dynamics of BTB deterioration with age do not exist, although it has been demonstrated that older animals have a looser BTB than young animals. We agree with the reviewer that future studies testing the role of BTB deterioration in sperm methylome aging will need to provide such evidence. It was not the subject of the current study, however.


      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      In the manuscript "Mechanistic target of rapamycin (mTOR) pathway in Sertoli cells regulates age-dependent changes in sperm DNA methylation", the authors proposed to test if the balance of mTOR complexes in Sertoli cells may play a significant role in age-dependent changes in the sperm epigenome. The paper could be of interest and has a good scientific aim but there are too many drawbacks that hamper the initial enthusiasm. All sections need extensive revision. The paper is mostly descriptive without a mechanistic-orientated explanation for the observed results.

      Specific comments:

      (1) The abstract is poorly written. There is a lot of unnecessary introduction that does not provide a rationale for the work. It is not possible to understand the experimental approach or the major data just by reading the abstract. It does not clearly represent the work.

      - We have added details of experimental design and results to the abstract and reduced the introductory part of the abstract.

      (2) The introduction is somewhat vague and does not provide a clear rationale for the hypothesis. There should be more focus on the role of mTOR in Sertoli cells, which goes far beyond the BTB. That will give more focus on mTOR. Then it is important to focus on BTB and mTOR: what is known? What is the gap and how can it be solved? Several relevant references concerning mTOR and Sertoli cells are missing.

      - The goal of this study was not to explore all potential roles of the mTOR pathway in Sertoli cells, but to test if shifts in the balance of mTOR complexes regulate (accelerate/decelerate) epigenetic aging of sperm. As such, we disagree with the reviewer and consider that the current Introduction provides a focused rationale for the study.

      (3) The Material and Methods section needs improvement. There is much important information missing. For instance: how many animals were used per group and how was the breeding done? At what age? Statistical analysis should be explained in detail.

      - The number of animals was clearly stated in the original manuscript. We have added details of breeding and statistical analysis. 

      (4) The results description could be improved. It is vague without highlighting how much difference was detected. The results should be numerically described when possible and the differences should be highlighted. A 10% difference may be significant but not biologically relevant. To correctly evaluate the differences it is important to describe them with some degree of detail.

      - For all DNA methylation experiments we provide numerical characteristics of methylation changes, including numbers of DMRs, % change, significance, correlation coefficients. We believe that only age- and genotype-associated changes in reproductive parameters were not characterized in our manuscript in detail. We have added Table 1 to provide these numbers.

      (5) There is no discussion of the data. The authors just summarize their findings without a comprehensive analysis of the literature and how the effects can be mediated. mTOR interacts with different pathways (mTORC1 and mTORC2 are even mediators of distinct pathways). This would be very relevant to discuss. In addition, there are many study limitations not discussed. There is no clear mechanistic explanation of the way by which the mTOR pathway in Sertoli cells regulates age-dependent changes in sperm DNA methylation. The paper seems preliminary.

      - We have added an additional paragraph to the discussion to highlight a potential molecular mechanism that links mTOR pathway with the sperm epigenome.

      (6) Figure 1 is too simple and does not provide any schematic support for the text.

      - We disagree with the reviewer and believe that the figure represents a good visualization of our hypothesis useful for the perception of the study.

      (7) Figure 2 lacks some detail. For instance, how many animals were used for each step?

      - Numbers of animals are provided in the text of the paper.

      (8) Taking into consideration the roles of mTOR on sperm, particularly mTORC1, it is not clear whether there were any differences in sperm motility.

      - We did not assess sperm motility in this study. 

      Reviewer #2 (Public Review):

      In this study, the authors hypothesized that the balance of mTOR complexes in Sertoli cells may also play a significant role in age-dependent changes in the sperm epigenome. To test this hypothesis, the authors use transgenic mice with manipulated activity of mTOR complexes in Sertoli cells. These results suggest that the mTOR pathway in Sertoli cells may be used as a novel target of therapeutic interventions to rejuvenate the sperm epigenome in advanced-age fathers.

      The authors attempt to demonstrate that the balance of mTOR complexes in Sertoli cells regulates the rate of sperm epigenetic aging. The authors have effectively met their research objectives, and their conclusions are supported by the data presented.

      - We are very thankful for the positive evaluation of our study.

      Reviewer #3 (Public Review):

      Summary and Strength:

      The manuscript by Amir et al. describes that Sertoli-specific inactivation of the mTORC1 and mTORC2 complex by KO of either Raptor or Rictor, respectively, resulted in progressive changes in blood-testis-barrier (BTB) function, testis weight, and sperm parameters, including counts, morphology, mtDNA content and sperm DNA methylation.

      The described studies are based on the hypothesis that a decline of BTB function with increasing chronological age of a male contributes to the DNA methylation changes that are known to occur in sperm DNA of old males when compared to sperm DNA from isogenic young males. In order to demonstrate the relevance of a functioning BTB for the maintenance of sperm methylation patterns, the authors generated mice with genetically disrupted mTORC2 complex or mTORC1 complex in Sertoli cells and determined sperm methylation patterns in comparison to isogenic wild-type males. In line with previously published scientific literature (e.g. Mok et al., 2013; Dong et al, 2015; and others), the manuscript corroborates that a Sertoli-cell specific deletion of mTORC2 caused a loss of BTB function and a progressive spermatogenic defect. The authors further show that sperm DNA is differentially methylated (DMRs) as a consequence of either a mTORC2 disruption (associated with a loss of BTB function) or following a mTORC1 disruption (BTB function either increased or not leaky) when compared to their isogenic age-matched wt controls. Those DMRs overlap partially with changes in sperm DNA methylation that were found when comparing sperm from 8-week males with sperm isolated from 22-week-old male mice.

      The authors interpret the observed changes as representative of the sperm DNA methylation changes that occur during normal chronological aging of the male. For an aged control group, the authors use sperm DNA of 22-week-old wild-type mates from the mTORC1 and mTORC2 KO breeding and compare the sperm methylation patterns found in sperm from those 22-week males to 8-week young males, that are intended to represent an old and a young cohort, respectively. DNA methylation analysis indicates that a disruption of mTORC2 (& decrease of BTB function) results in increased DNA methylation of sperm DNA, while a disruption of mTORC1 (and proposed increase of BTB tightness, not shown in the manuscript, though) resulted in increased hypomethylation.

      Weaknesses:

      While the hypothesis and experimental system are interesting and the data demonstrating the relevance of the mTORC2 complex for BTB function is convincing, several open questions limit the evidence that supports the hypothesis that the sperm DNA methylation changes seen in old males are caused by BTB failure following an imbalance of mTOR signaling complexes. The major critique points are the lack of a chronologically old group and the choice of 8 weeks & 22 weeks of age:

      - Data illustrating the degree of BTB decline and sperm DNA methylation changes from chronologically "old" male mice is missing. 22-week-old mice are not considered old but are of good and mature breeding age, equivalent to humans in their mid-late twenties. (In the manuscript, the 22-week-old wildtype mice show no evidence of BTB breakdown (Figure 3), so why are their sperm used to represent "aged" sperm?)

      - Adding a group of "old" wild-type mice of 12-14 months of age (which is closer to the end of effective reproduction in mice and more equivalent to 45-59 year-old humans) could be used to illustrate that (a) aging causes a marked decrease in BTB function at this time in mouse life, and (b) this BTB breakdown chronologically aligns with the age-associated DNA hypermethylation seen in old sperm. Age-matched "old" mTORC1 KO, with a (supposedly) tighter BTB barrier, could then be expected to have a sperm DNA methylation profile closer to that of younger wild-type animals. Such data are currently missing. While the progressive testicular decline observed in the mTORC1 KO (Fig.5) could make it difficult to obtain the appropriately aged mTORC1 KO tissues, it is completely feasible to obtain data from chronologically old wild-type males. (The progressive testicular decline further raises the question of what additional defects the KO causes, and how such additional defects would influence the sperm DNA methylation profile.) The addition of data from an old group to the currently included groups could strengthen the interpretation that the observations in the BTB-defective mTORC2 KO mice are modelling an age-related testicular decline, provided that the DMRs seen in the chronologically old group significantly overlap with the BTB-defective changes.

      - In the current form, the described differences in sperm DNA methylation are based on comparisons between pubertal mice (8 weeks) and mature but not old adult males (22 weeks), while a chronologically "old" group is missing from the data sets and comparisons. Thus, it appears that the described sperm methylation changes reflect developmental changes associated with normal maturation and not necessarily declining sperm quality due to aging. (Sperm obtained from 8-week-old mice likely were generated, at least in part, during the 1st wave of spermatogenesis, which is known to differ from the continuously proceeding spermatogenesis during the remainder of the mature life. During the 1st wave of spermatogenesis, Sertoli cells are known to undergo gene expression changes which could contribute to varying degrees of BTB function, and thus have effects on the sperm DNA methylation profiles of such 1st wave sperm.)

      - It is unclear why the aging-related DMRs between the 8 and 22-week-old wild-type mice vary so dramatically between the two wild-type groups derived from the mTORC1 and the mTORC2 breeding (Fig. S4). If the main difference was due to mTORC1 or mTORC2 activity, both wildtype groups should behave very similarly. Changes seen in a truly "old" mouse (e.g. 20 weeks to 56 weeks), changes in "young mTORC1" and in "old mTORC2" are missing. How do those numbers and profiles compare to the shown samples?

      Some general comments regarding the chosen age of animals:

      - As mentioned, sperm from 8-week-old mice represent many sperm that were produced in the 1st wave of spermatogenesis; 22-week-old mice are not considered chronologically old mice, but mature and "relatively" young animals. 18-24 month-old mice are considered to be equivalent to 56-69 year-old humans, and might be more suitable to detect aging effects. "Old mice" for study purposes should be at least 12-14 months of age, ideally >18 months of age. 22 weeks (5 months of age) are mice at good breeding age, but still considered mature adults, not old males, and therefore are not expected to show typical aging health problems (like declining fertility).

      Even the cited reference (Flurkey et al. 2007) states that mice used as a reference group for "young mice" should be at least 3 months of age (~13 weeks), i.e. fully sexually mature. The authors specifically state: "The young adult group should be at least 3 months old because, although mice are sexually mature by 35 days, relatively rapid maturational growth continues for most biologic processes and structures until about 3 months. The upper age range for the young adult group is typically about 6 months. ... For the middle-aged group, 10 months is typically the lower limit.... The upper age limit for the middle-aged group is typically 14-15 months, because at this age, most biomarkers still have not changed to their full extent, and some have not yet started changing. For the old group, the lower age limit is 18 months because age-related change for almost all biomarkers of aging can be detected by then. The upper limit is 22-26 months, depending on the genotype." According to this reference, mice up to 6 months of age are generally considered "mature adults" (equivalent to humans of 20-30 yrs), mice of 10-14 months are "middle-aged adults" (equivalent to ~38-47 human years) and 18-24 month mice are "old" (equivalent to humans of 56-69 yrs).

      Going on these commonly used age ranges, it is unclear why the authors used 8-week-old mice (generally considered pubertal to late adolescent age) as young mice and 5-month-old mice as "old mice".

      Differences seen between these cohorts most likely do not reflect aging, but more likely reflect changes associated with normal developmental maturation, since testis and epididymides continue to grow until about 10-11 weeks of age.

      - The DMRs identified between 8 and 22-week-old animals could represent DMRs that are dependent on developmental maturation more than being changed in an "age-dependent" manner (in the sense of increased chronological age). This interpretation is congruent with the fact that those DMRs are enriched for developmental categories.

      - We are thankful to the reviewer for a detailed explanation of their disagreement with the ages of mice used in this study. In short, the reviewer suggests that our older group (22 weeks) is not old enough to represent aged animals and our young group (8 weeks) may still have spermatozoa from the first wave of spermatogenesis, and as such the observed differences between the 2 ages cannot be considered as aging-related but rather may represent different stages of maturation of the reproductive system. At first glance this criticism looks valid.

      However, to design our experiments we used our data that were not initially included in this manuscript. These data demonstrated that age-dependent changes in sperm DNA methylation are linearly or semi-linearly associated with age in the age range from 56 to 334 days. Thus, within this interval any 2 ages, distant enough to register the difference in DNA methylation, can be used to assess age-dependent changes in DNA methylation and changes in the rates of epigenetic aging of sperm in response to genetic manipulations. We have now added these results: see the “Identification of age-dependent patterns in sperm DNA methylation” section in Material and Methods and “Patterns of age-dependent changes in sperm DNA methylation” in Results. We also consider the reviewer’s suggestion that sperm from 8-week-old mice represents the first wave of spermatogenesis to be unfounded. Indeed, C57BL/6 mice first have fertile sperm in the cauda epididymis at 37 days of age [1], 19 days earlier than the age of 56 days (8 weeks) at which sperm was collected in our study in the youngest group of mice. Given that young C57BL/6 mice ejaculate spontaneously around 3 times per 5 days [2], 8-week-old mice have ejaculated > 10 times since the first wave of spermatogenesis before the sperm was collected for our study, making the chances of survival of any first-wave sperm in their cauda epididymides to the age of 8 weeks negligibly small. We have added this information to the text.

      (1) Mochida, K.; Hasegawa, A.; Ogonuki, N.; Inoue, K.; Ogura, A. Early Production of Offspring by in Vitro Fertilization Using First-Wave Spermatozoa from Prepubertal Male Mice. J. Reprod. Dev. 2019, 65, 467–473, doi:10.1262/jrd.2019-042.

      (2) Huber, M.H.; Bronson, F.H.; Desjardins, C. Sexual Activity of Aged Male Mice: Correlation with Level of Arousal, Physical Endurance, Pathological Status, and Ejaculatory Capacity. Biol. Reprod. 1980, 23, 305–316, doi:10.1095/biolreprod23.2.305.

    2. eLife assessment

      This potentially important study addresses the effects of aging on the sperm epigenome and its consequences for reproductive health. The evidence supporting the main claim remains incomplete. This study will be of interest to researchers working on aging and reproductive health.

    3. Reviewer #1 (Public Review):

      In the manuscript "Mechanistic target of rapamycin (mTOR) pathway in Sertoli cells regulates age-dependent changes in sperm DNA methylation", the authors proposed to test if the balance of mTOR complexes in Sertoli cells may play a significant role in age-dependent changes in the sperm epigenome. The paper could be of interest and has a good scientific aim but there are too many drawbacks that hamper the initial enthusiasm. All sections need extensive revision. The paper is mostly descriptive without a mechanistic-orientated explanation for the observed results.

      Comments on revised version:

      I am not sure that the authors have made an attempt to clearly answer the reviewers' comments that aimed to improve the quality of the manuscript. It stands as mostly descriptive and with limited interest as it is.

    4. Reviewer #3 (Public Review):

      Summary and Strength:

      The manuscript by Amir et al. describes that Sertoli-specific inactivation of the mTORC1 and mTORC2 complex by KO of either Raptor or Rictor, respectively, resulted in progressive changes in blood-testis-barrier (BTB) function, testis weight, and sperm parameters, including counts, morphology, mtDNA content and sperm DNA methylation.

      The described studies are based on the hypothesis that a decline of BTB function with increasing chronological age of a male contributes to the DNA methylation changes that are known to occur in sperm DNA of old males when compared to sperm DNA from isogenic young males. In order to demonstrate the relevance of a functioning BTB for the maintenance of sperm methylation patterns, the authors generated mice with genetically disrupted mTORC2 complex or mTORC1 complex in Sertoli cells and determined sperm methylation patterns in comparison to isogenic wild-type males. In line with previously published scientific literature (e.g. Mok et al., 2013; Dong et al, 2015; and others), the manuscript corroborates that a Sertoli-cell specific deletion of mTORC2 caused a loss of BTB function and a progressive spermatogenic defect. The authors further show that sperm DNA is differentially methylated (DMRs) as a consequence of either a mTORC2 disruption (associated with a loss of BTB function) or following a mTORC1 disruption (BTB function either increased or not leaky) when compared to their isogenic age-matched wt controls. Those DMRs overlap partially with changes in sperm DNA methylation that were found when comparing sperm from 8-week males with sperm isolated from 22-week-old male mice.

      The authors interpret the observed changes as representative of the sperm DNA methylation changes that occur during normal chronological aging of the male. For an aged control group, the authors use sperm DNA of 22-week-old wild-type mates from the mTORC1 and mTORC2 KO breeding and compare the sperm methylation patterns found in sperm from those 22-week males to 8-week young males, that are intended to represent an old and a young cohort, respectively. DNA methylation analysis indicates that a disruption of mTORC2 (& decrease of BTB function) results in increased DNA methylation of sperm DNA, while a disruption of mTORC1 (and proposed increase of BTB tightness, not shown in the manuscript, though) resulted in increased hypomethylation.

      Weaknesses:

      While the hypothesis and experimental system are interesting and the data demonstrating the relevance of the mTORC2 complex for BTB function is convincing, several open questions limit the evidence that supports the hypothesis that the sperm DNA methylation changes seen in old males are caused by BTB failure following an imbalance of mTOR signaling complexes. The major critique points are the lack of a chronologically old group and the choice of 8 weeks & 22 weeks of age:

      - Data illustrating the degree of BTB decline and sperm DNA methylation changes from chronologically "old" male mice is missing. 22-week-old mice are not considered old but are of good and mature breeding age, equivalent to humans in their mid-late twenties. (In the manuscript, the 22-week-old wildtype mice show no evidence of BTB breakdown (Figure 3), so why are their sperm used to represent "aged" sperm?)

      - Adding a group of "old" wild-type mice of 12-14 months of age (which is closer to the end of effective reproduction in mice and more equivalent to 45-59 year-old humans) could be used to illustrate that (a) aging causes a marked decrease in BTB function at this time in mouse life, and (b) this BTB breakdown chronologically aligns with the age-associated DNA hypermethylation seen in old sperm. Age-matched "old" mTORC1 KO, with a (supposedly) tighter BTB barrier, could then be expected to have a sperm DNA methylation profile closer to that of younger wild-type animals. Such data are currently missing. While the progressive testicular decline observed in the mTORC1 KO (Fig.5) could make it difficult to obtain the appropriately aged mTORC1 KO tissues, it is completely feasible to obtain data from chronologically old wild-type males. (The progressive testicular decline further raises the question of what additional defects the KO causes, and how such additional defects would influence the sperm DNA methylation profile.) The addition of data from an old group to the currently included groups could strengthen the interpretation that the observations in the BTB-defective mTORC2 KO mice are modelling an age-related testicular decline, provided that the DMRs seen in the chronologically old group significantly overlap with the BTB-defective changes.

      - In the current form, the described differences in sperm DNA methylation are based on comparisons between pubertal mice (8 weeks) and mature but not old adult males (22 weeks), while a chronologically "old" group is missing from the data sets and comparisons. Thus, it appears that the described sperm methylation changes reflect developmental changes associated with normal maturation and not necessarily declining sperm quality due to aging. (Sperm obtained from 8-week-old mice likely were generated, at least in part, during the 1st wave of spermatogenesis, which is known to differ from the continuously proceeding spermatogenesis during the remainder of the mature life. During the 1st wave of spermatogenesis, Sertoli cells are known to undergo gene expression changes which could contribute to varying degrees of BTB function, and thus have effects on the sperm DNA methylation profiles of such 1st wave sperm.)

      - It is unclear why the aging-related DMRs between the 8 and 22-week-old wild-type mice vary so dramatically between the two wild-type groups derived from the mTORC1 and the mTORC2 breeding (Fig. S4). If the main difference was due to mTORC1 or mTORC2 activity, both wildtype groups should behave very similarly. Changes seen in a truly "old" mouse (e.g. 20 weeks to 56 weeks), changes in "young mTORC1" and in "old mTORC2" are missing. How do those numbers and profiles compare to the shown samples?

      Comments on latest version:

      The rebuttal letter and public response indicate the authors' reluctance to consider the limitations of their study, i.e. having chosen chronologically young animals to demonstrate a sperm aging effect and indicate that they are not willing to include adequate controls.

      Since there is no evidence that mice at this young age have a deteriorating blood-testis-barrier (indeed, normal intact BTB is clearly visible in the figures included in this study from animals of the relevant age group), the whole central hypothesis that the study is built upon (i.e. that increasing age causes deteriorating BTB integrity which in turn causes age-related changes in sperm DNA methylation), appears irrelevant or invalid.

      The authors' claim that age-related DNA methylation changes in sperm occur in a linear fashion and that the changes are somewhat proportional to chronological age is in stark contrast to the claim that a decline of the BTB in old animals is causative for age-related sperm epigenetic changes, putting the relevance of the whole study in question.

    1. eLife assessment

      This important study addresses how 3' splice site choice is modulated by the conserved spliceosome-associated protein Fyv6. The authors provide compelling evidence that Fyv6 functions to enable selection of 3' splice sites distal to a branch point and in doing so antagonizes more proximal, suboptimal 3' splice sites. The study would be improved through a more nuanced discussion of alternative possibilities and models, for instance in discussing the phenotypic impact of Fyv6 deletion.

    2. Reviewer #1 (Public Review):

      Summary:

      A key challenge at the second chemical step of splicing is the identification of the 3' splice site of an intron. This requires recruitment of factors dedicated to the second chemical step of splicing and exclusion of factors dedicated to the first chemical step of splicing. Through the highest resolution cryoEM structure of the spliceosome to date, the authors show that the binding site for Fyv6, a factor dedicated to the second chemical step of splicing, is mutually exclusive with the binding site for a distinct factor dedicated to the first chemical step of splicing, highlighting that splicing factors bind to the spliceosome at a specific stage not only by recognizing features specific to that stage but also by competing with factors that bind at other stages. The authors further reveal that Fyv6 functions at the second chemical step to promote selection of 3' splice sites distal to a branch point and thereby discriminate against proximal, suboptimal 3' splice sites. Lastly, the authors show by cryoEM that Fyv6 physically interacts with the RNA helicase Prp22 and by genetics that Fyv6 functionally interacts with this factor, implicating Fyv6 in 3'SS proofreading and mRNA release from the spliceosome. The evidence for this study is robust, with the inclusion of genomics, reporter assays, genetics, and cryoEM. Further, the data overall justify the conclusions, which will be of broad interest.

      Strengths:

      (1) The resolution of the cryoEM structure of Fyv6-bound spliceosomes at the second chemical step of splicing is exceptional (2.3 Angstroms at the catalytic core; 3.0-3.7 Angstroms at the periphery), providing the best view of this spliceosomal intermediate in particular and the core of the spliceosome in general.

      (2) The authors observe by cryoEM three distinct states of this spliceosome, each distinguished from the next by progressive loss of protein factors and/or RNA residues. The authors appropriately refrain from overinterpreting these states as reflecting distinct states in the splicing cycle, as too many cryoEM studies are prone to do, and instead interpret these observations to suggest interdependencies of binding. For example, when Fyv6, Slu7, and Prp18 are not observed, neither are the first and second residues of the intron, which otherwise interact, suggesting an interdependence between 3' splice site docking on the 5' splice site and binding of these second step factors to the spliceosome.

      (3) Conclusions are supported from multiple angles.

      (4) The interaction between Fyv6 and Syf1, revealed by the cryoEM structure, was shown to account for the temperature-sensitive phenotypes of a fyv6 deletion, through a truncation analysis.

      (5) Splicing changes were observed in vivo both by indirect copper reporter assays and directly by RT-PCR.

      (6) Changes observed by RNA-seq are validated by RT-PCR.

      (7) The authors go beyond simply observing a general shift to proximal 3'SS usage in the fyv6 deletion by RNA-seq by experimentally varying branch point to 3' splice site distance in a reporter and demonstrating in a controlled system that Fyv6 promotes distal 3' splice sites.

      (8) The importance of the Fyv6-Syf1 interaction for 3'SS recognition is demonstrated by truncations of both Fyv6 and of Syf1.

      (9) In general, the study was executed thoroughly and presented clearly.

      Weaknesses:

      (1) Despite the authors' restraint in interpreting the three states of the spliceosome observed by cryoEM as sequential intermediates along the splicing pathway, it would be helpful to the general reader to explicitly acknowledge the alternative possibility that the different states simply reflect decomposition from one intermediate during isolation of the complex (i.e., the loss of protein is an in vitro artifact, if an informative one).<br /> (2) The authors acknowledge that for prp8 suppressors of the fyv6 deletion, suppression may be indirect, as originally proposed by the Query and Konarska labs - that is, that defects in the second step conformation of the spliceosome can be indirectly suppressed by compensating, destabilizing mutations in the first step spliceosome. Whereas some of the other suppressors of the fyv6 deletion can be interpreted as directly impacting the second step spliceosome (e.g., because the gene product is only present in the second step conformation), it seems that many more suppressors beyond prp8 mutants, especially those corresponding to bulky substitutions, which would more likely destabilize than stabilize, could similarly act indirectly by destabilization of the first step conformation. The authors should acknowledge this where appropriate (e.g., for factors like Prp8 that are present in both first and second step conformations).

    3. Reviewer #2 (Public Review):

      In this manuscript, Senn, Lipinski, and colleagues report on the structure and function of the conserved spliceosomal protein Fyv6. Pre-mRNA splicing is a critical gene expression step that occurs in two steps, branching and exon ligation. Fyv6 had been recently identified by the Hoskins lab as a factor that aids exon ligation (Lipinski et al., 2023), yet the mechanistic basis for Fyv6 function was less clear. Here, the authors combine yeast genetics, transcriptomics, biochemical assays, and structural biology to reveal the function of Fyv6. Specifically, they describe that Fyv6 promotes the usage of distal 3'SSs by stabilizing a network of interactions that include the RNA helicase PRP22 and the spliceosome subunit SYF1. They discuss a generalizable mechanism for splice site proofreading by spliceosomal RNA helicases that could be modulated by other, regulatory splicing factors.

      This is a very high quality study, which expertly combines various approaches to provide new insights into the regulation of 3'SS choice, docking, and undocking. The cryo-EM data is also of excellent quality and substantially extends previous yeast P complex structures. This is also supported by the authors' use of the latest data analysis tools (Relion-5, AlphaFold2 multimer predictions, Modelangelo). The authors re-evaluate published EM densities of yeast spliceosome complexes (B*, C, C*, P) for the presence or absence of Fyv6, substantiate Fyv6 as a 2nd step specific factor, confirm it as the homolog of the human protein FAM192A, and provide a model for how Fyv6 may fit into the splicing pathway. The biochemical experiments probing the splicing effects of BP to 3'SS distances after Fyv6 KO, the genetic experiments probing Fyv6 and Syf1 domains, and the suppressor screening add substantially to the study and are well executed. The manuscript is clearly written and we particularly appreciated the nuanced discussions, for example of an alternative model by which Prp22 influences 3'SS undocking. The research findings will be of great interest to the pre-mRNA splicing community.

      We have only a few comments to improve an already strong manuscript.

      Comments:

      (1) Can the authors comment on how they justify K+ ion positions in their models (e.g. the K+ ion bridging G-1 and G+1 nucleotides)? How do they discriminate K+ from water, e.g. in the 'G-1 and G+1' case?<br /> (2) The authors comment on Yju2 and Fyv6 assignments in all yeast structures except for the ILS. Can the authors comment on whether they have also looked into the assignment of Yju2 in the yeast ILS structure in the same manner? While it is possible that Fyv6 could dissociate and Yju2 reassociate at the P to ILS transition, this would merit a closer look given that in the yeast P complex Yju2 had been misassigned previously.<br /> (3) For accessibility to a general reader, figures 1c, d, e, 2a, b, would benefit from additional headings or labels, to immediately convey what is being displayed. It is also not clear to us if Fig 1e might fit better in the supplement and be instead replaced by Supplementary Figure 1a (wt), b (delta upf1), and a new c (delta fyv6) and new d (delta upf1, delta fyv6). This may allow the reader to better follow the rationale of the authors' use of the Fyv6/Upf1 double deletion.<br /> (4) The authors carefully interpret the various suppressor mutants, yet for a general reader the authors may wish to focus this section on only the most critical mutants for a better flow of the text.

    4. Reviewer #3 (Public Review):

      In this manuscript the authors expand their initial identification of Fyv6 as a protein involved in the second step of pre-mRNA splicing to investigate the transcriptome-wide impact of Fyv6 on splicing and gain a deeper understanding of the mechanism of Fyv6 action.

      They first use deep sequencing of transcripts in cells depleted of Fyv6 together with Upf1 (to limit loss of mis-spliced transcripts) to identify broad changes in the transcriptome due to loss of Fyv6. These include both changes in overall gene expression, which are not deeply discussed, and alterations in the choice of 3' splice sites, which are the focus of the rest of the manuscript.

      They next provide the highest resolution structure of the post-catalytic spliceosome to date; providing unparalleled insight into details of the active site and peripheral components that haven't been well characterized previously.

      Using this structure they identify functionally critical interactions of Fyv6 with Syf1 but not Prp22, Prp8 and Slu7. Finally, a suppressor screen additionally provides extensive new information regarding functional interactions between these second step factors.

      Overall this manuscript reports new and essential information regarding molecular interactions within the spliceosome that determine the use of the 3' splice site. It would be helpful, especially to the non-expert, to summarize these in a table, figure or schematic in the discussion.

    5. Author response:

      We thank the editors and reviewers for their enthusiasm for this work and helpful suggestions. In summary, the reviewers provided suggestions for additional discussion items and clarifications for the text and figures, especially in relation to the cryo-EM structures and suppressor screen sections of the manuscript. We will consider each of these and make edits as needed. In particular, reviewers asked for further details about the structural model in addition to analysis of our new structure with respect to previously reported intron lariat spliceosome (ILS) complexes. For the latter point, we present additional evidence for the correct assignment of Yju2 in the S. cerevisiae ILS structure and note that docking of the 3’ splice site is not observed in any ILS structure from yeast, worms, or humans. This is consistent with our proposed mechanism. We will clarify these points in the text as well as highlight some caveats of prior studies of the ILS complex. We feel that these changes will add nuance to the manuscript as well as clarify the findings and their context and significance for the reader.

    1. eLife assessment

      This valuable study provides integrated analyses of RNA sequencing and mapping data of the m6A RNA modification in the context of unbalanced genomes, using aneuploid Drosophila as a model, and suggests that the dosage compensation complex and m6A act in a feedback loop. The evidence is incomplete due to technical concerns, as quantitative assessments are being made using non-quantitative methods, and the study would be improved by further functional studies. If strengthened, the study will be of interest to RNA and developmental biologists.

    2. Reviewer #1 (Public Review):

      Summary:

      This study sought to reveal the potential roles of m6A RNA methylation in gene dosage regulatory mechanisms, particularly in the context of aneuploid genomes in Drosophila. Specifically, this work looked at the relationships between the expression of m6A regulatory factors, RNA methylation status, classical and inverse dosage effects, and dosage compensation. Using RNA sequencing and m6A mapping experiments, an in-depth analysis was performed to reveal changes in m6A status and expression changes across multiple aneuploid Drosophila models. The authors propose that m6A methylation regulates MOF and, in turn, deposition of H4K16Ac, critical regulators of gene dosage in the context of genomic imbalance.

      Strengths:

      This study seeks to address an interesting question with respect to gene dosage regulation and the possible roles of m6A in that process. Previous work has linked m6A to X-inactivation in humans through the Xist lncRNA, and to the regulation of the Sxl in flies. This study seeks to broaden that understanding beyond these specific contexts to more broadly understand how m6A impacts imbalanced genomes in other contexts.

      Weaknesses:

      The methods being used particularly for analysis of m6A at both the bulk and transcript-specific level are not sufficiently specific or quantitative to be able to confidently draw the conclusions the authors seek to make. MeRIP m6A mapping experiments can be very valuable, but differential methylation is difficult to assess when changes are small (as they often are, in this study but also m6A studies more broadly). For instance, based on the data presented and the methods described, it is not clear that the statement that "expression levels at m6A sites in aneuploidies are significantly higher than that in wildtype" is supported. MeRIP experiments are not quantitative, and since there are far fewer peaks in aneuploidies, it stands to reason that more antibody binding sites may be available to enrich those fewer peaks to a larger extent. But based on the data as presented (figure 2D) this conclusion was drawn from RPKM in IP samples, which may not fully account for changing transcript abundances in absolute (expression level changes) and relative (proportion of transcripts in input RNA sample) terms.

      The bulk-level m6A measurements as performed here also cannot effectively support these conclusions, as they are measured in total RNA. The focus of the work is mRNA m6A regulators, but m6A levels measured from total RNA samples will not reflect mRNA m6A levels as there are other abundant RNAs that contain m6A (including rRNA). As a result, conclusions about mRNA m6A levels from these measurements are not supported.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors have tested the effects of partial- or whole-chromosome aneuploidy on the m6A RNA modification in Drosophila. The data reveal that overall m6A levels trend up but that the number of sites found by meRIP-seq trends down, which seems to suggest that aneuploidy causes a subset of sites to become hyper-methylated. Subsequent bioinformatic analysis of other published datasets establishes correlations between the activity of the H4K16 acetyltransferase dosage compensation complex (DCC) and the expression of m6A components and m6A abundance, suggesting that DCC and m6A can act in a feedback loop on each other. Overall, this paper uses bioinformatic trends to generate a candidate model of feedback between DCC and m6A. It would be improved by functional studies that validate the effect in vivo.

      Strengths:

      • Thorough bioinformatic analysis of their data.

      • Incorporation of other published datasets that enhance scope and rigor.

      • Finds trends that suggest that a chromosome counting mechanism can control m6A, which fits with published data that the Sxl mRNA is m6A modified in XX females and not XY males.

      • Suggests this counting mechanism may be due to chromatin-dependent effects on the expression of m6A components.

      Weaknesses:

      • The linkage between H4K16 machinery and m6A is indirect and based on bioinformatic trends with little follow-up to test the mechanistic bases of these trends.

      • The paper lacks sufficient in vivo validation of the effects of DCC alleles on m6A and vice versa. For example, is the Ythdc1 genomic locus a direct target of the DCC component Msl-2? (see Figure 7).

      • Quite a bit of technical detail is omitted from the main text, making it difficult for the reader to interpret outcomes.

      (1) Please add the tissues to the labels in Figure 1D.

      (2) In the main text, please provide detail on the source tissues used for meRIP; was it whole larvae? adult heads? Most published datasets are from S2 cells or adult heads and comparing m6A across tissues and developmental stages could introduce quite a bit of variability, even in wt samples. This issue seems to be what the authors discuss in lines 197-199.

      (3) In the main text, please identify the technique used to measure "total m6A/A" in Fig 2A. I assume it is mass spec.

      (4) Line 190-191: the text describes annotating m6A sites by "nearest gene" which is confusing. The sites are mapped in RNAs, so the authors must unambiguously know the identity of the gene/transcript, right?

    4. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This study sought to reveal the potential roles of m6A RNA methylation in gene dosage regulatory mechanisms, particularly in the context of aneuploid genomes in Drosophila. Specifically, this work looked at the relationships between the expression of m6A regulatory factors, RNA methylation status, classical and inverse dosage effects, and dosage compensation. Using RNA sequencing and m6A mapping experiments, an in-depth analysis was performed to reveal changes in m6A status and expression changes across multiple aneuploid Drosophila models. The authors propose that m6A methylation regulates MOF and, in turn, deposition of H4K16Ac, critical regulators of gene dosage in the context of genomic imbalance.

      Strengths:

      This study seeks to address an interesting question with respect to gene dosage regulation and the possible roles of m6A in that process. Previous work has linked m6A to X-inactivation in humans through the Xist lncRNA, and to the regulation of the Sxl in flies. This study seeks to broaden that understanding beyond these specific contexts to more broadly understand how m6A impacts imbalanced genomes in other contexts.

      Weaknesses:

      The methods being used particularly for analysis of m6A at both the bulk and transcript-specific level are not sufficiently specific or quantitative to be able to confidently draw the conclusions the authors seek to make. MeRIP m6A mapping experiments can be very valuable, but differential methylation is difficult to assess when changes are small (as they often are, in this study but also m6A studies more broadly). For instance, based on the data presented and the methods described, it is not clear that the statement that "expression levels at m6A sites in aneuploidies are significantly higher than that in wildtype" is supported. MeRIP experiments are not quantitative, and since there are far fewer peaks in aneuploidies, it stands to reason that more antibody binding sites may be available to enrich those fewer peaks to a larger extent. But based on the data as presented (figure 2D) this conclusion was drawn from RPKM in IP samples, which may not fully account for changing transcript abundances in absolute (expression level changes) and relative (proportion of transcripts in input RNA sample) terms.

      Methylated RNA immunoprecipitation followed by sequencing (MeRIP-seq) is a commonly used strategy for genome-wide mapping of the m6A modification. This method uses an anti-m6A antibody to immunoprecipitate RNA fragments, which results in selective enrichment of methylated RNA. The RNA fragments are then subjected to deep sequencing, and the regions enriched in the immunoprecipitate relative to input samples are identified as m6A peaks using a peak-calling algorithm. We identified m6A peaks in different samples with the exomePeak2 program and determined common m6A peaks for each genotype based on the intersection of biological replicates. Figure 2D shows the RPM values of m6A peaks in MeRIP samples for each genotype, indicating that the levels of reads in the m6A peak regions were significantly higher in the aneuploid IP samples than in wildtypes. When the enrichment of IP samples relative to Input samples (RPM.IP/RPM.Input) was taken into account, the values for all three aneuploidies were still significantly higher than those of the wildtypes (Mann Whitney U test p-values < 0.001). This analysis does not address changes in transcript abundance; rather, from the MeRIP perspective, it shows that relatively more m6A-modified reads map to the m6A peaks in aneuploidies than in wildtypes. In addition, we have added the IP/Input results to the main text and revised the description in the manuscript to make it more precise and reduce possible misunderstandings.
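
The comparison described above (per-peak RPM, IP/Input enrichment, and a Mann-Whitney U test between genotypes) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the optional pseudocount, and the use of a normal approximation for the p-value are all assumptions made for the sketch.

```python
import math


def rpm(read_count, total_mapped_reads):
    """Reads per million mapped reads for one peak region."""
    return read_count * 1e6 / total_mapped_reads


def enrichment(ip_count, ip_total, input_count, input_total, pseudocount=1.0):
    """RPM.IP / RPM.Input for one peak.

    The pseudocount is an assumption of this sketch; it only guards
    against division by zero when a peak has no Input reads.
    """
    ip = rpm(ip_count, ip_total) + pseudocount
    inp = rpm(input_count, input_total) + pseudocount
    return ip / inp


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for x, approximate p-value). No continuity
    correction is applied; for small samples an exact test is preferable.
    """
    pooled = sorted((value, group) for group, sample in enumerate((x, y))
                    for value in sample)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the block of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        block = j - i
        tie_term += block ** 3 - block
        rank_sum_x += mid_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    return u_x, math.erfc(abs(z) / math.sqrt(2.0))
```

In use, one would compute `enrichment(...)` for every common peak in each genotype and pass the two resulting lists to `mann_whitney_u`. Note that a ratio of this kind still depends on how Input reads scale with transcript abundance, which is the concern raised by the reviewer.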

      The bulk-level m6A measurements as performed here also cannot effectively support these conclusions, as they are measured in total RNA. The focus of the work is mRNA m6A regulators, but m6A levels measured from total RNA samples will not reflect mRNA m6A levels as there are other abundant RNAs that contain m6A (including rRNA). As a result, conclusions about mRNA m6A levels from these measurements are not supported.

      According to some published articles, m6A levels of purified mRNA or total RNA can be detected by different methods (such as mass spectrometry, 2D thin-layer chromatography, etc.) in Drosophila cells or tissues [1-3].

      Here, we used the EpiQuik m6A RNA Methylation Quantification Kit (Colorimetric) (Epigentek, NY, USA, Cat # P-9005), which is suitable for detecting m6A methylation status directly using total RNA isolated from any species such as mammals, plants, fungi, bacteria, and viruses. This kit has previously been used by researchers to detect the m6A/A ratio in total RNA [4, 5] or purified mRNA [6] from different species.

      To compare m6A levels between total RNA and mRNA, we enriched mRNA from total RNA using the Dynabeads™ mRNA Purification Kit (Invitrogen Cat # 61006); the m6A levels of the enriched mRNA did not differ significantly from those of total RNA (Author response image 1). For this reason, most of the m6A levels in the manuscript were measured in total RNA.

      Author response image 1.

      The m6A levels of total RNA and mRNA

      As suggested, we will try to extract and purify mRNA from different genotypes to verify our conclusions based on the m6A levels of total RNA if necessary. In addition, m6A modification in RNA types other than mRNA (e.g., lncRNA, rRNA) is not necessarily meaningless. We will also add a discussion of this issue to the manuscript.

      (1) Lence T, et al. (2016) m6A modulates neuronal functions and sex determination in Drosophila. Nature 540(7632):242-247.

      (2) Haussmann IU, et al. (2016) m(6)A potentiates Sxl alternative pre-mRNA splicing for robust Drosophila sex determination. Nature 540(7632):301-304.

      (3) Kan L, et al. (2017) The m(6)A pathway facilitates sex determination in Drosophila. Nat Commun 8:15737.

      (4) Zhu C, et al. (2023) RNA Methylome Reveals the m(6)A-mediated Regulation of Flavor Metabolites in Tea Leaves under Solar-withering. Genomics Proteomics Bioinformatics 21(4):769-787.

      (5) Song H, et al. (2021) METTL3-mediated m(6)A RNA methylation promotes the anti-tumour immunity of natural killer cells. Nat Commun 12(1):5522.

      (6) Yin H, et al. (2021) RNA m6A methylation orchestrates cancer growth and metastasis via macrophage reprogramming. Nat Commun 12(1):1394.

      Reviewer #2 (Public Review):

      Summary:

      The authors have tested the effects of partial- or whole-chromosome aneuploidy on the m6A RNA modification in Drosophila. The data reveal that overall m6A levels trend up but that the number of sites found by meRIP-seq trends down, which seems to suggest that aneuploidy causes a subset of sites to become hyper-methylated. Subsequent bioinformatic analysis of other published datasets establishes correlations between the activity of the H4K16 acetyltransferase dosage compensation complex (DCC) and the expression of m6A components and m6A abundance, suggesting that DCC and m6A can act in a feedback loop on each other. Overall, this paper uses bioinformatic trends to generate a candidate model of feedback between DCC and m6A. It would be improved by functional studies that validate the effect in vivo.

      Strengths:

      • Thorough bioinformatic analysis of their data.

      • Incorporation of other published datasets that enhance scope and rigor.

      • Finds trends that suggest that a chromosome counting mechanism can control m6A, which fits with published data that the Sxl mRNA is m6A modified in XX females and not XY males.

      • Suggests this counting mechanism may be due to chromatin-dependent effects on the expression of m6A components.

      Weaknesses:

      • The linkage between H4K16 machinery and m6A is indirect and based on bioinformatic trends with little follow-up to test the mechanistic bases of these trends.

      We found a set of H4K16ac ChIP-seq data (GSE109901) from female and male Drosophila larvae in a public database and analyzed whether H4K16ac is directly associated with m6A regulator genes. ChIP-seq is a standard method for studying transcription factor binding and histone modifications using efficient and specific antibodies for immunoprecipitation. The results showed that there were H4K16ac peaks in the 5' region of the gene encoding the m6A reader Ythdc1 in both males and females. In addition, most of the genomic sites where the other m6A regulator genes are located are acetylated at H4K16 in both sexes, except that Ime4 shows sexual dimorphism and contains an H4K16ac peak only in females. These results indicate that the m6A regulator genes themselves are acetylated at H4K16, so there is a direct relationship between H4K16ac and m6A regulators. We have added this content to the text.

      Beyond the above conclusion from the sequencing data, we also plan experiments to test the linkage between H4K16ac and m6A, for example measuring m6A levels when MOF is overexpressed and H4K16Ac levels are increased, and measuring H4K16Ac levels and the relative expression levels of important regulatory genes when YT521B is knocked down or overexpressed.

      • The paper lacks sufficient in vivo validation of the effects of DCC alleles on m6A and vice versa. For example, is the Ythdc1 genomic locus a direct target of the DCC component Msl-2? (see Figure 7).

      To study whether the Ythdc1 genomic locus is a direct target of a DCC component, we first analyzed published MSL2 ChIP-seq data from Drosophila (GSE58768). Since MSL2 is only expressed in males under normal conditions, this dataset is from male Drosophila. According to the results, the majority (99.1%) of MSL2 peaks are located on the X chromosome, while there are few MSL2 peaks on other chromosomes. This is consistent with the fact that MSL2 is enriched on the X chromosome in male Drosophila [1, 2]. The Ythdc1 gene is located on chromosome 3L, and there is no MSL2 peak near it. Similarly, the other m6A regulator genes are not X-linked, and there is no MSL2 peak near them. We then analyzed the MOF ChIP-seq data (GSE58768) from male Drosophila. We found that 61.6% of MOF peaks were located on the X chromosome, which was also expected [3, 4]. Although there are more MOF peaks than MSL2 peaks on autosomes, MOF peaks are absent from the autosomal m6A regulator genes. Therefore, at present, there is no evidence that the gene loci of m6A regulators are direct targets of the DCC components MSL2 and MOF, which may be because most MSL2 and MOF are tethered to the X chromosome by the MSL complex under physiological conditions. Whether there are other direct or indirect interactions between Ythdc1 and MSL2 is an issue worthy of further study in the future.
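
The two checks described above, the fraction of ChIP-seq peaks on the X chromosome and the presence of a peak at or near a gene locus, can be sketched as below. This is an illustrative sketch only: peaks are assumed to be `(chromosome, start, end)` tuples with half-open coordinates, the `flank` window is an arbitrary assumption, and all coordinates in the usage example are hypothetical rather than real Drosophila annotations.

```python
from collections import Counter


def fraction_on_chromosome(peaks, chrom):
    """Fraction of peaks located on the given chromosome."""
    if not peaks:
        return 0.0
    counts = Counter(peak[0] for peak in peaks)
    return counts[chrom] / len(peaks)


def peaks_near_gene(peaks, gene, flank=1000):
    """Peaks overlapping the gene interval extended by `flank` bp on each side.

    `gene` is a (chromosome, start, end) tuple; intervals are half-open.
    """
    g_chrom, g_start, g_end = gene
    lo, hi = g_start - flank, g_end + flank
    return [p for p in peaks if p[0] == g_chrom and p[1] < hi and p[2] > lo]


# Hypothetical peaks and gene coordinates, for illustration only.
example_peaks = [("X", 100, 200), ("X", 500, 600), ("X", 900, 950), ("3L", 1000, 1100)]
example_gene = ("3L", 5000, 6000)
x_fraction = fraction_on_chromosome(example_peaks, "X")
hits = peaks_near_gene(example_peaks, example_gene)
```

An empty `hits` list corresponds to the situation reported above for Ythdc1, where no MSL2 or MOF peak lies near the locus. The choice of `flank` affects what counts as "near" and would need to be stated alongside any such result.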

      (1) Bashaw GJ & Baker BS (1995) The msl-2 dosage compensation gene of Drosophila encodes a putative DNA-binding protein whose expression is sex specifically regulated by Sex-lethal. Development 121(10):3245-3258.

      (2) Kelley RL, et al. (1995) Expression of msl-2 causes assembly of dosage compensation regulators on the X chromosomes and female lethality in Drosophila. Cell 81(6):867-877.

      (3) Kind J, et al. (2008) Genome-wide analysis reveals MOF as a key regulator of dosage compensation and gene expression in Drosophila. Cell 133(5):813-828.

      (4) Conrad T, et al. (2012) The MOF chromobarrel domain controls genome-wide H4K16 acetylation and spreading of the MSL complex. Dev Cell 22(3):610-624.

      Quite a bit of technical detail is omitted from the main text, making it difficult for the reader to interpret outcomes.

      (1) Please add the tissues to the labels in Figure 1D.

      Figure 1D shows the subcellular localization of FISH probe signals in Drosophila embryos. Arrowheads indicate the foci of probe signals. The corresponding tissue types are (1) blastoderm nuclei; (2) yolk plasm and pole cells; (3) brain and midgut; (4) salivary gland and midgut; (5) blastoderm nuclei and yolk cortex; (6) blastoderm nuclei and pole cells; (7) blastoderm nuclei and yolk cortex; (8) germ band. We have added these to the manuscript.

      (2) In the main text, please provide detail on the source tissues used for meRIP; was it whole larvae? adult heads? Most published datasets are from S2 cells or adult heads and comparing m6A across tissues and developmental stages could introduce quite a bit of variability, even in wt samples. This issue seems to be what the authors discuss in lines 197-199.

      In this article, the material used to perform MeRIP-seq was the whole third instar larvae. Because trisomy 2L and metafemale Drosophila died before developing into adults, it was not possible to use the heads of adults for MeRIP-seq detection of aneuploidy. For other experiments described here, the m6A abundance was measured using whole larvae or adult heads; material used for RT-qPCR analysis was whole larvae, larval brains, or adult heads; Drosophila embryos at different developmental stages were used for fluorescence in situ hybridization (FISH) experiments. We provide a detailed description of the experimental material for each assay in the manuscript.

      (3) In the main text, please identify the technique used to measure "total m6A/A" in Fig 2A. I assume it is mass spec.

      We used the EpiQuik m6A RNA Methylation Quantification Kit (Colorimetric) (Epigentek, NY, USA, Cat # P-9005) to measure the m6A/A ratio in RNA samples. This kit is commercially available for quantification of m6A RNA methylation; it uses a colorimetric assay with easy-to-follow steps for convenience and speed, and is suitable for detecting m6A methylation status directly using total RNA isolated from any species such as mammals, plants, fungi, bacteria, and viruses.

      (4) Line 190-191: the text describes annotating m6A sites by "nearest gene" which is confusing. The sites are mapped in RNAs, so the authors must unambiguously know the identity of the gene/transcript, right?

      When m6A peaks are annotated using the R package ChIPseeker, the output includes two items: "genomic annotation" and "nearest gene annotation". "Genomic annotation" tells us which genomic feature the peak is annotated to, such as 5’UTR, 3’UTR, exon, etc. "Nearest gene annotation" indicates which specific gene/transcript the peak is matched to. We modified the description in the main text to make it easier to understand.
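
To make the idea of a "nearest gene annotation" concrete, the sketch below assigns a peak to the gene whose transcription start site (TSS) is closest to the peak midpoint. This is a simplified Python illustration of the concept and not ChIPseeker's implementation; the function name, the use of the peak midpoint, and the gene names and coordinates in the example are all hypothetical.

```python
def nearest_gene(peak, genes):
    """Return (gene_name, distance) for the gene whose TSS is closest to the peak.

    peak:  (chromosome, start, end)
    genes: iterable of (name, chromosome, tss_position)
    Returns None if no gene lies on the same chromosome as the peak.
    """
    chrom, start, end = peak
    midpoint = (start + end) // 2
    candidates = [(abs(midpoint - tss), name)
                  for name, g_chrom, tss in genes if g_chrom == chrom]
    if not candidates:
        return None
    distance, name = min(candidates)
    return name, distance


# Hypothetical genes and peak, for illustration only.
example_genes = [("geneA", "2L", 1000), ("geneB", "2L", 5000), ("geneC", "X", 1200)]
assigned = nearest_gene(("2L", 4400, 4600), example_genes)
```

Because peaks from MeRIP-seq are mapped within transcripts, a distance-based assignment of this kind mainly matters where annotated transcripts overlap or lie close together.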

    1. eLife assessment

      This important work substantially advances our understanding of the molecular mechanisms underlying the timing of the initiation of metamorphosis of the Ciona ascidian tadpole larva. Through the combination of gene knockdown experiments and fluorescent molecular reporters the authors provide compelling evidence about a crosstalk between different G protein mediated signalling pathways and are able to place different signalling molecules within a signalling network. The work will be of interest to molecular, developmental and marine biologists and to scientists working on animal metamorphosis.

    2. Reviewer #1 (Public Review):

      Summary:<br /> In this manuscript, the authors use gene functional analysis, pharmacology and live imaging to develop a proposed model of diverse G protein family signalling that takes place in the papillae during the ascidian Ciona larval adhesion to regulate the timing of initiation of the morphological changes of metamorphosis. Their experiments provide solid evidence that antagonistic G protein signalling regulates cAMP levels in the papillae, which provides a threshold for triggering metamorphosis that is reflective of a larva keeping a strong and sustained level of contact with a substrate for a minimum period of approximately half an hour. The authors discuss their reasoning and address different specific aspects of their proposed timing mechanism to provide a logical flow to the manuscript. The results are nicely linked to the ecology of Ciona larval settlement and will be of interest to developmental biologists, neurobiologists, molecular biologists, marine biologists as well as provide information relevant to antifouling and aquaculture sectors.

      First, they knock down the G proteins Gaq and Gas to show that these genes are important for Ciona larval metamorphosis. They then provide evidence that the Gaq protein acts through a Ca2+ pathway mediated by phospholipase C and inositol triphosphate by showing that inositol phosphate and phospholipase C gene knockdown also inhibits metamorphosis, while overexpression of Gaq or phospholipase C allows larvae to undergo metamorphosis even in the absence of their mechanosensory cue, which is removed by cutting off the posterior half of the tail and culturing the larvae on agar-coated dishes. The authors used calcium imaging with a genetically encoded fluorescent calcium sensor to show that Gq knockdown larvae lack a Ca2+ spike in their papillae after mechanostimulation, confirming that Gaq acts through a Ca2+ pathway. Similarly, the authors show that overexpression of Gas also enables larvae to metamorphose in the absence of mechanostimulation, suggesting a role for both Gaq and Gas in this process.

      To confirm that Gas acts through cAMP signalling, the authors use pharmacological treatment or overexpression of a photoactivating adenylate cyclase to increase cAMP, and show that this also enables larvae to metamorphose in the absence of mechanostimulation, but only when their adhesive papillae are still present. Transcriptome data indicate that both Gs and Gq pathway genes are expressed in the adhesive papillae of the Ciona larva. One missing detail seems to be the need for evidence that cAMP is elevated in the papillae directly as a result of Gs activation. The authors use a fluorescent cAMP indicator, Pink Flamindo, to show that cAMP increases in the papillae upon adhesion to a substrate. Complementary to this, larvae that fail to undergo metamorphosis lack a cAMP increase in papillae. However, it is unclear whether the measured larvae that failed to undergo metamorphosis were wildtype or Gas knockdown larvae. If they were Gas knockdown larvae, this could provide evidence that cAMP does act downstream of the Gas activation.

      The authors then provide evidence that GABA signalling within the papillae is acting downstream of the G proteins to induce metamorphosis. Transcriptome data shows that the genes for the GABA-producing enzyme, and for GABAb receptors, are both expressed in papillae. Pharmacological experiments show that GABA induces metamorphosis in the absence of mechanosensory cues, but only in larvae that retain their papillae. To show that GABA signalling within the papillae, rather than from the brain of the larva is important, the authors also demonstrate that anterior segments of larvae lacking the brain can also be stimulated to metamorphose by GABA, and show changes in gene expression caused by GABA.

      The authors then use a combination of pharmacology and knockdown experiments in the presence or absence of mechanosensory cues to show that Gq/Ca2+ signalling acts upstream of Gs/cAMP signalling. As the elevation of cAMP by pharmacology or photoactivating adenylate cyclase rescued GABA pathway mutant larvae, the Gq and Gs pathways were concluded to be downstream of GABA signaling. However, GABA treatment could still induce Gaq- and Gas-knockdown larvae to metamorphose, suggesting an alternative pathway to metamorphosis, which the authors deduce to be through a third G protein, Gai. They identify an unusual Gai protein that based on transcriptome data is strongly expressed in the papillae. Gai knockdown larvae fail to metamorphose but are rescued by GABA treatment, which can be explained by a potential additional Gai protein being still present (transcriptome evidence suggests this although it is not further confirmed experimentally, for example by hybridization, immunohistochemistry, fluorescent labelling, or knockdown). The authors then use overexpression and knockdown experiments to show that the Gai protein acts through Gβγi complex to activate phospholipase C. Their experiments also indicate a potential for a complementary or compensatory role for Gai and Gaq signalling through Gβγi. By inhibiting the potassium channel GIRK through knockdown, and the MAPK pathway gene MEK1/2 by pharmacology, the authors also establish a role for these in their proposed model of signalling, allowing GABA and cAMP to compensate or interact with each other.

      Strengths:<br /> The strength of this paper is the meticulous and extensive experiments, which are carefully designed to precisely target specific genes in the putative signalling pathway and to build, step by step, a complex model that can demonstrate how metamorphosis of the ascidian larva is timed so that it occurs only when the larva is strongly attached to a suitable substrate. The unique possibility of inhibiting mechanosensory-induced metamorphosis by removing some of the tail and smoothing the attachment substrate allows the authors to investigate potential effects on both activation and inhibition of metamorphosis, and to confirm that specific signalling pathways are clearly downstream of the initial mechanosensory stimulation. The study is also clear about which aspects of the model still remain unknown, such as which ligands and receptors may be responsible for the binding and activation of Gaq and Gas. Experiments testing metamorphosis of just the anterior region of the larvae nicely demonstrate the need for signalling in the region of the papillae, as do experiments where the papillae are removed, which then blocks metamorphosis in treatments that would otherwise stimulate it. The final model is a nice end point and makes a clear summary of how the extensive experiments all fit together into a cohesive potential signalling network, which can be built upon in the future.

      Weaknesses:<br /> The paper has few weaknesses, however the main difficulty it poses is that due to the sheer number of precise experiments carried out and the complexity of the interwoven signalling pathways, it quickly becomes very difficult to follow exactly what is going on when and why or to keep track of the story as it develops. To improve this, an initial section in the results could be included showing a summary of the known G proteins in Ciona, their types and potential downstream signalling or upstream receptors, where known, and their expression levels in papillae. This could be in the form of a table and/or include the phylogenetic tree from the supplementary data. This would help clarify why the study first focuses on Gaq and Gas, and only later looks at Gai. This could be supplemented by a schematic workflow giving an overview of the experimental process of the study. A second minor weakness (understandable as the focus of the study is metamorphosis induced by mechanosensory stimulation) is that the study does not take into account any potential role for other types of sensory modalities (light, chemicals) that may also feed into the regulation of Ciona larval metamorphosis. This aspect would be interesting to discuss in light of the recent paper suggesting that some sensory cells in the Ciona adhesive papillae are polymodal and detect both chemicals and mechanical stimuli (Hoyer et al. 2024 Current Biology 34(6): 1168 - 1182).

    3. Reviewer #2 (Public Review):

      Summary:<br /> This work aims to characterize the neural signaling cascade underlying the initiation of metamorphosis in Ciona larvae. Combining gene-specific functional analyses, pharmacological experiments, and live imaging approaches, the authors identify the molecular players downstream of GABA to initiate Ciona metamorphosis. The results of this study may serve as a useful framework for future research on animal metamorphosis.

      Strengths:<br /> The authors did a great job in connecting their experiments with previous findings on Ciona metamorphosis. Taking advantage of the Ciona model system, they meticulously conducted genetic manipulation and pharmacological experiments to test the epistatic relationships among the signaling players controlling the initiation of Ciona metamorphosis.

      Weaknesses:<br /> The causal relationship between cAMP accumulation and the initiation of metamorphosis was not clearly demonstrated by the live-imaging observation with the fluorescent cAMP indicator (Pink Flamindo). It is a pity that this experiment was only conducted using normal larvae to compare those that underwent metamorphosis versus those that failed to initiate metamorphosis. This approach should be applied to some of the genetic manipulation and pharmacological experiments, to strengthen their main thesis on the "cAMP timer" mechanism.<br /> On several occasions, the interpretation of the results seems to be imprecise and may lead to misunderstanding. This should be improved by rewriting the descriptions of those results and carefully comparing the differences in results from various treatments and experiments.

    4. Author response:

      We would like to thank all reviewers for their valuable comments that help us to improve our manuscript. We will make the following modifications in the revised manuscript:

      (1) To reduce the complexity of the experiments we carried out, we will summarize trimeric G proteins in Ciona in the first paragraph of the Results section and explain how we focused on Gas and Gaq in the initial phase of this study.

      (2) As reviewer 1 suggested, the polymodal roles of papilla neurons are interesting. We will add a discussion regarding this aspect. The sentences will be like the following:

      “The recent study (Hoyer et al., 2024) provided several lines of evidence suggesting that papilla neurons can serve as the sensors of several chemicals in addition to the mechanical stimuli. This finding and our model seem mutually related because these chemicals could modify Ca2+ and cAMP signaling. The use of G protein signaling may allow Ciona to reflect various environmental stimuli to initiate metamorphosis in the appropriate situation, both mechanically and chemically.”

      (3) As both reviewers suggested, imaging cAMP on the backgrounds of some G protein knockdowns and pharmacological treatments is important, and we will carry out some of these experiments.

      (4) According to reviewer 2's comment, we will carefully modify the text about interpreting the results so that the descriptions suitably reflect the results.

    1. Author response:

      Response to reviewers (Public review):

      We thank all the three reviewers for their opinion on our work on Candida albicans β-1,6-glucan, which highlights the importance of this cell wall component in the biology of fungi. Here are our responses to their comments for public reviews:

      (1) Indeed, the data presented for immunological studies is preliminary. It has been acknowledged by the reviewers that our analysis providing insights into the biosynthetic pathways involved is comprehensive in dealing with the organization and dynamics of the β-1,6-glucan polymer in relation to other cell wall components and environmental conditions (temperature, stress, nutrient availability, etc.). However, we anticipated that there would be immediate curiosity as to what the immunological contribution of β-1,6-glucan is, and we therefore felt we needed to initiate these studies and include them. We therefore performed immunological studies to assess whether β-1,6-glucan acts as a pathogen-associated molecular pattern (PAMP), and if so, what its immunostimulatory potential is. Our data clearly suggest that β-1,6-glucan is a PAMP, and consequently lead to several questions: (a) what are the host immune receptors involved in the recognition of this polysaccharide, and thereby the downstream signaling pathways, (b) how is β-1,6-glucan differentially recognized by the host when C. albicans switches from a commensal to an opportunistic pathogen, and (c) how does the host environment impact the exposure of this polysaccharide on the fungal surface. We believe addressing these questions is beyond the scope of the present manuscript and aim to present new data in a future manuscript. Nonetheless, in the revised manuscript, we suggest approaches that we can take to identify the receptor that could be involved in the recognition of β-1,6-glucan. Moreover, we have modified the discussion, basing it on the data rather than being descriptive.

      (2) It will be interesting to assess the organization of β-1,6-glucan and other cell wall components in the opaque cells. It is documented that the opaque cells are induced at acidic pH and in the presence of N-acetylglucosamine and CO2. Our data shows that pH has an impact on β-1,6-glucan, which suggests that there will be differential organization of this polysaccharide in the cell wall of opaque cells. As suggested by the reviewer, we will include analysis of opaque cells (and other C. albicans cell types) in future studies.

      With the exception of these major new avenues for this research, our revision can address each of the comments provided by the reviewers.

    2. eLife assessment

      The paper will be of broad interest to fungal biologists and fungal immunologists seeking to understand the biosynthesis of the fungal cell wall, in particular of ß-1,6-glucan synthesis and the importance of this so far understudied constituent of the cell wall for cell wall integrity and immune response. The study is of fundamental significance and adds structural clarity to the presence, genetic, and biochemical basis of this difficult-to-analyze carbohydrate. It opens the potential for understanding its role in immune recognition and potentially as a drug target. Overall, the data is compelling, properly controlled and analyzed, but a few aspects need to be reconsidered.

    3. Reviewer #1 (Public Review):

      Summary:

      The fungal cell wall is a very important structure for the physiology of a fungus but also for the interaction of pathogenic fungi with the host. Although a lot of knowledge on the fungal cell wall has been gained, there is a lack of understanding of the meaning of ß-1,6-glucan in the cell wall. In the current manuscript, the authors studied in particular this carbohydrate in the important human-pathogenic fungus Candida albicans. The authors provide a comprehensive characterization of cell wall constituents under different environmental and physiological conditions, in particular of ß-1,6-glucan. Also, β-1,6-glucan biosynthesis was found to be likely a compensatory reaction when mannan elongation was defective. The absence of β-1,6-glucan resulted in a significantly sick growth phenotype and complete cell wall reorganization. The manuscript contains a detailed analysis of the genetic and biochemical basis of ß-1,6-glucan biosynthesis which is apparently in many aspects similar to yeast. Finally, the authors provide some initial studies on the immune modulatory effects of ß-1,6-glucan.

      Strengths:

      The findings are very well documented, and the data are clear and obtained by sophisticated biochemical methods. It is impressive that the authors successfully optimized methods for the analysis and quantification of ß-1,6-glucan under different environmental conditions and in different mutant strains.

      Weaknesses:

      However, although already very interesting, at this stage there are some loose ends that need to be combined to strengthen the manuscript. For example, the immunological studies are rather preliminary and need at least some substantiation. Also, at this stage, the manuscript in some places remains a bit too descriptive and needs the elucidation of potential causalities.

    4. Reviewer #2 (Public Review):

      Summary:

      The authors provide the first (to my knowledge) detailed characterization of cell wall b-1,6 glucan in the pathogen Candida albicans. The approaches range from biochemistry to genetics to immunology. The study provides fundamental information and will be a resource of exceptional value to the field going forward. Highlights include the construction of a mutant that lacks all b-1,6 glucan and the characterization of its cell wall composition and structure. Figure 5a is a feast for the eyes, showing that b-1,6 glucan is vital for the outer fibrillar layer of the cell wall. Also much appreciated was the summary figure, Figure 7, which presents the main findings in digestible form.

      Strengths:

      The work is highly significant for the fungal pathogen field especially, and more broadly for anyone studying fungi, antifungal drugs, or antifungal immune responses.

      The manuscript is very readable, which is important because most readers will be cell wall nonspecialists.

      The authors construct a key quadruple mutant, which is not trivial even with CRISPR methods, and validate it with a complemented strain. This aspect of the study sets the bar high.

      The authors develop new and transferable methods for b-1,6 glucan analysis.

      Weaknesses:

      The one "famous" cell type that would have been interesting to include is the opaque cell. This could be included in a future paper.

    5. Reviewer #3 (Public Review):

      Summary:

      The cell wall of human fungal pathogens, such as Candida albicans, is crucial for structural support and modulating the host immune response. Although extensively studied in yeasts and molds, the structural composition has largely focused on the structural glucan b-1,3-glucan and the surface exposed mannans, while the fibrillar component β-1,6-glucan, a significant component of the cell wall, has been largely overlooked. This comprehensive biochemical and immunological study by a highly experienced cell wall group provides a strong case for the importance of β-1,6-glucan contributing critically to cell wall integrity, filamentous growth, and cell wall stability resulting from defects in mannan elongation. Additionally, β-1,6-glucan responds to environmental stimuli and stresses, playing a key role in wall remodeling and immune response modulation, making it a potential critical factor for host-pathogen interactions.

      Strengths:

      Overall, this study is well-designed and executed. It provides the first comprehensive assessment of β-1,6-glucan as a dynamic, albeit underappreciated, molecule. The role of β-1,6-glucan genetics and biochemistry has been explored in molds like Aspergillus fumigatus, but this work shines an important light on its role in Candida albicans. This is important work that is of value to Medical Mycology, since β-1,6-glucan plays more than just a structural role in the wall. It may serve as a PAMP and a potential modulator of host-pathogen interactions. In keeping with this important role, the rigor of the manuscript would benefit from a more physiological evaluation, ex vivo and preferably in vivo, of how β-1,6-glucan stimulates the immune system within the cell wall and not just as a purified component. This is a critical outcome measure for this study and gets squarely at its importance for host-pathogen interactions, especially in response to environmental stimuli and drug exposure.

    1. Reviewer #1 (Public Review):

      Summary:<br /> In this study, Masroor Ahmad Paddar and his/her colleagues explore the noncanonical roles of ATG5 and membrane atg8ylation in regulating retromer assembly and function. They begin by examining the interactomes of ATG5 and expand the scope of these effects to include homeostatic responses to membrane stress and damage.

      Strengths:<br /> This study provides novel insights into the noncanonical function of ATG8ylation in endosomal cargo sorting process.

      Weaknesses:<br /> The direct mechanism by which ATG8ylation regulates the retromer remains unsolved.

    2. Reviewer #2 (Public Review):

      Summary: Padder et al. demonstrate that ATG5 mediates lysosomal repair via the recruitment of the retromer components during LLOMe-induced lysosomal damage and that mAtg8-ylation contributes to retromer-dependent cargo sorting of GLUT1. Although previous studies have suggested that during glucose withdrawal, classical autophagy contributes to retromer-dependent GLUT1 surface trafficking via interactions between LC3A and TBC1D5, the experiments here demonstrate that during basal conditions or lysosomal damage, ATGs that are not involved in mATG8ylation, such as FIP200, are not functionally required for retromer-dependent sorting of GLUT1. Overall, these studies suggest a unique role for ATG5 in the control of retromer function, and that conjugation of ATG8 to single membranes (CASM) is a partial contributor to these phenotypes.

      Strengths:

      (1) Overall, these studies suggest a unique non-autophagic role for ATG5 in the control of retromer function. They also demonstrate that conjugation of ATG8 to single membranes (CASM) is a partial contributor to these phenotypes. Overall, these data point to a new role for ATG5 and CASM-dependent mATG8ylation in lysosomal membrane repair and trafficking.

      (2) Although the studies are overall supportive of the proposed model that the retromer is controlled by CASM-dependent mATG8-ylation, it is noteworthy that previous studies of GLUT1 trafficking during glucose withdrawal (Roy et al. Mol Cell, PMID: 28602638) were predominantly conducted in cells lacking ATG5 or ATG7, which would not be able to discriminate between a CASM-dependent vs. canonical autophagy-dependent pathway in the control of GLUT1 sorting. Is the lack of GLUT1 mis-sorting to lysosomes observed in FIP200 and ATG13KO cells also observed during glucose withdrawal? Notably, deficiencies in glycolysis and glucose-dependent growth have been reported in FIP200 deficient fibroblasts (Wei et al. G&D, PMID: 21764854) so there may be differences in regulation dependent on the stress imposed on a cell.

      Weaknesses:

      (1) Additional controls are needed to clarify the role of CASM in the control of retromer function. Because the manuscript proposes both CASM-dependent and independent pathways in the ATG5 mediated regulation of the retromer, it is important to provide robust evidence that CASM is required for retromer-dependent GLUT1 sorting to the plasma membrane vs. lysosome. The experiments with monensin in Fig. 7C-E are consistent with but not unequivocally corroborative of a role for CASM. Based on the results shown with ATG16KO in Fig 4A-D, rescue experiments of these 16KO cells with WT vs. C-terminal WD40 mutant versions of ATG16 will specifically assess the requirement for CASM and potentially provide more rigorous support for the conclusions drawn.

      (2) Also, the role of TBC1D5 should be further clarified. In Fig S7, are there any changes in the interactions between TBC1D5 and VPS35 in response to LLOMe or other agents utilized to induce CASM? Does TBC1D5 loss-of-function modulate the numbers of GLUT1 and Gal3 puncta observed in ATG5 deficient cells in response to LLOMe?

      (3) Finally, the studies here are motivated by experiments in Fig. S1 (as well as other studies from the Deretic and Stallings labs) suggesting unique autophagy-independent functions for ATG5 in myeloid cells and neutrophils in susceptibility to Mycobacterium tuberculosis infection. However, it is curious that no attempt is made to relate the mechanistic data regarding the retromer or GLUT1 receptor mis-sorting back to the infectious models. Do myeloid cells or neutrophils lacking ATG5 have deficiencies in glucose uptake or GLUT1 cell surface levels?

    3. Reviewer #3 (Public Review):

      In this manuscript, Padder et al. used APEX2 proximity labeling to find an interaction between ATG5 and the core components of the Retromer complex, VPS26, VPS29, and VPS35. Further studies revealed that ATG5 KO inhibited the trafficking of GLUT1 to the plasma membrane. They also found that other autophagy genes involved in membrane atg8ylation affected GLUT1 sorting. However, knocking out other essential autophagy genes such as ATG13 and FIP200 did not affect GLUT1 sorting. These findings suggest that ATG5 participates in the function of the Retromer in a noncanonical autophagy manner. Overall, the methods and techniques employed by the authors largely support their conclusions. These findings are intriguing and significant, enriching our understanding of the non-autophagic functions of autophagy proteins and the sorting of GLUT1. Nevertheless, there are several issues that the authors need to address to further clarify their conclusions.

      (1) The authors confirmed the interaction between Atg5 and the Retromer complex through Co-IP experiments. Is the interaction between Atg5 and the Retromer direct? If it is direct, which Retromer complex protein regulates the interaction with Atg5? Additionally, does ATG5 K130R mutant enhance its interaction with the Retromer?

      (2) To more directly elucidate how ATG5 regulates Retromer function by interacting with the Retromer and participates in the trafficking of GLUT1 to the plasma membrane, the authors should identify which region or crucial amino acid residues of ATG5 regulate its interaction with the Retromer. Additionally, they should test whether mutations in ATG5 that disrupt its interaction with the Retromer affect Retromer function (such as participating in the trafficking of GLUT1 to the plasma membrane) and whether they affect Atg8ylation. They also need to assess whether these mutations influence canonical autophagy and lysosomal sensitivity to damage.

    4. Author response:

      Reviewer #1 (Public Review): 

      Summary: 

      In this study, Masroor Ahmad Paddar and his/her colleagues explore the noncanonical roles of ATG5 and membrane ATG8ylation in regulating retromer assembly and function. They begin by examining the interactomes of ATG5 and expand the scope of these effects to include homeostatic responses to membrane stress and damage. 

      Strengths: 

      This study provides novel insights into the noncanonical function of ATG8ylation in endosomal cargo sorting process. 

      Weaknesses: 

      The direct mechanism by which ATG8ylation regulates the retromer remains unsolved. 

      We agree with the reviewer. We do, however, show how at least one aspect of ATG8ylation contributes to proper retromer function, which occurs via lysosomal membrane maintenance and repair. Understanding the more direct effects on the retromer will require a separate study. We will emphasize this in the revised manuscript and point out the limitations of the present work.

      Reviewer #2 (Public Review): 

      Summary:

      Padder et al. demonstrate that ATG5 mediates lysosomal repair via the recruitment of the retromer components during LLOMe-induced lysosomal damage and that mAtg8-ylation contributes to retromer-dependent cargo sorting of GLUT1. Although previous studies have suggested that during glucose withdrawal, classical autophagy contributes to retromer-dependent GLUT1 surface trafficking via interactions between LC3A and TBC1D5, the experiments here demonstrate that during basal conditions or lysosomal damage, ATGs that are not involved in mATG8ylation, such as FIP200, are not functionally required for retromer-dependent sorting of GLUT1. Overall, these studies suggest a unique role for ATG5 in the control of retromer function, and that conjugation of ATG8 to single membranes (CASM) is a partial contributor to these phenotypes. 

      Strengths: 

      (1) Overall, these studies suggest a unique non-autophagic role for ATG5 in the control of retromer function. They also demonstrate that conjugation of ATG8 to single membranes (CASM) is a partial contributor to these phenotypes. Overall, these data point to a new role for ATG5 and CASM-dependent mATG8ylation in lysosomal membrane repair and trafficking. 

      (2) Although the studies are overall supportive of the proposed model that the retromer is controlled by CASM-dependent mATG8-ylation, it is noteworthy that previous studies of GLUT1 trafficking during glucose withdrawal (Roy et al. Mol Cell, PMID: 28602638) were predominantly conducted in cells lacking ATG5 or ATG7, which would not be able to discriminate between a CASM-dependent vs. canonical autophagy-dependent pathway in the control of GLUT1 sorting. Is the lack of GLUT1 mis-sorting to lysosomes observed in FIP200 and ATG13KO cells also observed during glucose withdrawal? Notably, deficiencies in glycolysis and glucose-dependent growth have been reported in FIP200 deficient fibroblasts (Wei et al. G&D, PMID: 21764854) so there may be differences in regulation dependent on the stress imposed on a cell.

      We thank the reviewer for the overall assessment of the strengths of the study.

      We have discussed in the manuscript the elegant study by Roy et al., PMID 28602683. To accommodate the reviewer’s comment, we will additionally emphasize in the text that our study is focused on basal conditions and conditions that perturb endolysosomal compartments. We agree with the reviewer that under metabolic stress conditions (such as glucose limitation) more complex pathways may be engaged and will acknowledge that in the discussion.

      Weaknesses: 

      (1) Additional controls are needed to clarify the role of CASM in the control of retromer function. Because the manuscript proposes both CASM-dependent and independent pathways in the ATG5 mediated regulation of the retromer, it is important to provide robust evidence that CASM is required for retromer-dependent GLUT1 sorting to the plasma membrane vs. lysosome. The experiments with monensin in Fig. 7C-E are consistent with but not unequivocally corroborative of a role for CASM.

      We fully agree with the reviewer. In fact, our data with bafilomycin A1 treatment causing GLUT1 mis-sorting (manuscript line 317) show that it is the perturbation of lysosomes and not CASM per se that leads to mis-sorting of GLUT1 (Fig. 7D,E). Note that it has been shown (PMIDs: 28296541, 25484071 and 37796195) that although bafilomycin A1 deacidifies lysosomes it does not induce but instead inhibits CASM. This is because bafilomycin A1 causes dissociation of the V1 and V0 sectors of V-ATPase, unlike other CASM-inducing agents, which promote V1 V0 association. Complementing this, our data with ATG2AB DKO and ESCRT VPS37A KO (Fig. 8A-F) indicate that the repair of lysosomes is important to keep the retromer machinery functional (as illustrated in Fig. 8G). This may be one of the effector mechanisms downstream of membrane atg8ylation in general and hence also downstream of CASM. We will revise Fig. 7 title to read “Lysosomal damage causes GLUT1 mis-sorting” and will explain these relationships in the text.

      Based on the results shown with ATG16KO in Fig 4A-D, rescue experiments of these 16KO cells with WT vs. C-terminal WD40 mutant versions of ATG16 will specifically assess the requirement for CASM and potentially provide more rigorous support for the conclusions drawn. 

      We will carry out the experiment proposed by the reviewer for the planned revision.

      (2) Also, the role of TBC1D5 should be further clarified. In Fig S7, are there any changes in the interactions between TBC1D5 and VPS35 in response to LLOMe or other agents utilized to induce CASM?

      We thank the reviewer for pointing this out. We do have data with VPS35 in co-IPs shown in Fig. S7. There is no change in the amounts of VPS35 or TBC1D5 in GFP-LC3A co-IPs. We will include a graph with quantification in the revised manuscript and emphasize this point.

      Does TBC1D5 loss-of-function modulate the numbers of GLUT1 and Gal3 puncta observed in ATG5 deficient cells in response to LLOMe? 

      We agree that TBC1D5 is an interesting aspect. However, because TBC1D5 does not change its interactions in the experiments in our study, we consider this topic (i.e. whether TBC1D5 phenocopies VPS35 and ATG5 KOs in its effects on Gal3) to be beyond the scope of the present work. We underscore that LLOMe (lysosomal damage) mis-sorts GLUT1 even without any genetic intervention (e.g., in WT cells in the absence of ATG5 KO; Fig. 7). Thus, in our opinion the effects of TBC1D5 inactivation may be a moot point.

      (3) Finally, the studies here are motivated by experiments in Fig. S1 (as well as other studies from the Deretic and Stallings labs) suggesting unique autophagy-independent functions for ATG5 in myeloid cells and neutrophils in susceptibility to Mycobacterium tuberculosis infection. However, it is curious that no attempt is made to relate the mechanistic data regarding the retromer or GLUT1 receptor mis-sorting back to the infectious models. Do myeloid cells or neutrophils lacking ATG5 have deficiencies in glucose uptake or GLUT1 cell surface levels? 

      The reviewer’s point is well taken. Glucose uptake, its metabolism, and diabetes underlie the resurgence of TB in certain populations and are important factors in a range of other diseases. This was alluded to in our discussion (lines 461-469). However, these are complex topics for future studies. We will expand this section of the discussion.

      Reviewer #3 (Public Review): 

      In this manuscript, Padder et al. used APEX2 proximity labeling to find an interaction between ATG5 and the core components of the Retromer complex, VPS26, VPS29, and VPS35. Further studies revealed that ATG5 KO inhibited the trafficking of GLUT1 to the plasma membrane. They also found that other autophagy genes involved in membrane atg8ylation affected GLUT1 sorting. However, knocking out other essential autophagy genes such as ATG13 and FIP200 did not affect GLUT1 sorting. These findings suggest that ATG5 participates in the function of the Retromer in a noncanonical autophagy manner. Overall, the methods and techniques employed by the authors largely support their conclusions. These findings are intriguing and significant, enriching our understanding of the non-autophagic functions of autophagy proteins and the sorting of GLUT1. Nevertheless, there are several issues that the authors need to address to further clarify their conclusions. 

      (1) The authors confirmed the interaction between Atg5 and the Retromer complex through Co-IP experiments. Is the interaction between Atg5 and the Retromer direct? If it is direct, which Retromer complex protein regulates the interaction with Atg5? Additionally, does ATG5 K130R mutant enhance its interaction with the Retromer? 

      AlphaFold modeling in the initial submission of our study to eLife (absent from the current version) suggested the possibility of a direct interaction between ATG5 and VPS35 with ATG12—ATG5 complex facing outwards, in which case K130R would not matter. However, mutational experiments in putative contact residues did not alter association in co-IPs. So either ATG5 interacts with other retromer subunits or more likely is in a larger protein complex containing retromer. It will take a separate study to dissect associations and find direct interaction partners. We can provide our data on the currently available modeling and mutational analyses in a full point-for-point rebuttal but believe that since they are inconclusive, they should not be included in the study.

      (2) To more directly elucidate how ATG5 regulates Retromer function by interacting with the Retromer and participates in the trafficking of GLUT1 to the plasma membrane, the authors should identify which region or crucial amino acid residues of ATG5 regulate its interaction with the Retromer. Additionally, they should test whether mutations in ATG5 that disrupt its interaction with the Retromer affect Retromer function (such as participating in the trafficking of GLUT1 to the plasma membrane) and whether they affect Atg8ylation. They also need to assess whether these mutations influence canonical autophagy and lysosomal sensitivity to damage. 

      Please see the response to point 1.

      We thank the editors and reviewers for their assessment, constructive criticisms and recommendations.

    1. eLife assessment

      This study presents an important finding of dynamic reprogramming of global H3K4me2 during mouse oocyte-to-embryo transition. While the H3K4me2 epigenome data is convincing, the interpretation and the potential mechanistic claims of the authors are incomplete in the current shape with the primary concerns regarding the contribution of Kdm1b or Kdm1a, as well as the specificity of the inhibitor and the antibody. The work will be of interest to researchers interested in epigenetic reprogramming.

    2. Reviewer #1 (Public Review):

      By mapping H3K4me2 in mouse oocytes and pre-implantation embryos, the authors aim to elucidate how this histone modification is erased and re-established during the parental-to-zygotic transition, as well as how the reprogramming of H3K4me2 regulates gene expression and facilitates zygotic genome activation.

      Employing an improved CUT&RUN approach, the authors successfully generated H3K4me2 profiling data from a limited number of embryos. While the profiling experiments are very well executed, several weaknesses, particularly in data analysis, are apparent:

      (1) The study emphasizes H3K4me2, which often serves as a precursor to H3K4me3, a well-studied modification during early development. Analyzing the new H3K4me2 dataset alongside published H3K4me3 data is crucial for comprehensively understanding epigenetic reprogramming post-fertilization and the interplay between histone modifications. However, the current analysis is preliminary and lacks depth.

      (2) Tranylcypromine (TCP) is known as an irreversible inhibitor of monoamine oxidase and LSD1. While the authors suggest TCP inhibits the expression of LSD2, this assertion is questionable. Given TCP's potential non-specific effects in cells, conclusions related to the experiments using TCP should be made with caution.

      (3) Some batches of H3K4me2 antibody are known to cross-react with H3K4me3. Has the H3K4me2 antibody used in CUT&RUN been tested for such cross-reactivity? Heatmaps in the figures indeed show similar distribution for H3K4me2 and H3K4me3, further raising concerns about antibody specificity.

      (4) Certain statements lack supporting references or figures (examples on page 9 can be found on line 245, line 254, and line 258).

      (5) Extensive language editing is recommended to clarify ambiguous sentences. Additionally, caution should be taken to avoid overstatement - most analyses in this study only suggest correlation rather than causality.

    3. Reviewer #2 (Public Review):

      Chong Wang et al. investigated the role of H3K4me2 during the reprogramming processes in mouse preimplantation embryos. The authors show that H3K4me2 is erased from GV to MII oocytes and re-established in the late 2-cell stage by performing Cut & Run H3K4me2 and immunofluorescence staining. Erasure and re-establishment of H3K4me2 have not been studied well, and profiling of H3K4me2 in germ cells and preimplantation embryos is valuable to understanding the reprogramming process and epigenetic inheritance.

      (1) The authors claim that the Cut & Run worked for MII oocytes, zygotes, and the 2-cell embryos. However, it is unclear if H3K4me2 is erased during the stage or if the Cut & Run did not work for these samples. To support the hypothesis of the erasure of H3K4me2, the authors conducted immunofluorescence staining, and H3K4me2 was undetected in the MII oocyte, PN5, and 2-cell stage. However, the published papers showed strong staining of H3K4me2 at the zygote stage and 2-cell stage (Ancelin et al., 2016; Shao et al., 2014). The authors need to cite these papers and discuss the contradictory findings.

      The authors used 165 MII oocytes and 190 GV oocytes for the Cut & Run. The amount of DNA in MII oocytes is halved because of the emission of the first polar body. Would it be a reason that H3K4me2 has fewer H3K4me2 peaks in MII oocytes than GV oocytes?

      In Figure 3C, 98% (13,183/13,428) of H3K4me2 marked genes in GV oocytes overlap with those in the 4-cell stage. Furthermore, 92% (14,049/15,112) of H3K4me2 marked genes in sperm overlap with those in the 4-cell stage. Therefore, most regions maintain germ line-derived H3K4me2 in the 4-cell stage. The authors need to clarify which regions of germ line-derived H3K4me2 are maintained or erased in preimplantation embryos. Additionally, it would be interesting to investigate which regions show the parental allele-specific H3K4me2 in preimplantation embryos since the authors used hybrid preimplantation embryos (B6 x DBA).

      (2) The authors claim that Kdm1a is rarely expressed during mouse embryonic development (Figure 4A). However, the published paper showed that KDM1a is present in the zygote and 2-cell stage using immunostaining and western blotting (Ancelin et al., 2016). Additionally, this paper showed that depletion of maternal KDM1A protein results in developmental arrest at the two-cell stage, and therefore, KDM1a is functionally important in early development. The authors should have cited the paper and described the role of KDM1a in early embryos.

      (3) The authors used the published RNA data set and interpreted that KDM1B (LSD2) was highly expressed at the MII stage (Figure S3A). However, the heat map shows that KDM1B expression is high in growing oocytes but not at 8w_oocytes and MII oocytes. The authors need to interpret the data accurately.

      (4) All embryos in the TCP group were arrested at the four-cell stage. Embryos generated from KDM1b KO females can survive until E10.5 (Ciccone et al., 2009); therefore, TCP-treated embryos show a more severe phenotype than oocyte-derived KDM1b deleted embryos. Depletion of maternal KDM1A protein results in developmental arrest at the two-cell stage (Ancelin et al., 2016). The authors need to examine whether TCP treatment affects KDM1a expression. Western blotting would be recommended to quantify the expression of KDM1A and KDM1B in the TCP-treated embryos.

      (5) H3K4me2 is increased dramatically in the TCP-treated embryos in Figure 4 (the intensity is 1,000 times more than the control). However, the Cut & Run H3K4me2 shows that the H3K4me2 signal is increased in 251 genes and decreased in 194 genes in the TCP-treated embryos (Fold changes > 2, P < 0.01). The authors need to explain why the gain of H3K4me2 is less evident in the Cut & Run data set than in the immunofluorescence result.

      References

      Ancelin, K., Syx, L., Borensztein, M., Ranisavljevic, N., Vassilev, I., Briseño-Roa, L., Liu, T., Metzger, E., Servant, N., Barillot, E., Chen, C.-J., Schüle, R., & Heard, E. (2016). Maternal LSD1/KDM1A is an essential regulator of chromatin and transcription landscapes during zygotic genome activation. https://doi.org/10.7554/eLife.08851.001

      Ciccone, D. N., Su, H., Hevi, S., Gay, F., Lei, H., Bajko, J., Xu, G., Li, E., & Chen, T. (2009). KDM1B is a histone H3K4 demethylase required to establish maternal genomic imprints. Nature, 461(7262), 415-418. https://doi.org/10.1038/nature08315

      Shao, G. B., Chen, J. C., Zhang, L. P., Huang, P., Lu, H. Y., Jin, J., Gong, A. H., & Sang, J. R. (2014). Dynamic patterns of histone H3 lysine 4 methyltransferases and demethylases during mouse preimplantation development. In Vitro Cellular and Developmental Biology - Animal, 50(7), 603-613. https://doi.org/10.1007/s11626-014-9741-6

    4. Reviewer #3 (Public Review):

      Summary:

      This study explores the dynamic reprogramming of histone modification H3K4me2 during the early stages of mammalian embryogenesis. Utilizing the advanced CUT&RUN technique coupled with high-throughput sequencing, the authors investigate the erasure and re-establishment of H3K4me2 in mouse germinal vesicle (GV) oocytes, metaphase II (MII) oocytes, and early embryos.

      Strengths:

      The findings provide valuable insights into the temporal and spatial dynamics of H3K4me2 and its potential role in zygotic genome activation (ZGA).

      Weaknesses:

      The study primarily remains descriptive at this point. It would be advantageous to conduct further comprehensive functional validation and mechanistic exploration.

      Key areas for improvement include enhancing the innovation and novelty of the study, providing robust functional validation, establishing a clear model for H3K4me2's role, and addressing technical and presentation issues. The text would benefit from the introduction of a novel conceptual framework or model that provides a clear explanation of the functional consequences and molecular mechanisms underlying H3K4me2 reprogramming in the transition from parental to early embryonic development.

      While the findings are significant, the current manuscript falls short in several critical areas. Addressing major and minor issues will significantly strengthen the study's contribution to the field of epigenetic reprogramming and embryonic development.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public Review): 

      By mapping H3K4me2 in mouse oocytes and pre-implantation embryos, the authors aim to elucidate how this histone modification is erased and re-established during the parental-to-zygotic transition, as well as how the reprogramming of H3K4me2 regulates gene expression and facilitates zygotic genome activation.

      Employing an improved CUT&RUN approach, the authors successfully generated H3K4me2 profiling data from a limited number of embryos. While the profiling experiments are very well executed, several weaknesses, particularly in data analysis, are apparent:

      (1) The study emphasizes H3K4me2, which often serves as a precursor to H3K4me3, a well-studied modification during early development. Analyzing the new H3K4me2 dataset alongside published H3K4me3 data is crucial for comprehensively understanding epigenetic reprogramming post-fertilization and the interplay between histone modifications. However, the current analysis is preliminary and lacks depth.

      Thank you very much for your valuable suggestions. H3K4me3 data for humans and mice have been published, and our previous data revealed the unique pattern of H3K4me3 in human oocytes and early embryos (Xia et al., 2019). This study therefore focuses mainly on the localization of H3K4me2 in mouse oocytes and preimplantation embryos, how it is erased and re-established during the mammalian parental-to-zygotic transition, and its function. A combined analysis of H3K4me2 and H3K4me3 is not our main aim, although we do not rule out that it could reveal new relationships between these two histone modifications. Our data so far tend to show that H3K4me2 not only acts as a precursor of H3K4me3 but also plays an independent role.

      (2) Tranylcypromine (TCP) is known as an irreversible inhibitor of monoamine oxidase and LSD1. While the authors suggest TCP inhibits the expression of LSD2, this assertion is questionable. Given TCP's potential non-specific effects in cells, conclusions related to the experiments using TCP should be made with caution.

      Thank you for pointing this out, and for the important suggestion. Previous studies indicate that TCP is an irreversible inhibitor of both LSD1 and LSD2 (Binda et al., 2010; Fang et al., 2010). According to our data, however, the level of LSD1 is very low in early mouse embryos, so in this setting TCP mainly inhibits the function of LSD2.

      (3) Some batches of H3K4me2 antibody are known to cross-react with H3K4me3. Has the H3K4me2 antibody used in CUT&RUN been tested for such cross-reactivity? Heatmaps in the figures indeed show similar distribution for H3K4me2 and H3K4me3, further raising concerns about antibody specificity.

      We thank the reviewer for the insightful comments. The H3K4me2 antibody was purchased from Millipore (cat. 07030). Figure 2A shows the specific enrichment of H3K4me2 in promoter and distal regions. Some batches of H3K4me2 antibody are known to cross-react with H3K4me3, but the H3K4me2 antibody we used in our CUT&RUN seems to have low cross-reactivity.

      (4) Certain statements lack supporting references or figures (examples on page 9 can be found on line 245, line 254, and line 258).

      Thank you for pointing this out, and we will add references to support the statement in the paper as suggested.

      (5) Extensive language editing is recommended to clarify ambiguous sentences. Additionally, caution should be taken to avoid overstatement - most analyses in this study only suggest correlation rather than causality.

      Thank you for your kind comments. We will revise the expression in the manuscript later.

      Reviewer #2 (Public Review):

      Chong Wang et al. investigated the role of H3K4me2 during the reprogramming processes in mouse preimplantation embryos. The authors show that H3K4me2 is erased from GV to MII oocytes and re-established in the late 2-cell stage by performing Cut & Run H3K4me2 and immunofluorescence staining. Erasure and re-establishment of H3K4me2 have not been studied well, and profiling of H3K4me2 in germ cells and preimplantation embryos is valuable to understanding the reprogramming process and epigenetic inheritance.

      (1) The authors claim that the Cut & Run worked for MII oocytes, zygotes, and the 2-cell embryos. However, it is unclear if H3K4me2 is erased during the stage or if the Cut & Run did not work for these samples. To support the hypothesis of the erasure of H3K4me2, the authors conducted immunofluorescence staining, and H3K4me2 was undetected in the MII oocyte, PN5, and 2-cell stage. However, the published papers showed strong staining of H3K4me2 at the zygote stage and 2-cell stage (Ancelin et al., 2016; Shao et al., 2014). The authors need to cite these papers and discuss the contradictory findings.

      The authors used 165 MII oocytes and 190 GV oocytes for the Cut & Run. The amount of DNA in MII oocytes is halved because of the emission of the first polar body. Would it be a reason that H3K4me2 has fewer H3K4me2 peaks in MII oocytes than GV oocytes?

      First of all, thank you for your valuable advice. The published papers showed strong staining of H3K4me2 at the zygote and 2-cell stages, which is interesting. We think we may have used different confocal imaging parameters from those studies (Ancelin et al., 2016). We imaged all stages from the GV stage through the blastocyst stage with the same parameters. Had we imaged only the fertilized egg and the 2-cell stage, we might also have seen weak fluorescence at the 2-cell stage under different parameters. We will cite this reference and discuss it in the resubmitted version.

      Moreover, you suggested that MII oocytes may have fewer H3K4me2 peaks than GV oocytes because the MII oocyte has expelled the first polar body. The logic is sound. However, the first polar body remains within the zona pellucida after it is expelled, and we collected it together with the oocyte in the CUT&RUN experiment; therefore, the DNA content of the MII samples is not halved relative to the GV samples. After further discussion, we believe that the reduction of H3K4me2 peaks at the MII stage compared with the GV stage may be closely related to oocyte maturation: stage-specific histone modifications allow chromatin structure to change appropriately across the different stages of meiosis. It has been confirmed that H3K4me3 gradually decreases from the GV to the MII stage during human oocyte maturation, whereas H3K27me3 does not change from the GV to the MII stage.

      In Figure 3C, 98% (13,183/13,428) of H3K4me2 marked genes in GV oocytes overlap with those in the 4-cell stage. Furthermore, 92% (14,049/15,112) of H3K4me2 marked genes in sperm overlap with those in the 4-cell stage. Therefore, most regions maintain germ line-derived H3K4me2 in the 4-cell stage. The authors need to clarify which regions of germ line-derived H3K4me2 are maintained or erased in preimplantation embryos. Additionally, it would be interesting to investigate which regions show the parental allele-specific H3K4me2 in preimplantation embryos since the authors used hybrid preimplantation embryos (B6 x DBA).

      Thank you very much for your suggestion. Further analysis of which regions show parental allele-specific H3K4me2 in preimplantation embryos will make the study more interesting. We will discuss this in depth in the resubmitted version.

      (2) The authors claim that Kdm1a is rarely expressed during mouse embryonic development (Figure 4A). However, the published paper showed that KDM1a is present in the zygote and 2-cell stage using immunostaining and western blotting (Ancelin et al., 2016). Additionally, this paper showed that depletion of maternal KDM1A protein results in developmental arrest at the two-cell stage, and therefore, KDM1a is functionally important in early development. The authors should have cited the paper and described the role of KDM1a in early embryos.

      In our analysis of this experiment, we consider the expression of KDM1A during early mouse embryonic development to be low relative to that of KDM1B. Similarly, the transcriptome data we cite also show that KDM1A is expressed at elevated levels during oocyte maturation and fertilization compared to immature oocytes. In addition, the effects of loss of maternal KDM1a on embryonic development were not discussed in our manuscript. We believe that the absence of maternal KDM1b blocks embryonic development, and we will cite and discuss the references later.

      (3) The authors used the published RNA data set and interpreted that KDM1B (LSD2) was highly expressed at the MII stage (Figure S3A). However, the heat map shows that KDM1B expression is high in growing oocytes but not at 8w_oocytes and MII oocytes. The authors need to interpret the data accurately.

      After re-checking the data, we found that there was a problem with the normalization method of our heat map, and we will re-make the heatmap and submit it in the modified version. With reference to Figure 4A, the content of Kdm1b is indeed higher than that of Kdm1a.

      (4) All embryos in the TCP group were arrested at the four-cell stage. Embryos generated from KDM1b KO females can survive until E10.5 (Ciccone et al., 2009); therefore, TCP-treated embryos show a more severe phenotype than oocyte-derived KDM1b deleted embryos. Depletion of maternal KDM1A protein results in developmental arrest at the two-cell stage (Ancelin et al., 2016). The authors need to examine whether TCP treatment affects KDM1a expression. Western blotting would be recommended to quantify the expression of KDM1A and KDM1B in the TCP-treated embryos.

      We will further mine the transcriptome data to confirm the specificity of TCP for KDM1b. In addition, TCP treatment of whole fertilized eggs in this study increased the H3K4me2 content, and the resulting delay in embryonic development was more pronounced than that obtained by crossing females in which KDM1B had been knocked down with normal males.

      (5) H3K4me2 is increased dramatically in the TCP-treated embryos in Figure 4 (the intensity is 1,000 times more than the control). However, the Cut & Run H3K4me2 shows that the H3K4me2 signal is increased in 251 genes and decreased in 194 genes in the TCP-treated embryos (Fold changes > 2, P < 0.01). The authors need to explain why the gain of H3K4me2 is less evident in the Cut & Run data set than in the immunofluorescence result.

      Thanks a lot for your question. In the experimental group, the H3K4me2 fluorescence intensity in IF increased 1,000-fold (Figure 4E), whereas in CUT&RUN (CR) the H3K4me2 signal was up- or down-regulated at a total of 445 genes (Figure 6A). In our opinion, immunofluorescence is a semi-quantitative analysis and cannot be directly compared with the quantitative CR analysis, because the two rely on different analysis models and threshold settings.

      References

      Ancelin, K., Syx, L., Borensztein, M., Ranisavljevic, N., Vassilev, I., Briseño-Roa, L., Liu, T., Metzger, E., Servant, N., Barillot, E., Chen, C.-J., Schüle, R., & Heard, E. (2016). Maternal LSD1/KDM1A is an essential regulator of chromatin and transcription landscapes during zygotic genome activation. https://doi.org/10.7554/eLife.08851.001

      Ciccone, D. N., Su, H., Hevi, S., Gay, F., Lei, H., Bajko, J., Xu, G., Li, E., & Chen, T. (2009). KDM1B is a histone H3K4 demethylase required to establish maternal genomic imprints. Nature, 461(7262), 415-418. https://doi.org/10.1038/nature08315

      Shao, G. B., Chen, J. C., Zhang, L. P., Huang, P., Lu, H. Y., Jin, J., Gong, A. H., & Sang, J. R. (2014). Dynamic patterns of histone H3 lysine 4 methyltransferases and demethylases during mouse preimplantation development. In Vitro Cellular and Developmental Biology - Animal, 50(7), 603-613. https://doi.org/10.1007/s11626-014-9741-6

      References

      Xia W, Xu J, Yu G, Yao G, Xu K, Ma X, Zhang N, Liu B, Li T, Lin Z, Chen X, Li L, Wang Q, Shi D, Shi S, Zhang Y, Song W, Jin H, Hu L, Bu Z, Wang Y, Na J, Xie W, Sun YP. Resetting histone modifications during human parental-to-zygotic transition. Science. 2019 Jul 26;365(6451):353-360. doi: 10.1126/science.aaw5118. Epub 2019 Jul 4. PMID: 31273069.

      Binda C, Valente S, Romanenghi M, Pilotto S, Cirilli R, Karytinos A, Ciossani G, Botrugno OA, Forneris F, Tardugno M, Edmondson DE, Minucci S, Mattevi A, Mai A. Biochemical, structural, and biological evaluation of tranylcypromine derivatives as inhibitors of histone demethylases LSD1 and LSD2. J Am Chem Soc. 2010 May 19;132(19):6827-33.

      Fang R, Barbera AJ, Xu Y, Rutenberg M, Leonor T, Bi Q, Lan F, Mei P, Yuan GC, Lian C, Peng J, Cheng D, Sui G, Kaiser UB, Shi Y, Shi YG. Human LSD2/KDM1b/AOF1 regulates gene transcription by modulating intragenic H3K4me2 methylation. Mol Cell. 2010 Jul 30;39(2):222-33. doi: 10.1016/j.molcel.2010.07.008. PMID: 20670891; PMCID: PMC3518444.

      Ancelin K, Syx L, Borensztein M, Ranisavljevic N, Vassilev I, Briseño-Roa L, Liu T, Metzger E, Servant N, Barillot E, Chen CJ, Schüle R, Heard E. Maternal LSD1/KDM1A is an essential regulator of chromatin and transcription landscapes during zygotic genome activation. Elife. 2016 Feb 2;5:e08851. doi: 10.7554/eLife.08851. PMID: 26836306; PMCID: PMC4829419.

      Reviewer #3 (Public Review):

      Summary:

      This study explores the dynamic reprogramming of histone modification H3K4me2 during the early stages of mammalian embryogenesis. Utilizing the advanced CUT&RUN technique coupled with high-throughput sequencing, the authors investigate the erasure and re-establishment of H3K4me2 in mouse germinal vesicle (GV) oocytes, metaphase II (MII) oocytes, and early embryos.

      Strengths:

      The findings provide valuable insights into the temporal and spatial dynamics of H3K4me2 and its potential role in zygotic genome activation (ZGA).

      Weaknesses:

      The study primarily remains descriptive at this point. It would be advantageous to conduct further comprehensive functional validation and mechanistic exploration.

      Key areas for improvement include enhancing the innovation and novelty of the study, providing robust functional validation, establishing a clear model for H3K4me2's role, and addressing technical and presentation issues. The text would benefit from the introduction of a novel conceptual framework or model that provides a clear explanation of the functional consequences and molecular mechanisms underlying H3K4me2 reprogramming in the transition from parental to early embryonic development.

      While the findings are significant, the current manuscript falls short in several critical areas. Addressing major and minor issues will significantly strengthen the study's contribution to the field of epigenetic reprogramming and embryonic development.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Public Review):

      The authors did a great job addressing the weaknesses I raised in the previous round of review, except on the generalizability of the current result in the larger context of multi-attribute decision-making. It is not really a weakness of the manuscript but more of a limitation of the studied topic, so I want to keep this comment for public readers.

      The reward magnitude and probability information are displayed using rectangular bars of different colors and orientations. Would that bias subjects to choose an additive rule instead of the multiplicative rule? Also, could the conclusion be extended to other decision contexts such as quality and price, where a multiplicative rule is hard to formulate?

      We thank the reviewer for the comment. With regard to whether the current type of stimuli may have biased participants to use an additive rule rather than a multiplicative one, we believe many other forms of stimuli for representing choice attributes would be equally likely to cause a similar bias. This is because the additive strategy is an inherently simple and natural way to integrate different pieces of non-interacting information. More importantly, even though it is easy to employ an additive strategy, most participants still employed the multiplicative rule to some degree. However, it would indeed be interesting for future studies to explore whether the current composite model remains dominant in situations where the optimal solutions require an additive or subtractive rule, such as those concerning quality and price.

      “The same would apply even with a different choice of cues as long as the information is conveyed by two independent visual features.”

      “While the additive strategy is a natural and simple approach for integrating non-interacting pieces of information, to some extent, participants also used the multiplicative strategy that was optimal in the current experiment. A general question for such composite models is whether people mix two strategies in a consistent manner on every trial or whether there is some form of probabilistic selection occurring between the two strategies on each trial such that only one strategy is used on any given trial while, on average, one strategy is more probable than the other. It would also be interesting to examine whether a composite model is appropriate in contexts where the optimal solution is additive or subtractive, such as those concerning quality and price.”


      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The current study provided a follow-up analysis using published datasets focused on the individual variability of both the distraction effect (size and direction) and the attribute integration style, as well as the association between the two. The authors tried to answer the question of whether the multiplicative attribute integration style concurs with a more pronounced and positively oriented distraction effect.

      Strengths:

      The analysis extensively examined the impacts of various factors on decision accuracy, with a particular focus on using two-option trials as control trials, following the approach established by Cao & Tsetsos (2022). The statistical significance results were clearly reported.

      The authors meticulously conducted supplementary examinations, incorporating the additional term HV+LV into GLM3. Furthermore, they replaced the utility function from the expected value model with values from the composite model.

      We thank the reviewer for the positive response and are pleased that the reviewer found our report interesting.

      Reviewer #1 Comment 1

      Weaknesses:

      There are several weaknesses in terms of theoretical arguments and statistical analyses.

      First, the manuscript suggests in the abstract and at the beginning of the introduction that the study reconciled the "different claims" about "whether distraction effect operates at the level of options' component attributes rather than at the level of their overall value" (see line 13-14), but the analysis conducted was not for that purpose. Integrating choice attributes in either an additive or multiplicative way only reflects individual differences in combining attributes into the overall value. The authors seemed to assume that the multiplicative way generated the overall value ("Individuals who tended to use a multiplicative approach, and hence focused on overall value", line 20-21), but such implicit assumption is at odds with the statement in line 77-79 that people may use a simpler additive rule to combine attributes, which means overall value can come from the additive rule.

      We thank the reviewer for the comment. We have made adjustments to the manuscript to ensure that the message delivered within this manuscript is consistent. Within this manuscript, our primary focus is on the different methods of value integration in which the overall value is computed (i.e., additive, multiplicative, or both), rather than the interaction at the individual level of attributes. However, we do not exclude the possibility that the distractor effect may occur at multiple levels. Nevertheless, in light of the reviewer’s comment, we agree that we should focus the argument on whether distractors facilitate or impair decision making and downplay the separate argument about the level at which distractor effects operate. We have now revised the abstract:

      “It is widely agreed that people make irrational decisions in the presence of irrelevant distractor options. However, there is little consensus on whether decision making is facilitated or impaired by the presence of a highly rewarding distractor or whether the distraction effect operates at the level of options’ component attributes rather than at the level of their overall value. To reconcile different claims, we argue that it is important to incorporate consideration of the diversity of people’s ways of decision making. We focus on a recent debate over whether people combine choice attributes in an additive or multiplicative way. Employing a multi-laboratory dataset investigating the same decision making paradigm, we demonstrated that people used a mix of both approaches and the extent to which approach was used varied across individuals. Critically, we identified that this variability was correlated with the effect of the distractor on decision making. Individuals who tended to use a multiplicative approach to compute value, showed a positive distractor effect. In contrast, in individuals who tended to use an additive approach, a negative distractor effect (divisive normalisation) was prominent. These findings suggest that the distractor effect is related to how value is constructed, which in turn may be influenced by task and subject specificities. Our work concurs with recent behavioural and neuroscience findings that multiple distractor effects co-exist.” (Lines 12-26)

      Furthermore, we acknowledge that the current description of the additive rule could be interpreted in several ways. The current additive utility model is described as:

      $$U = \eta X + (1 - \eta) P$$

      where $U$ is the option’s utility, $X$ is the reward magnitude, $P$ is the probability, and $\eta$ is the magnitude/probability weighting ratio ($0 \le \eta \le 1$). If we compare values according to this model (i.e., HV against LV), we arrive at the following comparison:

      $$\eta X_{HV} + (1 - \eta) P_{HV} > \eta X_{LV} + (1 - \eta) P_{LV} \tag{1}$$

      If we rearrange (1), we arrive at:

      $$\eta (X_{HV} - X_{LV}) + (1 - \eta)(P_{HV} - P_{LV}) > 0 \tag{2}$$

      While equations (1) and (2) are mathematically equivalent, equation (1) illustrates the interpretation in which the utilities are compared after the attributes have been integrated into an overall value. On the other hand, equation (2) can be broadly interpreted as the comparison of individual attributes in the absence of an overall value estimate for each option. Nonetheless, while we do not exclude the possibility that the distractor effect may occur at multiple levels, we have modified the main manuscript to use terminology that refers more consistently to different methods of value estimation, while recognizing that our empirical results are compatible with both interpretations.
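
      The equivalence of the two readings can be checked numerically. The following is a minimal sketch, assuming the additive utility is a weighted sum of magnitude and probability with a single weighting ratio; the function names, the values of `eta`, and the example options are illustrative assumptions and are not taken from the manuscript.

```python
def additive_utility(magnitude, probability, eta):
    """Weighted sum of magnitude and probability (both assumed rescaled to [0, 1])."""
    return eta * magnitude + (1.0 - eta) * probability


def compare_integrated(hv, lv, eta):
    """Reading (1): form an overall value for each option, then compare."""
    return additive_utility(hv[0], hv[1], eta) - additive_utility(lv[0], lv[1], eta)


def compare_by_attribute(hv, lv, eta):
    """Reading (2): compare attribute by attribute, then weight the differences."""
    return eta * (hv[0] - lv[0]) + (1.0 - eta) * (hv[1] - lv[1])


# Hypothetical options given as (magnitude, probability) pairs.
hv, lv = (0.8, 0.5), (0.4, 0.7)
for eta in (0.2, 0.5, 0.8):
    d1 = compare_integrated(hv, lv, eta)
    d2 = compare_by_attribute(hv, lv, eta)
    print(f"eta={eta}: integrated={d1:+.3f}, attribute-wise={d2:+.3f}")
```

      Both functions return the same decision variable for any `eta`, which is why choice data alone cannot separate the two readings unless the within-attribute comparison includes additional mechanisms.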

      Reviewer #1 Comment 2

      The second weakness is sort of related but is more about the lack of coherent conceptual understanding of the "additive rule", or "distractor effect operates at the attribute level". In an assertive tone (lines 77-80), the manuscript suggests that a weighted sum integration procedure of implementing an "additive rule" is equal to assuming that people compare pairs of attributes separately, without integration. But they are mechanistically distinct. The additive rule (implemented using the weighted sum rule to combine probability and magnitude within each option and then applying the softmax function) assumes value exists before comparing options. In contrast, if people compare pairs of attributes separately, preference forms based on the within-attribute comparisons. Mathematically these two might be equivalent only if no extra mechanisms (such as inhibition, fluctuating attention, evidence accumulation, etc) are included in the within-attribute comparison process, which is hardly true in the three-option decision.

      We thank the reviewer for the comment. As described in our response to Reviewer #1 Comment 1, we acknowledge that there may be multiple possible interpretations of the additive rule. We also agree with the reviewer that additional mechanisms may be involved in three- or even two-option decisions, but these would require additional studies to tease apart. Another motivation for the approach used here, which does not explicitly model the extra mechanisms the reviewer refers to, was the intention to address and integrate findings from previous studies using the same dataset [i.e., (Cao & Tsetsos, 2022; Chau et al., 2020)]. Lastly, regardless of the mechanistic interpretation, our results show a systematic difference in the process of value estimation. Modifications to the manuscript text have been made consistent with our motivation (please refer to the reply and the textual changes proposed in response to the reviewer’s previous comment: Reviewer #1 Comment 1).

      Reviewer #1 Comment 3

      Could the authors comment on the generalizability of the current result? The reward magnitude and probability information are displayed using rectangular bars of different colors and orientations. Would that bias subjects to choose an additive rule instead of the multiplicative rule? Also, could the conclusion be extended to other decision contexts such as quality and price, where a multiplicative rule is hard to formulate?

      We thank the reviewer for the comment. We agree with the observation that the stimulus space, with colour linearly correlated with magnitude and orientation linearly correlated with probability, may bias subjects towards an additive rule. But that is indeed the point: in order to maximise reward, subjects should have focused on the outcome space without being driven by the stimulus space. In practice, people are more or less successful in such an endeavour. Nevertheless, we argue that the specific choice of visual stimuli we used is no more biased towards an additive rule than any other. In fact, as long as two or more pieces of information are provided for each option, as opposed to a single cue whose value was previously learned, there will always be a bias towards an additive heuristic (a linear combination), regardless of whether the cues are shapes, colours, graphs, numbers, or words.

      As the reviewer suggested, the dataset analyzed in the current manuscript indicates that the participants were leaning towards the additive rule. Although there was a general tendency to use the additive rule when choosing between the rectangular bars, we can still observe a spread of individuals using either, or both, additive and multiplicative rules, suggesting that there was indeed diversity in participants’ decision making strategies in our data.

      In previous studies, it was observed that human and non-human individuals used a mix of multiplicative and additive rules when they were tested on experimental paradigms different from ours (Bongioanni et al., 2021; Farashahi et al., 2019; Scholl et al., 2014). It was also observed that positive and negative distractor effects can be both present in the same data set when human and non-human individuals made decisions about food and social partner (Chang et al., 2019; Louie et al., 2013). It was less clear in the past whether the precise way a distractor affects decision making (i.e., positive/negative distractor effect) is related to the use of decision strategy (i.e., multiplicative/additive rules) and this is exactly what we are trying to address in this manuscript. A follow-up study looking at neural data (such as functional magnetic resonance imaging data) could provide a better understanding of the mechanistic nature of the relationship between distractor effects and decision strategy that we identified here.

      We agree with the reviewer that a multiplicative strategy may not be applicable to some decision contexts. Here it is important to look at the structure of the optimal solution (the one maximizing value in the long run). Factors modulating value (such as probability and temporal delay) require a non-linear solution (e.g., multiplication), while factors of the cost-benefit form (such as effort and price) require a linear solution (e.g., subtraction). In the latter scenario the additive heuristic would coincide with the optimal solution, and the effect addressed in this study may not be revealed. Nonetheless, the present data support the notion of distinct neural mechanisms at least for probabilistic decision-making, and this notion is likely applicable to decision-making in general.

      Our findings, in conjunction with the literature, also suggest that a positive distractor effect could be a general phenomenon in decision mechanisms that involve the medial prefrontal cortex. For example, it has been shown that the positive distractor effect is related to a decision mechanism linked to the medial prefrontal cortex [especially the ventromedial prefrontal cortex (Chau et al., 2014; Noonan et al., 2017)]. It is also known that a similar brain region is involved not only when individuals are combining information using a multiplicative strategy (Bongioanni et al., 2021), but also when they are combining information to evaluate new experiences or generalize information (Baram et al., 2021; Barron et al., 2013; Park et al., 2021). We have now revised the Discussion to explain this:

      “In contrast, the positive distractor effect is mediated by the mPFC (Chau et al., 2014; Fouragnan et al., 2019). Interestingly, the same or adjacent, interconnected mPFC regions have also been linked to the mechanisms by which representational elements are integrated into new representations (Barron et al., 2013; Klein-Flügge et al., 2022; Law et al., 2023; Papageorgiou et al., 2017; Schwartenbeck et al., 2023). In a number of situations, such as multi-attribute decision making, understanding social relations, and abstract knowledge, the mPFC achieves this by using a spatial map representation characterised by a grid-like response (Constantinescu et al., 2016; Bongioanni et al., 2021; Park et al., 2021) and disrupting mPFC leads to the evaluation of composite choice options as linear functions of their components (Bongioanni et al., 2021). These observations suggest a potential link between positive distractor effects and mechanisms for evaluating multiple component options and this is consistent with the across-participant correlation that we observed between the strength of the positive distractor effect and the strength of non-additive (i.e., multiplicative) evaluation of the composite stimuli we used in the current task. Hence, one direction for model development may involve incorporating the ideas that people vary in their ways of combining choice attributes and each way is susceptible to different types of distractor effect.” (Lines 260-274)

      Reviewer #1 Comment 4

      The authors did careful analyses on quantifying the "distractor effect". While I fully agree that it is important to use the matched two-option trials and examine the interaction terms (DV-HV)T as a control, the interpretation of the results becomes tricky when looking at the effects in each trial type. Figure 2c shows a positive DV-HV effect in two-option trials whereas the DV-HV effect was not significantly stronger in three-option trials. Further in Figure 5b,c, in the Multiplicative group, the effect of DV-HV was absent in the two-option trials and present in the three-option trials. In the Additive group, however, the effect of DV-HV was significantly positive in the two-option trials but was significantly lowered in the three-option trials. Hence, it seems the different distractor effects were driven by the different effects of DV-HV in the two-option trials, rather than the three-option trials?

      We thank the reviewer for the comment. While it may be a bit more difficult to interpret, the current method of examining the (DV−HV)T term rather than the (DV−HV) term was used because it was the approach used in a previous study (Cao & Tsetsos, 2022).

      During the design of the original experiments, trials were generated pseudo-randomly until the DV was sufficiently decorrelated from HV−LV. While this method allows for better group-level examination of behaviour, Cao and Tsetsos were concerned that this approach may have introduced unintended confounding covariations to some trials. In theory, one of the unintended covariations could occur between the DV and specific sets of reward magnitude and probability of the HV and LV. The covariation between parameters can lead to an observable positive distractor effect in the DV−HV as a consequence of the attraction effect or an unintended byproduct of using an additive method of integrating attributes [for further elaboration, please refer to Figure 1 in (Cao & Tsetsos, 2022)]. While it may have some limitations, the approach suggested by Cao and Tsetsos has the advantage of leveraging the DV−HV term to absorb any variance contributed by possible confounding factors such that true distractor effects, if any, can be detected using the (DV−HV)T term.
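
      The logic of the interaction term can be sketched as follows. This is a simplified illustration and not the manuscript's full GLM: the regressor set is reduced, no z-scoring is applied, and the function name and example values are hypothetical. It also assumes that each two-option trial carries the notional distractor value of its matched three-option trial.

```python
def glm2_regressors(hv, lv, dv, ternary):
    """Build regressors for one trial of a simplified GLM2-style model.

    hv, lv, dv : values of the high-value, low-value and distractor options
                 (on two-option trials, dv is the notional distractor taken
                 from the matched three-option trial; assumed here).
    ternary    : True for three-option trials, False for two-option trials.
    """
    t = 1.0 if ternary else 0.0
    return {
        "intercept": 1.0,
        "HV-LV": hv - lv,
        "DV-HV": dv - hv,           # present on both trial types
        "T": t,                     # trial-type indicator
        "(DV-HV)T": (dv - hv) * t,  # non-zero only on three-option trials
    }


# Hypothetical matched pair of trials with identical option values.
ternary_row = glm2_regressors(hv=0.6, lv=0.3, dv=0.2, ternary=True)
binary_row = glm2_regressors(hv=0.6, lv=0.3, dv=0.2, ternary=False)
print(ternary_row)
print(binary_row)
```

      Because the DV−HV regressor is identical across the matched pair, any influence it has that is unrelated to the physical presence of the distractor loads on that term, and only the additional influence on three-option trials loads on (DV−HV)T.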

      Reviewer #1 Comment 5

      Note that the pattern described above was different in Supplementary Figure 2, where the effect of DV-HV on the two-option trials was negative for both Multiplicative and Additive groups. I would suggest considering using Supplementary Figure 2 as the main result instead of Figure 5, as it does not rely on multiplicative EV to measure the distraction effect, and it shows the same direction of DV-HV effect on two-option trials, providing a better basis to interpret the (DV-HV)T effect.

      We thank the reviewer for the comments and suggestion. However, as mentioned in the response to Reviewer #1 Comment 4, the analysis adopted in the manuscript, which interprets only the (DV−HV)T term, aims to address the possibility that the (DV−HV) term may capture confounding effects due to covariation. Given that the debate we address specifically concerns the (DV−HV)T term, we elected to display Figure 5 within the main text and keep the results of the regression after replacing the utility function with the composite model as Supplementary Figure 5 (previously labelled as Supplementary Figure 2).

      Reviewer #2 (Public Review):

      This paper addresses the empirical demonstration of "distractor effects" in multi-attribute decision-making. It continues a debate in the literature on the presence (or not) of these effects, which domains they arise in, and their heterogeneity across subjects. The domain of the study is a particular type of multi-attribute decision-making: choices over risky lotteries. The paper reports a re-analysis of lottery data from multiple experiments run previously by the authors and other laboratories involved in the debate.

      Methodologically, the analysis assumes a number of simple forms for how attributes are aggregated (additively, multiplicatively, or both) and then applies a "reduced form" logistic regression to the choices with a number of interaction terms intended to control for various features of the choice set. One of these interactions, modulated by ternary/binary treatment, is interpreted as a "distractor effect."

      The claimed contribution of the re-analysis is to demonstrate a correlation in the strength/sign of this treatment effect with another estimated parameter: the relative mixture of additive/multiplicative preferences.

      We thank the reviewer for the positive response and are pleased that the reviewer found our report interesting.

      Reviewer #2 Comment 1

      Major Issues

      (1) How to Interpret GLM 1 and 2

      This paper, and others before it, have used a binary logistic regression with a number of interaction terms to attempt to control for various features of the choice set and how they influence choice. It is important to recognize that this modelling approach is not derived from a theoretical claim about the form of the computational model that guides decision-making in this task, nor an explicit test for a distractor effect. This can be seen most clearly in the equations after line 321 and its corresponding log-likelihood after 354, which contain no parameter or test for "distractor effects". Rather the computational model assumes a binary choice probability and then shoehorns the test for distractor effects via a binary/ternary treatment interaction in a separate regression (GLM 1 and 2). This approach has already led to multiple misinterpretations in the literature (see Cao & Tsetsos, 2022; Webb et al., 2020). One of these misinterpretations occurred in the datasets the authors studied, in which the lottery stimuli contained a confound with the interaction that Chau et al., (2014) were interpreting as a distractor effect (GLM 1). Cao & Tsetsos (2022) demonstrated that the interaction was significant in binary choice data from the study, therefore it can not be caused by a third alternative. This paper attempts to address this issue with a further interaction with the binary/ternary treatment (GLM 2). Therefore the difference in the interaction across the two conditions is claimed to now be the distractor effect. The validity of this claim brings us to what exactly is meant by a "distractor effect."

      The paper begins by noting that "Rationally, choices ought to be unaffected by distractors" (line 33). This is not true. There are many normative models that allow for the value of alternatives (even low-valued "distractors") to influence choices, including a simple random utility model. Since Luce (1959), it has been known that the axiom of "Independence of Irrelevant Alternatives" (that the probability ratio between any two alternatives does not depend on a third) is an extremely strong axiom, and only a sufficiency axiom for a random utility representation (Block and Marschak, 1959). It is not a necessary condition of a utility representation, and if this is our definition of rational (which is highly debatable), not necessary for it either. Countless empirical studies have demonstrated that IIA is falsified, and a large number of models can address it, including a simple random utility model with independent normal errors (i.e. a multivariate Probit model). In fact, it is only the multinomial Logit model that imposes IIA. It is also why so much attention is paid to the asymmetric dominance effect, which is a violation of a necessary condition for random utility (the Regularity axiom).

      So what do the authors even mean by a "distractor effect"? It is true that the form of IIA violations (i.e. their path through the probability simplex as the low-option varies) tells us something about the computational model underlying choice (after all, different models will predict different patterns). However, we do not know how the interaction terms in the binary logit regression relate to the pattern of the violations because there is no formal theory that relates them. Any test for relative value coding is a joint test of the computational model and the form of the stochastic component (Webb et al., 2020). These interaction terms may simply be picking up substitution patterns that can be easily reconciled with some form of random utility. While we cannot check all forms of random utility in these datasets (because the class of such models is large), this paper doesn't even rule any of these models out.

      We thank the reviewer for the comment. In this study, one objective is to address an issue raised by Cao and Tsetsos (2022), suggesting that the distractor effect claimed in the Chau et al. (2014) study was potentially confounded by unintended correlation introduced between the distractor and the chooseable options. They suggested that this could be tested by analyzing the control binary trials and the experimental ternary trials in a single model (i.e., GLM2) and introducing an interaction term (DV−HV)T. The interaction term can partial out any unintended confound and test the distractor effect that was present specifically in the experimental ternary trials. We adopted these procedures in our current studies and employed the interaction term to test the distractor effects. The results showed that overall there was no significant distractor effect in the group. We agree with the reviewer’s comment that if we were only analysing the ternary trials, a multinomial probit model would be suitable because it allows noise correlation between the choices. Alternatively, had a multinomial logistic model been applied, a Hausman-McFadden Test could be run to test whether the data violates the assumption of independence of irrelevant alternatives (IIA). However, in our case, a binomial model is preferred over a multinomial model because of: (1) the inclusion of the binary trials, and (2) the small number of trials in which the distractor was chosen (the median was 4% of all ternary trials).
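
      For readers less familiar with IIA, the following minimal sketch shows the property that a multinomial logit (softmax) rule imposes: the odds of choosing HV over LV do not change with the value of the third option. The inverse temperature `beta` and the option values are illustrative assumptions.

```python
import math


def softmax(values, beta=1.0):
    """Multinomial logit choice probabilities with inverse temperature beta."""
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def hv_lv_odds(hv, lv, dv, beta=5.0):
    """Odds of choosing HV over LV when a third option dv is also available."""
    p = softmax([hv, lv, dv], beta)
    return p[0] / p[1]


# The odds stay at exp(beta * (hv - lv)) regardless of the distractor value.
for dv in (0.1, 0.3, 0.5):
    print(f"dv={dv}: odds(HV:LV)={hv_lv_odds(0.6, 0.4, dv):.6f}")
```

      A model in which these odds shift with the distractor value, in either direction, therefore departs from IIA, which is what a test such as Hausman-McFadden is designed to detect.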

      However, another main objective of this study is to consider the possibility that the precise distractor effect may vary across individuals. This is exactly why we employed the composite model to estimate each individual’s decision making strategy and investigated how that strategy varied with the precise way the distractor influenced decision making.

      In addition, we think that the reviewer here is raising a profound point and one with which we are in sympathy; it is true that random noise utility models can predict deviations from the IIA axiom. Central to these approaches is the notion that the representations of the values of choice options are noisy. Thus, when the representation is accessed, it might have a certain value on average but this value might vary from occasion to occasion as if each sample were being drawn from a distribution. As a consequence, the value of a distractor that is “drawn” during a decision between two other options may be larger than the distractor’s average value and may even have a value that is larger than the value drawn from the less valuable choice option’s distribution on the current trial. On such a trial it may become especially clear that the better of the two options has a higher value than the alternative choice option. Our understanding is that Webb, Louie and colleagues (Louie et al., 2013; Webb et al., 2020) suggest an explanation approximately along these lines when they reported a negative distractor effect during some decisions, i.e., they follow the predictions of divisive normalization suggesting that decisions become more random as the distractor’s value is greater.

      An alternative approach, however, assumes that rather than noise in the representation of the option itself, there is noise in the comparison process when the two options are compared. This is exemplified in many influential decision making models including evidence accumulation models such as drift diffusion models (Shadlen & Shohamy, 2016) and recurrent neural network models of decision making (Wang, 2008). It is this latter type of model that we have used in our previous investigations (Chau et al., 2020; Kohl et al., 2023). However, these two approaches are linked both in their theoretical origin and in the predictions that they make in many situations (Shadlen & Shohamy, 2016). We therefore clarify that this is the case in the revised manuscript as follows:

      “In the current study and in previous work we have used or made reference to models of decision making that assume that a noisy process of choice comparison occurs such as recurrent neural networks and drift diffusion models (Shadlen & Shohamy, 2016; Wang, 2008). Under this approach, positive distractor effects are predicted when the comparison process becomes more accurate because of an impact on the noisy process of choice comparison (Chau et al., 2020; Kohl et al., 2023). However, it is worth noting that another class of models might assume that a choice representation itself is inherently noisy. According to this approach, on any given decision a sample is drawn from a distribution of value estimates in a noisy representation of the option. Thus, when the representation is accessed, it might have a certain value on average but this value might vary from occasion to occasion. As a consequence, the value of a distractor that is “drawn” during decision between two other options may be larger than the distractor’s average value and may even have a value that is larger than the value drawn from the less valuable choice option’s distribution on the current trial. On such a trial it may become especially clear that the better of the two options has a higher value than the alternative choice option. Louie and colleagues (Louie et al., 2013) suggest an explanation approximately along these lines when they reported a positive distractor effect during some decisions. Such different approaches share theoretical origins (Shadlen & Shohamy, 2016) and make related predictions about the impact of distractors on decision making.” (Lines 297-313)

      Reviewer #2 Comment 2

      (2) How to Interpret the Composite (Mixture) model?

      On the other side of the correlation are the results from the mixture model for how decision-makers aggregate attributes. The authors report that most subjects are best represented by a mixture of additive and multiplicative aggregation models. The authors justify this with the proposal that these values are computed in different brain regions and then aggregated (which is reasonable, though raises the question of "where" if not the mPFC). However, an equally reasonable interpretation is that the improved fit of the mixture model simply reflects a misspecification of two extreme aggregation processes (additive and EV), so the log-likelihood is maximized at some point in between them.

      One possibility is a model with utility curvature. How much of this result is just due to curvature in valuation? There are many reasonable theories for why we should expect curvature in utility for human subjects (for example, limited perception: Robson, 2001; Khaw, Li, & Woodford, 2019; Netzer et al., 2022) and of course many empirical demonstrations of risk aversion for small stakes lotteries. The mixture model, on the other hand, has parametric flexibility.

      There is also a large literature on testing expected utility jointly with stochastic choice, and the impact of these assumptions on parameter interpretation (Loomes & Sugden, 1998; Apesteguia & Ballester, 2018; Webb, 2019). This relates back to the point above: the mixture may reflect the joint assumption of how choice departs from deterministic EV.

      We thank the reviewer for the comment. They are indeed right to mention the vast literature on curvature in subjective valuation; however, it is important to stress that the predictions of the additive model with linear basis functions are quite distinct from the predictions of a multiplicative model with non-linear basis functions. We have tested the possibility that participants’ behaviour was better explained by the latter and we showed that this was not the case. Specifically, we have added and fitted an additional model with utility curvature based on prospect theory (Kahneman & Tversky, 1979) with the weighted probability function suggested by Prelec (1998):

      $$U_{PT} = w(X) \times \omega(P), \quad w(X) = X^{\alpha}, \quad \omega(P) = e^{-(-\ln P)^{\gamma}}$$

      where $X$ and $P$ represent the reward magnitude and probability (both rescaled to the interval between 0 and 1), respectively. $w(X)$ is the weighted magnitude and $\omega(P)$ is the weighted probability, while $\alpha$ and $\gamma$ are the corresponding distortion parameters. This prospect theory (PT) model is included along with the four previous models (please refer to Figure 3) in a Bayesian model comparison. Results indicate that the composite model remains the best account of participants’ choice behaviour (exceedance probability = 1.000, estimated model frequency = 0.720). We have now included these results in the main text and Supplementary Figure 2:
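
      A prospect-theory utility of this kind can be sketched as below. The functional forms are assumptions made for illustration: a power function for magnitude and the one-parameter Prelec (1998) weighting function for probability. The function names and parameter values are hypothetical and are not the fitted values from the manuscript.

```python
import math


def weighted_magnitude(x, alpha):
    """Power-function distortion of a magnitude rescaled to [0, 1] (assumed form)."""
    return x ** alpha


def prelec_weight(p, gamma):
    """One-parameter Prelec weighting, exp(-(-ln p)^gamma), for p in [0, 1]."""
    if p <= 0.0:
        return 0.0  # limit of the function as p approaches 0
    return math.exp(-((-math.log(p)) ** gamma))


def pt_utility(x, p, alpha, gamma):
    """Multiplicative utility computed on distorted magnitude and probability."""
    return weighted_magnitude(x, alpha) * prelec_weight(p, gamma)


# With alpha = gamma = 1 the model reduces to expected value (x * p).
print(pt_utility(0.5, 0.5, alpha=1.0, gamma=1.0))
# With gamma < 1, small probabilities are overweighted and large ones underweighted.
print(prelec_weight(0.05, gamma=0.6), prelec_weight(0.95, gamma=0.6))
```

      Setting both distortion parameters to 1 recovers the multiplicative (expected value) model, so the PT model nests it, which is what makes the comparison with the composite model informative.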

      “Supplementary Figure 2 reports an additional Bayesian model comparison performed while including a model with nonlinear utility functions based on Prospect Theory (Kahneman & Tversky, 1979) with the Prelec formula for probability (Prelec, 1998). Consistent with the above finding, the composite model provides the best account of participants’ choice behaviour (exceedance probability = 1.000, estimated model frequency = 0.720).” (Lines 193-198)

      Reviewer #2 Comment 3

      3) So then how should we interpret the correlation that the authors report?

      On one side we have the impact of the binary/ternary treatment which demonstrates some impact of the low value alternative on a binary choice probability. This may reflect some deep flaws in existing theories of choice, or it may simply reflect some departure from purely deterministic expected value maximization that existing theories can address. We have no theory to connect it to, so we cannot tell. On the other side of the correlation, we have a mixture between additive and multiplicative preferences over risk. This result may reflect two distinct neural processes at work, or it may simply reflect a misspecification of the manner in which humans perceive and aggregate attributes of a lottery (or even just the stimuli in this experiment) by these two extreme candidates (additive vs. EV). Again, this would entail some departure from purely deterministic expected value maximization that existing theories can address.

      It is entirely possible that the authors are reporting a result that points to the more exciting of these two possibilities. But it is also possible (and perhaps more likely) that the correlation is more mundane. The paper does not guide us to theories that predict such a correlation, nor reject any existing ones. In my opinion, we should be striving for theoretically-driven analyses of datasets, where the interpretation of results is clearer.

      We thank the reviewer for their clear comments. Based on our responses to the previous comments, it should be apparent that our results are consistent with several existing theories of choice, so we are not claiming that these theories are deeply flawed. Rather, our results reveal distinct neural processes (additive and multiplicative), and this does not reflect a misspecification in the modelling. We have revised our manuscript in the light of the reviewer’s comments in the hope of clarifying the theoretical background which informed both our data analysis and our data interpretation.

      First, we note that there are theoretical reasons to expect a third option might impact on choice valuation. There is a large body of work suggesting that a third option may have an impact on the values of two other options (indeed Reviewer #2 refers to some of this work in their Reviewer #2 Comment 1), but the body of theoretical work originates partly in neuroscience and not just in behavioural economics. In many sensory systems, neural activity changes with the intensity of the stimuli that are sensed. Divisive normalization in sensory systems, however, describes the way in which such neural responses are altered also as a function of other adjacent stimuli (Carandini & Heeger, 2012; Glimcher, 2022; Louie et al., 2011, 2013). The phenomenon has been observed at neural and behavioural levels as a function not just of the physical intensity of the other stimuli but as a function of their associated value (Glimcher, 2014, 2022; Louie et al., 2011, 2015; Noonan et al., 2017; Webb et al., 2020).
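
      As a minimal sketch of how divisive normalization yields a negative distractor effect, assume the commonly used form in which each option's value is divided by a constant plus the summed value of all options. The semisaturation constant `sigma`, the function names, and the example values are illustrative assumptions.

```python
def divisive_normalization(values, sigma=0.1):
    """Divide each value by sigma plus the summed value of all options."""
    total = sigma + sum(values)
    return [v / total for v in values]


def normalized_hv_lv_gap(hv, lv, dv, sigma=0.1):
    """Difference between normalized HV and LV values given a distractor dv."""
    n = divisive_normalization([hv, lv, dv], sigma)
    return n[0] - n[1]


# The normalized HV-LV gap shrinks as the distractor value grows, so any
# fixed-noise choice rule becomes less accurate (a negative distractor effect).
for dv in (0.1, 0.3, 0.5):
    print(f"dv={dv}: normalized gap={normalized_hv_lv_gap(0.6, 0.4, dv):.4f}")
```

      Under this form the gap equals (HV − LV) / (sigma + HV + LV + DV), which decreases monotonically in DV whenever HV exceeds LV.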

      Analogously there is an emerging body of work on the combinatorial processes that describe how multiple representational elements are integrated into new representations (Barron et al., 2013; Papageorgiou et al., 2017; Schwartenbeck et al., 2023). These studies have originated in neuroscience, just as was the case with divisive normalization, but they may have implications for understanding behaviour. For example, they might be linked to behavioural observations that the values assigned to bundles of goods are not necessarily the sum of the values of the individual goods (Hsee, 1998; List, 2002). One neuroscience fact that we know about such processes is that, at an anatomical level, they are linked to the medial frontal cortex (Barron et al., 2013; Fellows, 2006; Hunt et al., 2012; Papageorgiou et al., 2017; Schwartenbeck et al., 2023). A second neuroscientific fact that we know about medial frontal cortex is that it is linked to any positive effects that distractors might have on decision making (Chau et al., 2014; Noonan et al., 2017). Therefore, we might make use of these neuroscientific facts and theories to predict a correlation between positive distractor effects and non-additive mechanisms for determining the integrated value of multi-component choices. This is precisely what we did; we predicted the correlation on the basis of this body of work and when we tested to see if it was present, we found that indeed it was. It may be the case that other behavioural economics theories offer little explanation of the associations and correlations that we find. However, we emphasize that this association is predicted by neuroscientific theory and in the revised manuscript we have attempted to clarify this in the Introduction and Discussion sections:

      “Given the overlap in neuroanatomical bases underlying the different methods of value estimation and the types of distractor effects, we further explored the relationship. Critically, those who employed a more multiplicative style of integrating choice attributes also showed stronger positive distractor effects, whereas those who employed a more additive style showed negative distractor effects. These findings concur with neural data demonstrating that the medial prefrontal cortex (mPFC) computes the overall values of choices in ways that go beyond simply adding their components together, and is the neural site at which positive distractor effects emerge (Barron et al., 2013; Bongioanni et al., 2021; Chau et al., 2014; Fouragnan et al., 2019; Noonan et al., 2017; Papageorgiou et al., 2017), while divisive normalization was previously identified in the posterior parietal cortex (PPC) (Chau et al., 2014; Louie et al., 2011).” (Lines 109-119)

      “At the neuroanatomical level, the negative distractor effect is mediated by the PPC, where signal modulation described by divisive normalization has been previously identified (Chau et al., 2014; Louie et al., 2011). The same region is also crucial for perceptual decision making processes (Shadlen & Shohamy, 2016). The additive heuristics for combining choice attributes are closer to a perceptual evaluation because distances in this subjective value space correspond linearly to differences in physical attributes of the stimuli, whereas normative (multiplicative) value has a non-linear relation with them (cf. Figure 1c). It is well understood that many sensory mechanisms, such as in primates’ visual systems or fruit flies’ olfactory systems, are subject to divisive normalization (Carandini & Heeger, 2012). Hence, the additive heuristics that are more closely based on sensory mechanisms could also be subject to divisive normalization, leading to negative distractor effects in decision making.

      In contrast, the positive distractor effect is mediated by the mPFC (Chau et al., 2014; Fouragnan et al., 2019). Interestingly, the same or adjacent, interconnected mPFC regions have also been linked to the mechanisms by which representational elements are integrated into new representations (Barron et al., 2013; Klein-Flügge et al., 2022; Law et al., 2023; Papageorgiou et al., 2017; Schwartenbeck et al., 2023). In a number of situations, such as multi-attribute decision making, understanding social relations, and abstract knowledge, the mPFC achieves this by using a spatial map representation characterised by a grid-like response (Constantinescu et al., 2016; Bongioanni et al., 2021; Park et al., 2021) and disrupting mPFC leads to the evaluation of composite choice options as linear functions of their components (Bongioanni et al., 2021). These observations suggest a potential link between positive distractor effects and mechanisms for evaluating multiple component options and this is consistent with the across-participant correlation that we observed between the strength of the positive distractor effect and the strength of non-additive (i.e., multiplicative) evaluation of the composite stimuli we used in the current task. Hence, one direction for model development may involve incorporating the ideas that people vary in their ways of combining choice attributes and each way is susceptible to different types of distractor effect.” (Lines 250-274)

      Reviewer #2 Comment 4

      (4) Finally, the results from these experiments might not have external validity for two reasons. First, the normative criterion for multi-attribute decision-making differs depending on whether the attributes are lotteries or not (i.e. multiplicative vs additive). Whether it does so for humans is a matter of debate. Therefore if the result is unique to lotteries, it might not be robust for multi-attribute choice more generally. The paper largely glosses over this difference and mixes literature from both domains. Second, the lottery information was presented visually and there is literature suggesting this form of presentation might differ from numerical attributes. Which is more ecologically valid is also a matter of debate.

      We thank the reviewer for the comment. Indeed, they are right that the correlation we find between value estimation style and distractor effects may not be detected in all contexts of human behaviour. What the reviewer suggests goes along the same lines as our response to Reviewer #1 Comment 3: multi-attribute value estimation may have different structures. In some cases, the optimal solution may require a non-linear (e.g., multiplicative) response, as in probabilistic or delayed decisions, but in other cases (e.g., when estimating the value of a snack based on its taste, size, healthiness, price) a linear integration would suffice. In the latter kind of scenario, both the optimal and the heuristic solutions may be additive and people’s value estimation “style” may not be teased apart. However, if different neural mechanisms associated with different estimation processes are observed in certain scenarios, it suggests that these mechanisms are always present, even in scenarios where they do not alter the predictions. Probabilistic decision-making is also pervasive in many aspects of daily life and not just limited to the case of lotteries.

      While behaviour has been found to differ depending on whether lottery information is presented graphically or numerically, there is insufficient evidence to suggest biases towards additive or multiplicative evaluation, or towards positive or negative distractor effects. As such, we may expect that the correlation that we reveal in this paper, grounded in distinct neural mechanisms, would still hold even under different circumstances.

      Taking previous literature as examples, similar patterns of behaviour have been observed in humans when making decisions during trinary choice tasks. In studies conducted by Louie and colleagues (Louie et al., 2013; Webb et al., 2020), human participants performed a snack choice task where their behaviour could be modelled by divisive normalization with a biphasic response (i.e., both positive and negative distractor effects). While these two studies only use a single numerical value of price for behavioural modelling, these prices should originate from an internal computation of various attributes related to each snack that are not purely related to lotteries. Expanding towards the social domain, studies of trinary decision making have considered face attractiveness and averageness (Furl, 2016), desirability of hiring (Chang et al., 2019), as well as desirability of candidates during voting (Chang et al., 2019). These choices involve considering various attributes unrelated to lotteries or numbers and yet still display a combination of positive distractor and negative distractor (i.e. divisive normalization) effects, as in the current study. In particular, the experiments carried out by Chang and colleagues (Chang et al., 2019) involved decisions in a social context that resemble real-world situations. These findings suggest that both types of distractor effects can co-exist in other value-based decision making tasks (Li et al., 2018; Louie et al., 2013) as well as decision making tasks in social contexts (Chang et al., 2019; Furl, 2016).

      Reviewer #2 Comment 5

      Minor Issues:

      The definition of EV as a normative choice baseline is problematic. The analysis requires that EV is the normative choice model (this is why the HV-LV gap is analyzed and the distractor effect defined in relation to it). But if the binary/ternary interaction effect can be accounted for by curvature of a value function, this should also change the definition of which lottery is HV or LV for that subject!

      We thank the reviewer for the comment. While the initial part of the paper discussed results that were defined by the EV model, the results shown in Supplementary Figure 2 were generated by replacing the utility function based on values obtained by using the composite model. Here, we have also redefined HV and LV for each subject according to the updated values generated by the composite model prior to the regression.

      References

      Apesteguia, J. & Ballester, M. Monotone stochastic choice models: The case of risk and time preferences. Journal of Political Economy (2018).

      Block, H. D. & Marschak, J. Random Orderings and Stochastic Theories of Responses. Cowles Foundation Discussion Papers (1959).

      Khaw, M. W., Li, Z. & Woodford, M. Cognitive Imprecision and Small-Stakes Risk Aversion. Rev. Econ. Stud. 88, 1979-2013 (2020).

      Loomes, G. & Sugden, R. Testing Different Stochastic Specifications of Risky Choice. Economica 65, 581-598 (1998).

      Luce, R. D. Individual Choice Behaviour. (John Wiley and Sons, Inc., 1959).

      Netzer, N., Robson, A. J., Steiner, J. & Kocourek, P. Endogenous Risk Attitudes. SSRN Electron. J. (2022) doi:10.2139/ssrn.4024773.

      Robson, A. J. Why would nature give individuals utility functions? Journal of Political Economy 109, 900-914 (2001).

      Webb, R. The (Neural) Dynamics of Stochastic Choice. Manage Sci 65, 230-255 (2019).

      Reviewer #3 (Public Review):

      Summary:

      The way an unavailable (distractor) alternative impacts decision quality is of great theoretical importance. Previous work, led by some of the authors of this study, had converged on a nuanced conclusion wherein the distractor can both improve (positive distractor effect) and reduce (negative distractor effect) decision quality, contingent upon the difficulty of the decision problem. In very recent work, Cao and Tsetsos (2022) reanalyzed all relevant previous datasets and showed that once distractor trials are referenced to binary trials (in which the distractor alternative is not shown to participants), distractor effects are absent. Cao and Tsetsos further showed that human participants heavily relied on additive (and not multiplicative) integration of rewards and probabilities.

      The present study by Wong et al. puts forward a novel thesis according to which interindividual differences in the way of combining reward attributes underlie the absence of detectable distractor effect at the group level. They re-analysed the 144 human participants and classified participants into a "multiplicative integration" group and an "additive integration" group based on a model parameter, the "integration coefficient", that interpolates between the multiplicative utility and the additive utility in a mixture model. They report that participants in the "multiplicative" group show a negative distractor effect while participants in the "additive" group show a positive distractor effect. These findings are extensively discussed in relation to the potential underlying neural mechanisms.

      Strengths:

      - The study is forward-looking, integrating previous findings well, and offering a novel proposal on how different integration strategies can lead to different choice biases.

      - The authors did an excellent job of connecting their thesis with previous neural findings. This is a very encompassing perspective that is likely to motivate new studies towards a better understanding of how humans and other animals integrate information in decisions under risk and uncertainty.

      - Although some aspects of the paper are very technical, methodological details are well explained and the paper is very well written.

      We thank the reviewer for the positive response and are pleased that the reviewer found our report interesting.

      Reviewer #3 Comment 1

      Weaknesses:

      The authors quantify the distractor variable as "DV - HV", i.e., the relative distractor variable. Do the conclusions hold when the distractor is quantified in absolute terms (as "DV", see also Cao & Tsetsos, 2023)? Similarly, the authors show in Suppl. Figure 1 that the inclusion of a HV + LV regressor does not alter their conclusions. However, the (HV + LV)*T regressor was not included in this analysis. Does including this interaction term alter the conclusions considering there is a high correlation between (HV + LV)*T and (DV - HV)*T? More generally, it will be valuable if the authors assess and discuss the robustness of their findings across different ways of quantifying the distractor effect.

      We thank the reviewer for the comment. In the original manuscript we had already demonstrated that the distractor effect was related to the integration coefficient using a number of complementary analyses. They include Figure 5 based on GLM2, Supplementary Figure 3 based on GLM3 (i.e., adding the HV+LV term to GLM2), and Supplementary Figure 4 based on GLM2 but applying the utility estimate from the composite model instead of expected value (EV). These three sets of analyses produced comparable results. The reason why we elected not to include the (HV+LV)T term in GLM3 (Supplementary Figure 3) was due to the collinearity between the regressors in the GLM. If this term is included in GLM3, the variance inflation factor (VIF) would exceed an acceptable level of 4 for some regressors. In particular, the VIF for the (HV+LV) and (HV+LV)T regressors is 5.420, while the VIF for (DV−HV) and (DV−HV)T is 4.723.

      Here, however, we consider the additional analysis suggested by the reviewer and test whether similar results are obtained. We constructed GLM4 including the (HV+LV)T term but replacing the relative distractor value (DV-HV) with the absolute distractor value (DV) in the main term and its interactions, as follows:

      GLM4:

      A significant negative (DV)T effect was found for the additive group [t(72)=−2.0253, p=0.0465] while the multiplicative group had a positive trend despite not reaching significance. Between the two groups, the (DV)T term was significantly different [t(142)=2.0434, p=0.0429]. While these findings suggest that the current conclusions could be partially replicated, simply replacing the relative distractor value with the absolute value in the previous analyses resulted in non-significant findings. Taking these results together with the main findings, it is possible to conclude that the positive distractor effect is better captured using the relative DV-HV term rather than the absolute DV term. This would be consistent with the way in which option values are envisaged to interact with one another in the mutual inhibition model (Chau et al., 2014, 2020) that generates the positive distractor effect. The model suggests that evidence is accumulated as the difference between the excitatory input from the option (e.g. the HV option) and the pooled inhibition contributed partly by the distractor. We have now included these results in the manuscript:

      “Finally, we performed three additional analyses that revealed comparable results to those shown in Figure 5. In the first analysis, reported in Supplementary Figure 3, we added an HV+LV term to the GLM, because this term was included in some analyses of a previous study that used the same dataset (Chau et al., 2020). In the second analysis, we added an (HV+LV)T term to the GLM. We noticed that this change led to inflation of the collinearity between the regressors and so we also replaced the (DV−HV) term by the DV term to mitigate the collinearity (Supplementary Figure 4). In the third analysis, reported in Supplementary Figure 5, we replaced the utility terms of GLM2. Since the above analyses involved using HV, LV, and DV values defined by the normative Expected Value model, here, we re-defined the values using the composite model prior to applying GLM2. Overall, in the Multiplicative Group a significant positive distractor effect was found in Supplementary Figures 3 and 4. In the Additive Group a significant negative distractor effect was found in Supplementary Figures 3 and 5. Crucially, all three analyses consistently showed that the distractor effects were significantly different between the Multiplicative Group and the Additive Group.” (Lines 225-237)

      Reviewer #3 Comment 2

      The central finding of this study is that participants who integrate reward attributes multiplicatively show a positive distractor effect while participants who integrate additively show a negative distractor effect. This is a very interesting and intriguing observation. However, there is no explanation as to why the integration strategy covaries with the direction of the distractor effect. It is unlikely that the mixture model generates any distractor effect as it combines two "context-independent" models (additive utility and expected value) and is fit to the binary-choice trials. The authors can verify this point by quantifying the distractor effect in the mixture model. If that is the case, it will be important to highlight that the composite model is not explanatory; and defer a mechanistic explanation of this covariation pattern to future studies.

      We thank the reviewer for the comment. Indeed, the main purpose of applying the mixture model was to identify the way each participant combined attributes and, as the reviewer pointed out, the mixture model per se is context independent. While we acknowledge that the mixture model is not a mechanistic explanation, there is a theoretical basis for the observation that these two factors are linked.

      Firstly, studies that have examined the processes involved when humans combine and integrate different elements to form new representations (Barron et al., 2013; Papageorgiou et al., 2017; Schwartenbeck et al., 2023) have implicated the medial frontal cortex as a crucial region (Barron et al., 2013; Fellows, 2006; Hunt et al., 2012; Papageorgiou et al., 2017; Schwartenbeck et al., 2023). Meanwhile, previous studies have also identified that positive distractor effects are linked to the medial frontal cortex (Chau et al., 2014; Noonan et al., 2017). Therefore, the current study utilized these two facts to establish the basis for a correlation between positive distractor effects and non-additive mechanisms for determining the integrated value of multi-component choices. Nevertheless, we agree with the reviewer that it will be an important future direction to look at how the covariation pattern emerges in a computational model. We have revised the manuscript in an attempt to address this issue.

      “At the neuroanatomical level, the negative distractor effect is mediated by the PPC, where signal modulation described by divisive normalization has been previously identified (Chau et al., 2014; Louie et al., 2011). The same region is also crucial for perceptual decision making processes (Shadlen & Shohamy, 2016). The additive heuristics for combining choice attributes are closer to a perceptual evaluation because distances in this subjective value space correspond linearly to differences in physical attributes of the stimuli, whereas normative (multiplicative) value has a non-linear relation with them (cf. Figure 1c). It is well understood that many sensory mechanisms, such as in primates’ visual systems or fruit flies’ olfactory systems, are subject to divisive normalization (Carandini & Heeger, 2012). Hence, the additive heuristics that are more closely based on sensory mechanisms could also be subject to divisive normalization, leading to negative distractor effects in decision making.

      In contrast, the positive distractor effect is mediated by the mPFC (Chau et al., 2014; Fouragnan et al., 2019). Interestingly, the same or adjacent, interconnected mPFC regions have also been linked to the mechanisms by which representational elements are integrated into new representations (Barron et al., 2013; Klein-Flügge et al., 2022; Law et al., 2023; Papageorgiou et al., 2017; Schwartenbeck et al., 2023). In a number of situations, such as multi-attribute decision making, understanding social relations, and abstract knowledge, the mPFC achieves this by using a spatial map representation characterised by a grid-like response (Constantinescu et al., 2016; Bongioanni et al., 2021; Park et al., 2021) and disrupting mPFC leads to the evaluation of composite choice options as linear functions of their components (Bongioanni et al., 2021). These observations suggest a potential link between positive distractor effects and mechanisms for evaluating multiple component options and this is consistent with the across-participant correlation that we observed between the strength of the positive distractor effect and the strength of non-additive (i.e., multiplicative) evaluation of the composite stimuli we used in the current task. Hence, one direction for model development may involve incorporating the ideas that people vary in their ways of combining choice attributes and each way is susceptible to different types of distractor effect.” (Lines 250-274)

      Reviewer #3 Comment 3

      -  Correction for multiple comparisons (e.g., Bonferroni-Holm) was not applied to the regression results. Is the "negative distractor effect in the Additive Group" (Fig. 5c) still significant after such correction? Although this does not affect the stark difference between the distractor effects in the two groups (Fig. 5a), the classification of the distractor effect in each group is important (i.e., should future modelling work try to capture both a negative and a positive effect in the two integration groups? Or just a null and a positive effect?).

      We thank the reviewer for the comment. We have performed Bonferroni-Holm correction and as the reviewer surmised, the negative distractor effect in the additive group becomes non-significant. However, we have to emphasize that our major claim is that there was a covariation between decision strategy (of combining attributes) and distractor effect (as seen in Figure 4). That analysis does not imply multiple comparisons. The analysis in Figure 5 that splits participants into two groups was mainly designed to illustrate the effects for an easier understanding by a more general audience. In many cases, the precise ways in which participants are divided into subgroups can have a major impact on whether each individual group’s effects are significant or not. It may be possible to identify an optimal way of grouping, but we refrained from taking such a trial-and-error approach, especially for the analysis in Figure 5 that simply supplements the point made in Figure 4. The key notion we would like the readers to take away is that there is a spectrum of distractor effects (ranging from negative to positive) that will vary depending on how the choice attributes were integrated.
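
      For readers unfamiliar with the procedure, the Bonferroni-Holm step-down correction mentioned above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name and the example p-values are hypothetical and are not taken from the manuscript.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null is rejected after Holm correction."""
    m = len(p_values)
    # Visit the tests from the smallest to the largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is compared with alpha / (m - k + 1).
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            # Step-down rule: once one test fails, all larger p-values fail too.
            break
    return reject


# Hypothetical p-values for three regressors (illustration only).
example = holm_bonferroni([0.01, 0.04, 0.03])
```

      In this made-up example only the smallest p-value survives the correction, even though all three are below 0.05 when taken individually.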

      Reviewer #1 (Recommendations For The Authors):

      Reviewer #1 Recommendations 1

      Enhancements are necessary for the quality of the scientific writing. Several sentences have been written in a negligent manner and warrant revision to ensure a higher level of rigor. Moreover, a number of sentences lack appropriate citations, including but not restricted to:

      - Line 39-41.

      - Line 349-350 (also please clarify what it means by "parameter estimate" is very accurate: correlation?).

      We thank the reviewer for the comment. We have made revisions to various parts of the manuscript to address the reviewer’s concerns.

      “Intriguingly, most investigations have considered the interaction between distractors and chooseable options either at the level of their overall utility or at the level of their component attributes, but not both (Chau et al., 2014, 2020; Gluth et al., 2018).” (Lines 40-42)

      “Additional simulations have shown that the fitted parameters can be recovered with high accuracy (i.e., with a high correlation between generative and recovered parameters).” (Lines 414-416)

      Reviewer #1 Recommendations 2

      Some other minor suggestions:

      - Correlative vs. Causality: the manuscript exhibits a lack of attentiveness in drawing causal conclusions from correlative evidence (manuscript title, Line 91, Line 153-155).

      - When displaying effect size on accuracy, there is no need to show the significance of intercept (Figure 2,5, & supplementary figures).

      - Adding some figure titles on Figure 2 so it is clear what each panel stands for.

      - In Figure 3, the dots falling on zero values are not easily seen. Maybe increasing the dot size a little?

      - Line 298: binomial linking function (instead of binomial distribution).

      - Line 100: composite, not compositive.

      - Line 138-139: please improve the sentence, if it's consistent with previous findings, what's the point of "surprisingly"?

      We thank the reviewer for the suggestions. We have made revisions to the title and various parts of the manuscript to address the reviewer’s concerns.

      - Correlative vs. Causality: the manuscript exhibits a lack of attentiveness in drawing causal conclusions from correlative evidence (manuscript title, Line 91, Line 153-155).

      We have now revised the manuscript:

      “Distractor effects in decision making are related to the individual’s style of integrating choice attributes” (title of the manuscript)

      “More particularly, we consider whether individual differences in combination styles could be related to different forms of distractor effect.” (Lines 99-100)

      “While these results may seem to suggest that a distractor effect was not present at an overall group level, we argue that the precise way in which a distractor affects decision making is related to how individuals integrate the attributes.” (Lines 164-167)

      - When displaying effect size on accuracy, there is no need to show the significance of intercept (Figure 2,5, & supplementary figures).

      We have also modified all Figures to remove the intercept.

      - Adding some figure titles on Figure 2 so it is clear what each panel stands for.

      We have added titles accordingly.

      - In Figure 3, the dots falling on zero values are not easily seen. Maybe increasing the dot size a little?

      In conjunction with addressing Reviewer #3 Recommendation 6, we have adapted the violin plots into histograms for a better representation of the values.

      - Line 298: binomial linking function (instead of binomial distribution).

      - Line 100: composite, not compositive.

      - Line 138-139: please improve the sentence, if it's consistent with previous findings, what's the point of "surprisingly"?

      We have made revisions accordingly.

      Reviewer #2 (Recommendations For The Authors):

      Reviewer #2 Recommendations 1

      Line 294. The definition of DV, HV, LV is not sufficient. Presumably, these are the U from the following sections? Or just EV? But this is not explicitly stated, rather they are vaguely referred to as "values." The computational modelling section refers to them as utilities. Are these the same thing?

      We thank the reviewer for the suggestion. We have clarified the exact method for calculating each of the values and updated the section accordingly.

      “where HV, LV, and DV refer to the values of the chooseable higher value option, chooseable lower value option, and distractor, respectively. Here, values (except those in Supplementary Figure 5) are defined as Expected Value (EV), calculated by multiplying magnitude and probability of reward.” (Lines 348-350)

      Reviewer #2 Recommendations 2

      The analysis drops trials in which the distractor was chosen. These trials are informative about the presence (or not) of relative valuation or other factors because they make such choices more (or less) likely. Ignoring them is another example of the analysis being misspecified.

      We thank the reviewer for the suggestion and this is related to Major Issue 1 raised by the same reviewer. In brief, we adopted the same methods implemented by Cao and Tsetsos (Cao and Tsetsos, 2022) and that constrained us to applying a binomial model. Please refer to our reply to Major Issue 1 for more details.

      Reviewer #2 Recommendations 3

      Some questions and suggestions on statistics and computational modeling:

      Have the authors looked at potential collinearity between the regressors in each of the GLMs?

      We thank the reviewer for the comment. For each of the following GLMs, the average variance inflation factor (VIF) has been calculated as follows:

      GLM2 using the Expected Value model:

      Author response table 1.

      GLM2 after replacing the utility function based on the normative Expected Value model with values obtained by using the composite model:

      Author response table 2.

      GLM3:

      Author response table 3.

      As indicated by the average VIF values calculated, none of them exceeds 4, suggesting that the estimated coefficients were not inflated due to collinearity between the regressors in each of the GLMs.
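
      As a rough illustration of the diagnostic reported above, the sketch below computes a variance inflation factor for each regressor by regressing it on the remaining regressors, VIF = 1 / (1 − R²). The function name and the simulated regressors are assumptions made for this example; they are not the regressors or the code used in the study.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return the VIF of each column of X (columns are regressors, no intercept column)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = np.empty(k)
    for j in range(k):
        y = X[:, j]
        # Regress column j on all other columns plus an intercept.
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r_squared = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        vifs[j] = 1.0 / (1.0 - r_squared)
    return vifs


# Simulated regressors (illustration only): c is built to be collinear with a.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + 0.3 * rng.normal(size=500)
vifs = variance_inflation_factors(np.column_stack([a, b, c]))
```

      Here the independent regressor `b` has a VIF close to 1, whereas `a` and `c` exceed the threshold of 4 used in the response.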

      Reviewer #2 Recommendations 4

      - Correlation results in Figure 4. What is the regression line displayed on this plot? I suspect the regression line came from Pearson's correlation, which would be inconsistent with the Spearman's correlation reported in the text. A reasonable way would be to transform both x and y axes to the ranked data. However, I wonder why it makes sense to use ranked data for testing the correlation in this case. Those are both scalar values. Also, did the authors assess the influence of the zero integration coefficient on the correlation result? Importantly, did the authors redo the correlation plot after defining the utility function by the composite models?

      We thank the reviewer for the suggestion. The plotted line in Figure 4 was based on the Pearson’s correlation and we have modified the text to report the Pearson’s correlation result as well.

      If we were to exclude the 32 participants with integration coefficients smaller than 1×10⁻⁶ from the analysis, we still observe a significant positive Pearson’s correlation [r(110)=0.202, p=0.0330].

      Author response image 1.

      Figure 4 after excluding 32 participants with integration coefficients smaller than 1×10⁻⁶.

      “As such, we proceeded to explore how the distractor effect (i.e., the effect of (DV−HV)T obtained from GLM2; Figure 2c) was related to the integration coefficient (η) of the optimal model via a Pearson’s correlation (Figure 4). As expected, a significant positive correlation was observed [r(142)=0.282, p=0.000631]. We noticed that there were 32 participants with integration coefficients that were close to zero (below 1×10⁻⁶). The correlation remained significant even after removing these participants [r(110)=0.202, p=0.0330].” (Lines 207-212)

      The last question relates to results already included in Supplementary Figure 5, in which the analyses were conducted using the utility function of the composite model. We notice that although there was a difference in integration coefficient between the multiplicative and additive groups, a correlational analysis did not generate significant results [r(142)=0.124, p=0.138]. It is possible that the relationship became less linear after applying the composite model utility function. However, it is noticeable that in a series of complementary analyses (Figure 5: r(142)=0.282, p=0.000631; Supplementary Figure 3: r(142)=0.278, p=0.000746) comparable results were obtained.

      Reviewer #2 Recommendations 5

      - From lines 163-165, were the models tested on only the three-option trials or both two- and three-option trials? It is ambiguous from the description here. It might be worth checking the model comparison based on different trial types, and the current model fitting results do not give an absolute sense of the goodness of fit. I would suggest including the correctly predicted trial proportions in each trial type from different models.

      We thank the reviewer for the suggestion. We have only modeled the two-option trials, and the key reason is that the two-option trials can arguably provide a better estimate of participants’ style of integrating attributes as they are independent of any distractor effects. This was also the reason why Cao and Tsetsos applied the same approach when they were re-analyzing our data (Cao and Tsetsos, 2022). We have clarified the statement accordingly.

      “We fitted these models exclusively to the Two-Option Trial data and not the Distractor Trial data, such that the fitting (especially that of the integration coefficient) was independent of any distractor effects, and tested which model best describes participants’ choice behaviours.” (Lines 175-178)

      Reviewer #2 Recommendations 6

      - Along with displaying the marginal distributions of each parameter estimate, a correlation plot of these model parameters might be useful, given that some model parameters are multiplied in the value functions.

      We thank the reviewer for the suggestion. We have also generated the correlation plot of the model parameters. The Pearson’s correlation between the magnitude/probability weighting and integration coefficient was significant [r(142)=−0.259, p=0.00170]. The Pearson’s correlation between the inverse temperature and integration coefficient was not significant [r(142)=−0.0301, p=0.721]. The Pearson’s correlation between the inverse temperature and magnitude/probability weighting was not significant [r(142)=−0.0715, p=0.394].
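The p-values quoted for these correlations can be checked from the correlation coefficient and its degrees of freedom alone. The sketch below is a minimal illustration, not the authors' analysis code: it assumes a standard two-sided t-test for Pearson's r with df = n - 2, and the function names (`t_density`, `pearson_p_value`) are hypothetical. It integrates the t density numerically with Simpson's rule so that only the standard library is needed.

```python
import math


def t_density(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_coeff = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_coeff - (df + 1) / 2 * math.log1p(x * x / df))


def pearson_p_value(r, df, steps=2000):
    """Two-sided p-value for Pearson's r, assuming df = n - 2 (steps must be even)."""
    if abs(r) >= 1.0:
        return 0.0
    # Convert r to a t statistic.
    t = abs(r) * math.sqrt(df / (1.0 - r * r))
    # Simpson's rule for the area under the t density between 0 and |t|.
    h = t / steps
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_area = total * h / 3.0
    # Both tails together are what is left outside [-|t|, |t|].
    return 1.0 - 2.0 * central_area


if __name__ == "__main__":
    # Coefficients as reported in the response above, all with df = 142.
    for r in (-0.259, -0.0301, -0.0715):
        print(f"r({142}) = {r}: p = {pearson_p_value(r, 142):.4f}")
```

Because the reported coefficients are rounded to three significant figures, the recomputed p-values should land close to, but not necessarily exactly on, the reported ones.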

      “Our finding that the average integration coefficient was 0.325 coincides with previous evidence that people were biased towards using an additive, rather than a multiplicative rule. However, it also shows that, rather than being fully additive (integration coefficient = 0) or multiplicative (integration coefficient = 1), people’s choice behaviour is best described as a mixture of both. Supplementary Figure 1 shows the relationships between all the fitted parameters.” (Lines 189-193)

      Reviewer #2 Recommendations 7

      Have the authors tried any functional transformations on amounts or probabilities before applying the weighted sum? The two attributes are on entirely different scales and thus may not be directly summed together.

      We thank the reviewer for the comment. Amounts and probabilities were indeed both rescaled to the 0-1 interval before being summed, as explained in the methods (Line XXX). Additionally, we have now added and performed model fitting on an additional model with utility curvature based on the prospect theory (Kahneman & Tversky, 1979) and a weighted probability function (Prelec, 1998).

      In this model, the reward magnitude and probability (both rescaled to the interval between 0 and 1) are transformed into a weighted magnitude and a weighted probability, each with its own distortion parameter. This prospect theory (PT) model was included along with the four previous models (please refer to Figure 3) in a Bayesian model comparison. Results indicate that the composite model remains the best account of participants’ choice behaviour (exceedance probability = 1.000, estimated model frequency = 0.720).

      “Supplementary Figure 2 reports an additional Bayesian model comparison performed while including a model with nonlinear utility functions based on Prospect Theory (Kahneman & Tversky, 1979) with the Prelec formula for probability (Prelec, 1998). Consistent with the above finding, the composite model provides the best account of participants’ choice behaviour (exceedance probability = 1.000, estimated model frequency = 0.720).” (Lines 193-198)
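For readers unfamiliar with this class of model, the following is a minimal sketch of one common way to combine utility curvature with Prelec probability weighting. It is an illustration under stated assumptions, not the authors' fitted model: the power-law utility, the one-parameter Prelec form, the multiplicative combination, and all names (`weighted_magnitude`, `prelec_weight`, `pt_value`, `alpha`, `gamma`) are assumptions introduced here.

```python
import math


def weighted_magnitude(magnitude, alpha):
    """Power-law utility curvature (assumed form); magnitude is rescaled to [0, 1]."""
    return magnitude ** alpha


def prelec_weight(probability, gamma):
    """One-parameter Prelec weighting (assumed form): w(p) = exp(-(-ln p)^gamma)."""
    if probability <= 0.0:
        return 0.0
    if probability >= 1.0:
        return 1.0
    return math.exp(-((-math.log(probability)) ** gamma))


def pt_value(magnitude, probability, alpha, gamma):
    """Subjective value as weighted magnitude times weighted probability (assumed)."""
    return weighted_magnitude(magnitude, alpha) * prelec_weight(probability, gamma)
```

With `alpha = 1` and `gamma = 1` both distortions vanish and the value reduces to magnitude times probability, which makes the sketch easy to sanity-check against a plain expected-value rule.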

      Reviewer #3 (Recommendations For The Authors):

      Reviewer #3 Recommendations 1

      - In the Introduction (around line 48), the authors make the case that distractor effects can co-exist in different parts of the decision space, citing Chau et al. (2020). However, if the distractor effect is calculated relative to the binary baseline this is no longer the case.

      - Relating to the above point, it might be useful for the authors to make a distinction between effects being non-monotonic across the decision space (within individuals) and effects varying across individuals due to different strategies adopted. These two scenarios are conceptually distinct.

      We thank the reviewer for the comment. Indeed, the ideas that distractor effects may vary across decision space and across different individuals are slightly different concepts. We have now revised the manuscript to clarify this:

      “However, as has been argued in other contexts, just because one type of distractor effect is present does not preclude another type from existing (Chau et al., 2020; Kohl et al., 2023). Each type of distractor effect can dominate depending on the dynamics between the distractor and the chooseable options. Moreover, the fact that people have diverse ways of making decisions is often overlooked. Therefore, not only may the type of distractor effect that predominates vary as a function of the relative position of the options in the decision space, but also as a function of each individual’s style of decision making.” (Lines 48-54)

      Reviewer #3 Recommendations 2

      - The idea of mixture models/strategies has strong backing from other Cognitive Science domains and will appeal to most readers. It would be very valuable if the authors could further discuss the potential level at which their composite model might operate. Are the additive and EV quantities computed and weighted (as per the integration coefficient) within a trial giving rise to a composite decision variable? Or does the integration coefficient reflect a probabilistic (perhaps competitive) selection of one strategy on a given trial? Perhaps extant neural data can shed light on this question.

      We thank the reviewer for the comment. The idea is related to whether the observed mixture in integration models derives from value being actually computed in a mixed way within each trial, or each trial involves a probabilistic selection between the additive and multiplicative strategies. We agree that this is an interesting question and to address it would require the use of some independent continuous measures to estimate the subjective values in quantitative terms (instead of using the categorical choice data). This could be done by collecting pupil size data or functional magnetic resonance imaging data, as the reviewer has pointed out. Although the empirical work is beyond the scope of the current behavioural study, it is worth bringing up this point in the Discussion:

      “The current finding involves the use of a composite model that arbitrates between the additive and multiplicative strategies. A general question for such composite models is whether people mix two strategies in a consistent manner on every trial or whether there is some form of probabilistic selection occurring between the two strategies on each trial such that only one strategy is used on any given trial while, on average, one strategy is more probable than the other. To test which is the case requires an independent estimation of subjective values in quantitative terms, such as by pupillometry or functional neuroimaging. Further understanding of this problem will also provide important insight into the precise way in which distractor effects operate at the single-trial level.” (Lines 275-282)

      Reviewer #3 Recommendations 3

      Line 80 "compare pairs of attributes separately, without integration". This additive rule (or the within-attribute comparison) implies integration, it is just not multiplicative integration.

      We thank the reviewer for the comment. We have made adjustments to the manuscript to ensure that the message delivered within this manuscript is consistent.

      “For clarity, we stress that the same mathematical formula for additive value can be interpreted as meaning that 1) subjects first estimate the value of each option in an additive way (value integration) and then compare the options, or 2) subjects compare the two magnitudes and separately compare the two probabilities without integrating dimensions into overall values. On the other hand, the mathematical formula for multiplicative value is only compatible with the first interpretation. In this paper we focus on attribute combination styles (multiplicative vs additive) and do not make claims on the order of the operations. More particularly, we consider whether individual differences in combination styles could be related to different forms of distractor effect.” (Lines 92-100)

      Reviewer #3 Recommendations 4

      - Not clear why the header in line 122 is phrased as a question.

      We thank the reviewer for the suggestion. We have modified the header to the following:

      “The distractor effect was absent on average” (Line 129)

      Reviewer #3 Recommendations 5

      - The discussion and integration of key neural findings with the current thesis are outstanding. It might help the readers if certain statements such as "the distractor effect is mediated by the PPC" (line 229) were further unpacked.

      We thank the reviewer for the suggestion. We have made modifications to the original passage to further elaborate the statement.

      “At the neuroanatomical level, the negative distractor effect is mediated by the PPC, where signal modulation described by divisive normalization has been previously identified (Chau et al., 2014; Louie et al., 2011). The same region is also crucial for perceptual decision making processes (Shadlen & Shohamy, 2016).” (Lines 250-253)

      Reviewer #3 Recommendations 6

      - In Fig. 3c, there seem to be many participants having the integration coefficient close to 0 but the present violin plot doesn't seem to best reflect this highly skewed distribution. A histogram would be perhaps better here.

      We thank the reviewer for the suggestion. We have modified the descriptive plots to use histograms instead of violin plots.

      “Figures 3c, d and e show the fitted parameters of the composite model: the integration coefficient, which determines the relative weighting of the additive and multiplicative value; the magnitude/probability weighting ratio; and the inverse temperature. Our finding that the average integration coefficient was 0.325 coincides with previous evidence that people were biased towards using an additive, rather than a multiplicative rule.” (Lines 186-191)
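The roles of the three composite-model parameters described above can be sketched in a few lines. This is a hedged illustration only: the text states that an integration coefficient of 0 is fully additive and 1 is fully multiplicative, but the exact additive and multiplicative formulas, the logistic choice rule, and the names used here (`composite_value`, `choice_probability`, `integration`, `weight`, `inverse_temperature`) are assumptions, not the authors' implementation.

```python
import math


def composite_value(magnitude, probability, integration, weight):
    """Mix an additive and a multiplicative value of one option.

    magnitude, probability: attributes rescaled to [0, 1]
    integration: 0 = fully additive, 1 = fully multiplicative
    weight: relative weighting of magnitude versus probability (assumed form)
    """
    additive = weight * magnitude + (1.0 - weight) * probability
    multiplicative = magnitude * probability
    return integration * multiplicative + (1.0 - integration) * additive


def choice_probability(value_a, value_b, inverse_temperature):
    """Logistic (softmax) probability of choosing option A over option B."""
    return 1.0 / (1.0 + math.exp(-inverse_temperature * (value_a - value_b)))
```

In this sketch, a high-magnitude, low-probability option looks more attractive under the additive rule than under the multiplicative one, so lowering `integration` shifts choices towards such options.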

    1. eLife assessment

      The authors have conducted a convincing study utilizing machine learning algorithms to construct a validated radiotherapy sensitivity score (NPC-RSS) for predicting radiosensitivity in nasopharyngeal carcinoma patients that will be useful in a translational/clinical setting for predicting the best radiotherapy route for patients. They have also explored the biological mechanisms underlying the relationship between NPC-RSS and radiotherapy response, thus implicating certain pathways that could be targeted to enhance radiotherapy response or prevent radio-resistance.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, the authors developed a novel radiotherapy sensitivity score (NPC-RSS) for nasopharyngeal carcinoma patients using machine learning algorithms. They identified 18 key genes associated with radiosensitivity and demonstrated that NPC-RSS could effectively predict radiotherapy response in both public and in-house datasets. Furthermore, they found that the key genes of NPC-RSS were closely related to immune characteristics, the expression of radiosensitivity-related genes, and signaling pathways involved in disease progression. The authors validated the consistency of expression of two key genes, SMARCA2 and CD9, with NPC-RSS in their own cell lines. They also showed that the radiosensitive group, classified by NPC-RSS, exhibited a more enriched and activated state of immune infiltration compared to the radioresistant group.

      Strengths:

      (1) The study employed a comprehensive approach by integrating multiple machine learning algorithms to develop a robust predictive model for radiotherapy sensitivity in nasopharyngeal carcinoma patients.

      (2) The predictive performance of NPC-RSS was validated using both public and in-house datasets, demonstrating its potential clinical applicability.

      (3) The authors conducted extensive analyses to investigate the biological mechanisms underlying the association between NPC-RSS and radiotherapy response, including immune characteristics, radiosensitivity-related gene expression, and relevant signaling pathways.

      (4) The consistency of key gene expression with NPC-RSS was validated in the authors' own cell lines, providing additional experimental evidence.

      Weaknesses:

      (1) The sample size of the in-house dataset used for training the model was relatively small (34 patients), which might limit the generalizability of the findings.

      (2) The authors did not perform functional experiments to directly validate the roles of the identified key genes in radiotherapy sensitivity, relying instead on associations with immune features and signaling pathways.

      (3) The study did not discuss the potential limitations of using machine learning algorithms, such as the risk of overfitting and the need for larger, diverse datasets for more robust model development and validation.

    3. Reviewer #2 (Public Review):

      Summary:

      This article, titled "A multi-gene predictive model for the radiation sensitivity of nasopharyngeal carcinoma based on machine learning," utilizes machine learning methods and transcriptomic data from nasopharyngeal carcinoma (NPC) patients to construct a biomarker called NPC-RSS that can predict the radiosensitivity of NPC patients. The authors further explore the biological mechanisms underlying the relationship between NPC-RSS and radiotherapy response in NPC patients. The main objective of this study is to guide the selection of radiotherapy strategies for NPC patients, thereby improving their clinical outcomes and prognosis.

      Strengths:

      (1) The combination of multiple machine learning algorithms and cross-validation was used to select the best predictive model for radiotherapy sensitivity from 71 differentially expressed genes, enhancing the robustness and reliability of the predictions.

      (2) Functional enrichment analysis revealed close associations between NPC-RSS key genes and immune characteristics, expression of radiotherapy sensitivity-related genes, and signaling pathways related to disease progression, providing a biological basis for NPC-RSS in predicting radiotherapy sensitivity.

      (3) Grouping NPC samples according to NPC-RSS showed that the radiotherapy-sensitive group exhibited a more enriched and activated state of immune infiltration compared to the radioresistant group. In single-cell samples, NPC-RSS was higher in the radiotherapy-sensitive group, with immune cells playing a dominant role. These results clarify the mechanism of NPC-RSS in predicting radiotherapy sensitivity from an immunological perspective.

      (4) The study used public datasets and in-house cohort data for validation, confirming the good predictive performance of NPC-RSS and increasing the credibility of the results.

      Limitation:

      (1) The study focuses on a specific type of nasopharyngeal carcinoma (NPC) and may not be generalizable to other subtypes or related head and neck cancers. The applicability of NPC-RSS to a broader range of patients and tumor types remains to be determined.

      (2) The study does not account for potential differences in radiotherapy protocols, doses, and techniques between the training and validation cohorts, which could influence the performance of the predictive model. Standardization of treatment parameters would be important for future validation studies.

      (3) The binary classification of patients into radiotherapy-sensitive and resistant groups may oversimplify the complex spectrum of treatment responses. A more granular stratification system that captures intermediate responses could provide more nuanced predictions and better guide personalized treatment decisions.

      (4) The study does not address the potential impact of other relevant factors, such as tumor stage, histological subtype, and concurrent chemotherapy, on the predictive performance of NPC-RSS. Incorporating these clinical variables into the model could enhance its accuracy and clinical utility.

    1. eLife assessment

      This study presents a valuable description of the cellular and transcriptional landscape of the tumor microenvironment in 27 gastric cancer (GC) patients based on their H. pylori status (HpGC, ex-HpGC, non-HpGC). The single-cell RNA sequencing dataset and computational analysis are convincing and provide a starting point that is of value for understanding H. pylori-associated GC cell type composition, cell transitions, and mechanisms of response to therapy. The section correlating immunotherapy outcomes with GC cell type compositions from bulk RNAseq would have been strengthened by further comparing H. pylori GC versus non-H. pylori GC.

    2. Reviewer #1 (Public Review):

      In this study, the authors conducted a single-cell RNA sequencing analysis of the cellular and transcriptional landscape of the gastric cancer tumor microenvironment, stratifying patients according to their H. pylori status into currently infected, previously infected, and non-infected patients. The authors comprehensively dissect various cellular compartments, including epithelial, stromal, and immune cells, and describe specific cell types and signatures to be associated with H. pylori infection, including i) inflammatory and EMT signatures in malignant epithelial cells, ii) inflammatory CAFs in stromal cells, iii) Angio-TAMs, TREM2+ TAMs, exhausted and suppressive T cells in immune cells. Looking at ligand-receptor interactions as well as correlations between cell type abundances, they suggest that iCAFs interact with immunosuppressive T cells via a NECTIN2-TIGIT axis, as well as Angio-TAMs through a VEGFA/B-VEGFR1 axis and thereby promote immune escape, tumor angiogenesis and resistance to immunotherapy.

      The authors conduct a comprehensive and thorough analysis of the complex tumor microenvironment of gastric cancer; both the single-cell RNA sequencing data and the analysis appear to be of high quality and in line with best practices. The authors validate their findings using external datasets and report some prognostic value for the identified signatures and cell types. However, most of their conclusions throughout the manuscript are based on the comparison between HPGC and healthy controls, which is not a valid comparison to determine which of the phenotypes are specifically driven by HP infection, e.g. Tregs are high in all GC types, independent of HP status. The same holds true for TREM2+ TAMs and iCAFs, which are higher in GC in general. This makes it very difficult to assess the actual HP-driven signatures and cell types. Also, when looking at the correlation/transcriptional differences across different cell types and cellular interactions, the authors do not explicitly define whether they are looking at the whole dataset (including healthy controls?) or only at certain patients (HPGC?), which again makes it difficult to interpret the results.

      The authors aim to confirm some of their findings via immunofluorescence, which in principle is a great approach to validate their results. However, to be able to conclude that e.g. suppressive TIGIT+ T cells are located close to NECTIN2+ malignant epithelium and that this might facilitate immune escape in HPGC (Figure 4K), the authors should include stains that show that this is not the case in the other groups (nonHPGC, exHPGC and HC). The same holds true for Figure 5G.

      In summary, this study provides a valuable resource on the cellular and transcriptional heterogeneity of the tumor microenvironment in gastric cancers, distinguishing between positive, negative, and previously positive HP-infected gastric cancer patients. Given that HP is the main risk factor for gastric cancer development, the study provides valuable insights into HP-driven transcriptional signatures and how these might contribute to this increased risk. However, the study would benefit greatly from a clearer and more stringent comparison between HPGC and nonHPGC.

    3. Reviewer #2 (Public Review):

      Summary:

      This study aims to describe the single-cell transcriptomes of H. pylori-associated (Hp) gastric cancers and tumour microenvironment (TME), as a starting point to understand TME diversity stratified by Hp status.

      RNAseq was performed for gastric cancers with current Hp+ (from N=9 people), ex-Hp+ (N=6), non-Hp (N=6), and healthy gastric tissue (N=6).

      The study expands on previous single-cell transcriptomic studies of gastric cancers and was motivated by previous observations about the effect of H. pylori status on therapeutic outcomes. The study includes a brief review of previous work and provides valuable context for this study.

      Strengths:

      The observations are supported by solid RNAseq study design and analysis. The authors describe correlations between Hp status and inferred molecular characteristics including cell lineages, enrichment for cell subclusters identified as tumour-infiltrating lymphocyte cell types, tumour-infiltrating myeloid cells, and cancer-associated fibroblasts.

      The observed correlations between Hp status and enrichment of cell subclusters were broadly corroborated using comparisons to deconvolved bulk RNAseq from publicly available gastric cancer data, providing a convincing starting point for understanding the diversity of tumour microenvironment by Hp-status.

      Weaknesses:

      The authors acknowledge several limitations of this study.

      The correlations with HP-status are based on a small number of participants per Hp category (N=9 with current Hp+; N=6 for ex-HP+ and non-HP), and would benefit from further validation to establish reproducibility in other cohorts.

      The ligand-receptor cross-talk analysis and the suggestion that suppressive T cells could interact with the malignant epithelium through TIGIT-NECTIN2/PVR pairs are preliminary findings based on transcriptomic analysis and immunostaining and will require further validation.

    1. eLife assessment

      This study provides a useful inventory of genes that are up- or down-regulated during the early metamorphic development of male and female larvae and proposes that the microRNA cluster miR-277/34 is involved in the development of sexual differences during early metamorphosis of Drosophila melanogaster, although its precise role remains unclear. The strength of evidence, based on a combination of diverse methods including mRNA and small RNA sequencing, in silico analyses, in vitro assays, and loss-of-function experiments, is incomplete as it lacks a general model and an examination of the potential effects of the miR-277/34 mutations on phenotypes such as morphology or developmental time. This work will be of interest to developmental biologists interested in sexual dimorphism and in the interplay between hormones and microRNAs during development.

    2. Reviewer #1 (Public Review):

      Summary:

      In this paper, Li and colleagues have found microRNAs that affect levels of metamorphosis-regulating genes that can also affect levels of sesquiterpenoids (juvenile hormone and related compounds) and ecdysteroids, which regulate the timing and stages of insects, respectively. They first compared the transcriptomes of Drosophila at the third larval instar and at the white pre-pupa stage. They found thousands of differences in gene transcript levels between males and females, and between the two different stages. Among those genes that were differentially regulated they saw that genes involved in insect hormone biosynthesis were disproportionately represented. Many of the differentially regulated genes were involved in the insect hormone biosynthesis pathway and ascorbate and aldarate metabolism. MicroRNAs were also differentially expressed during metamorphosis and were separately identified. The authors then considered genes and whether the differentially expressed microRNAs might regulate transcripts known to be involved in sesquiterpenoid production. In silico analysis of microRNAs predicted a list of 17 microRNAs that can regulate transcripts of sesquiterpenoid biosynthesis genes. The authors then used an in vitro luciferase assay to validate the binding and downregulation of 10 of the microRNAs to genes involved with sesquiterpenoid production in S2 cells.

      Li and colleagues then focus on two genes they found were bound by microRNAs that have established roles in metamorphosis. The microRNAs miR-34 and miR-277 bind transcripts of two protein-coding genes that regulate metamorphosis: Kr-h1, which encodes a JH-inducible transcription factor, and Allatostatin C Receptor 1 (AstC-R1), a G-protein coupled receptor that regulates the corpora allatum, the gland that produces sesquiterpenoids. Using a LAMP assay, one of the microRNAs, miR-277, was shown to bind to both AstC-R1 and Kr-h1 in whole-animal extracts in vivo. There is no mention of binding between either protein-coding transcript and the miR-34 microRNA. Temporal expression of all four transcripts shows that their abundance is anti-correlated; stages of high miR-34 or miR-277 expression correlate with low AstC-R1 or Kr-h1 expression. Homozygous deletions of both microRNAs result in 23% lethality, five days after adult eclosion. The authors also generated specific mutants in miR-34 or miR-277 and find differences in the expression of AstC-R1 and Kr-h1 and sex-specific differences in both sesquiterpenoids and ecdysteroids in the knock-out lines. If there were phenotypes associated with the specific knock-outs, those were not mentioned. Next, the authors examined the transcriptomes of the miR-277 and miR-34 mutants and found several other GO-terms enriched among the differentially expressed genes. However, the sesquiterpenoid pathway and ascorbate and aldarate metabolism are not listed.

      Strengths:

      This is an interesting manuscript that could make an important contribution to our understanding of the roles of microRNAs at metamorphosis, and potentially of how sex-specific differences arise during metamorphosis. Strengths of the paper include the functional validation of microRNA binding, in vitro and in vivo, as well as the characterization of sesquiterpenoid and ecdysteroid titers. The authors have also used CRISPR to generate specific knock-outs of miR-34 and miR-277. The transcriptomes will be a resource for future work to mine for differences in gene expression during metamorphosis.

      Weaknesses:

      (1) Spatial Expression of miR-34 and miR-277. If miR-34 and miR-277 regulate AstC-R1 and Kr-h1, then they must be expressed in the same cells. Although the authors show that the microRNAs do bind to the transcripts of AstC-R1 and Kr-h1 in S2 cells, and that miR-277 binds AstC-R1 and Kr-h1 in whole-animal homogenates in vivo, we do not know if the microRNAs are ever in the cells where AstC-R1 or Kr-h1 are expressed. AstC-R1 is only expressed in a few cells in the brain, so it is not at all certain that it is co-expressed with either microRNA. The creation of enhancer lines or in situ hybridization in Drosophila is straightforward and would sort this out.

      (2) Phenotypes. Although a double deletion was used and specific knock-outs of both miR-34 and miR-277 were generated, the analysis of the mutants is very superficial. For the homozygous deletion of both microRNAs miR-34 and miR-277, only a decrease in survivorship was observed a full six days after adult eclosion - after the end of metamorphosis. No phenotype for either miR-34-KO or miR-277-KO was given. The authors cite the work of others who have found specific phenotypes after manipulation of sesquiterpenoids or ecdysteroids, like Riddiford and Ashburner, but do not use any of these many studies to help them characterize the phenotype. If the loss of miR-34 and miR-277 affects so many pathways (including MAPK signaling, TGF-beta signaling, FoxO signaling, and Wnt signaling), as well as global titers of metamorphic hormones, then shouldn't there be something different in the development to discuss?

      (3) I think the reliance on GO term enrichment is getting in the way of biology. For instance, I would not describe Kr-h1 as a sesquiterpenoid biosynthesis pathway gene. Yet the authors say they were motivated to examine microRNA regulation of Kr-h1 because they saw differences in levels of the sesquiterpenoid biosynthesis pathway between WL3 and WPP, a period which also saw differences in expression of some microRNAs. I understand that Kr-h1 expression is regulated by JH, a sesquiterpenoid, but it is not directly involved with JH production, so relying on GO term enrichment has made the decision to focus on Kr-h1 feel arbitrary.

      (4) The transcriptomes of miR-34 and miR-277 should have revealed genes encoding members of the sesquiterpenoid biosynthesis pathway as well as AstC-R1 and Kr-h1, but neither was mentioned. The functional tests of miR-34 and miR-277 were performed because they were shown to affect the levels of expression of genes in the sesquiterpenoid biosynthesis pathway. Figure 2 shows a significant decrease in AstC-R1 and Kr-h1 transcripts after the loss of miR-34 and miR-277. However, the results do not mention either (Lines 250-264). Instead, there is a list of 10 different GO terms (like arginine and proline metabolism or fatty acid degradation) that were enriched in miR-34 and miR-277 transcriptomes. If any of those ten types have any relationship to Kr-h1, AstC-R1, or metamorphosis, that has not been explained.

      (5) Not enough care was taken in describing the stages. The methods describe wandering larvae (WL3) and white pre-pupa (WPP) for the transcriptomes, but in the text, different terms are used, like "larva", "pupa", "L3 larvae instars", "early pupae", and "late L3". Also, it seems like the small RNA libraries for sequencing were taken from "L3 larvae", but the stage of the L3 larvae was not mentioned. Staging is important, especially during metamorphosis, since differences in expression are expected to exist between different stages of L3, between early vs late wandering, and between WPP and early pupal stages.

    3. Reviewer #2 (Public Review):

      Summary:

      This study proposes that the microRNA cluster miR-277/34 controls the generation of sexual dimorphism in Drosophila melanogaster during metamorphosis by acting on specific hormonal and developmental gene pathways.

      Strengths:

      Using a combination of mRNA and small RNA sequencing together with genome-wide in silico and in vitro analyses the authors identified a microRNA cluster that may be involved in metamorphosis and the generation of sexual dimorphism in Drosophila melanogaster.

      Weaknesses:

      Biological validation of the identified sexually dimorphic genes and a detailed understanding of how the microRNA cluster miR-277/34 might be involved in the regulation of sesquiterpenoids are needed.

      Major suggestions:

      (1) If AstC-R1 and Kr-h1 are targets of the miR-277/34 cluster and cause their downregulation, it is not clear why there would also be a decrease in the levels of these genes in the miR-277/34 mutants. This would suggest that the mechanism is not straightforward and that further epistatic experiments should be carried out in order to clarify this issue.

      (2) The changes in the expression levels of AstC-R1 in pupae of miR-277-KO and miR-34-KO flies must be accompanied by photos of the respective larvae and pupae, as well as an analysis of the larva-pupa transition in the mutants by gender.

      (3) Biological validation of the identified sexually dimorphic genes in vivo will be necessary for the support of this work.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors show convincingly the complexity of gene up- and down-regulation at the outset of metamorphosis and identify substantial differences between the two sexes, even at this early time in development. The complexity of microRNA expression and the difference between the sexes are also nicely laid out. The functional significance of these differences, though, is harder to establish. The authors have focused on the roles of two families of developmental hormones, the ecdysteroids, and the juvenile hormones. The emergence of sex-specific differentiation of organs during metamorphosis is clearly downstream of the action of ecdysteroids and/or JH, but there is no evidence that the presence or lack of these hormones has any effect on the sexual identity of organ systems - i.e., that manipulations of JH or ecdysteroid result in either the masculinization or feminization of individuals or their organs. The precedent for the linkage of these hormones to sex determination is the 2002 Belgacem & Martin study, which describes the effects of JH on fly locomotion. These authors show that the number of stop/start bouts is sexually dimorphic, and removal of JH in males shifts their frequency into the female range while giving treated males exogenous JH moves it back. While this is referred to as a "feminization" of male behavior, this quantitative shift in frequency is not as compelling as would be a qualitative shift -- for example, the removal of JH causing males to show egg-laying behavior (a result that has never been seen). Also, these effects are in a fully mature system, rather than at the early metamorphic time examined in the present paper. In driving and coordinating metamorphosis, JH and ecdysteroids are intimately involved in sexual differentiation, but I know of no compelling evidence that they play a role in sex determination.

      While the summary of the effects of removal of specific microRNAs on the components of the biosynthetic pathway for the JHs and ecdysteroids (Figure 2E, F) is quite compelling, I am concerned about the effects of the removal of mir-277 and mir-34 on the levels of both the JHs and 20E. My concern centers around the data from the control group (w[1118] animals in Figure 2D). These data are the first report of a marked sex difference in the titer of either JH or ecdysteroid at the start of metamorphosis in Drosophila. As expected, males show a 10-20 increase in levels of JH III, JHB3, and 20E between the L3 stage and the white puparium, but, surprisingly, the levels of these hormones in female L3 larvae are equal to or greater than those seen at pupariation! These data for females run counter to over 50 years of work on the effects of ecdysteroids in Drosophila!

      As far as I can gather from the paper, the L3 data were obtained using wandering larvae. This stage lasts for about 12 hours and ends with pupariation. Larvae from this period need to be used with caution for hormone studies. Levels of both JH and ecdysteroid are low as larvae leave the food but rapidly rise to their peak levels at the white puparium stage 12 hours later. To deal with the rapidly changing hormonal landscape through this period, researchers have used physiological markers to track this progression. Initially, it was the stage of "puffing" of the giant salivary gland chromosomes, but, for bulk collection of staged larvae, larvae are fed on food containing a blue dye, and progression is tracked by the loss of blue coloring from the gut. I could not find if the authors had any criteria for selecting larvae during the wandering period. Male and female larvae grow to different sizes. Might this difference in growth have introduced a bias when larvae were selected during their wandering phase?

      The other hormone-related issue is the expression of Kr-h1 during larval stages and metamorphosis (Figure 1G). Kr-h1 is the main target of JH and Kr-h1 expression is often used as a proxy for the JH titer. The authors report that peak Kr-h1 expression occurs in the L3 (when the JH titer should be lowest!) and that it drops at wandering. This pattern is counter to that reported in the literature (e.g., FlyBase, ModEncode).

    1. eLife assessment

      This potentially valuable study reports new and unexpected roles of STAG3 in regulating exit from pluripotency in mouse embryonic stem cells (mESCs). However, the evidence for the proposed role of STAG3 in the post-transcriptional regulation of gene expression is viewed as yet incomplete. The work will be of interest to colleagues studying stem cells, early steps in differentiation, and gene regulation.

    2. Reviewer #1 (Public Review):

      The paper titled "STAG3 promotes exit from pluripotency through post-transcriptional mRNA regulation in the cytoplasm" suggests a new and unexpected role for STAG3, a protein traditionally associated with the cohesin complex during meiosis, in regulating the exit from pluripotency in mouse embryonic stem cells (mESCs). While STAG3 is traditionally studied for its role in meiosis, this paper reveals that STAG3 is expressed in mouse embryonic stem cells (mESCs) and primordial germ cell-like cells (PGCLCs) and may be necessary for PGCLC-like specification and exit from pluripotency. In ESCs, the study reports that STAG3 is found in the cytoplasm, where it interacts with various RNA-binding proteins (RBPs) and localizes to centrosomes. Knockdown of STAG3 disrupts centrosome stability and RNA-induced silencing complex (RISC) components, leading to the misregulation of mRNAs such as DPPA3, Nanog, and TNRC6C. In summary, this study expands the known functions of STAG3 beyond cohesin, highlighting a potential role in cytoplasmic post-transcriptional regulation.

      The authors perform a comprehensive characterization of RNA and protein changes in ESCs and differentiated cells upon loss of STAG3, providing preliminary and intriguing insights. However, there are several aspects that require further exploration:

      (1) A rescue experiment for the STAG3 RNAi is missing, making it unclear whether the observed effects are indeed due to the knockdown of STAG3.

      (2) While the paper identifies several interactions and effects of STAG3, it lacks detailed mechanistic insights into how STAG3 regulates specific mRNAs and proteins. Specifically, it is unclear which proteins directly interact with STAG3 or recruit STAG3 to RNP complexes. AlphaFold may help in this analysis.

      (3) It is unclear whether this is an alternative STAG3 isoform or if STAG3 is modified. What dictates its interaction with cohesin versus RNPs?

      (5) Are there unique features or sequence barcodes present on the misregulated RNAs?

      (6) Does STAG3 associate with a single type of RNP or is it present in all types?

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript addresses the intriguing topic of the potential roles of germline-specific proteins in early development. While this issue is quite interesting and generally under-explored, the work falls short of making truly tangible inroads.

      Strengths:

      The strength of the study is in new proteomic datasets.

      Weaknesses:

      The manuscript makes some strong statements, beginning with the title "STAG3 (1) promotes exit from pluripotency (2) through post-transcriptional mRNA regulation in the cytoplasm".

      Upon reviewing the data, it appears that neither (1) nor (2) here has a strong foundation in the experiments presented. While intriguing, the experimental evidence is still rather inconclusive.

      The potential involvement of STAG3 in PGC specification is the most intriguing aspect of this study. Unfortunately, it does not go far enough to derive a fully meaningful biological conclusion. In fact, the DPPA3-GFP, PRDM-GFP, and PGC marker expression results are contradictory and do not form a coherent picture of the biological effect of STAG3 depletion. The absence of an effect of the knock-down on PGC specification when PRDM is scored (line 167) is particularly worrisome. As for finding a cytoplasmic role of STAG3, the data also remain inconclusive.

    1. eLife assessment

      In this important study, the authors employed three types of theoretical/computational models (coarse-grained molecular dynamics, analytical theory and field-theoretical simulations) to analyze the impact of salt on protein liquid-liquid phase separation. These different models reinforce each other and together provide convincing evidence to explain distinct salt effects on ATP mediated phase separation of different variants of caprin1. The insights and general approach are broadly applicable to the analysis of protein phase separation.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors used multiple approaches to study salt effects in liquid-liquid phase separation (LLPS). Results on both wild-type Caprin1 and mutants and on different types of salts contribute to a comprehensive understanding.

      Strengths:

      The main strength of this work is the thoroughness of investigation. This aspect is highlighted by the multiple approaches used in the study, and reinforced by the multiple protein variants and different salts studied.

      Weaknesses:

      (1) The multiple computational approaches are a strength, but they're cruder than explicit-solvent all-atom molecular dynamics (MD) simulations and may miss subtle effects of salts. In particular, all-atom MD simulations demonstrate that high salt strengthens pi-types of interactions (ref. 42 and MacAinsh et al, https://www.biorxiv.org/content/10.1101/2024.05.26.596000v3).

      (2) The paper can be improved by distilling the various results into a simple set of conclusions. For example, based on salt effects revealed by all-atom MD simulations, MacAinsh et al. presented a sequence-based predictor for classes of salt dependence. Wild-type Caprin1 fits right into the "high net charge" class, with a high net charge and a high aromatic content, showing no LLPS at 0 NaCl and an increasing tendency of LLPS with increasing NaCl. In contrast, pY-Caprin1 belongs to the "screening" class, with a high level of charged residues and showing a decreasing tendency of LLPS.

      (3) Mechanistic interpretations can be further simplified or clarified. (i) Reentrant salt effects (e.g., Fig. 4a) are reported but no simple explanation seems to have been provided. Fig. 4a,b look very similar to what has been reported as strong-attraction promoter and weak-attraction suppressor, respectively (ref. 50; see also PMC5928213 Fig. 2d,b). According to the latter two studies, the "reentrant" behavior of a strong-attraction promoter, Cl- in the present case, is due to Cl-mediated attraction at low to medium [NaCl] and repulsion between Cl- ions at high salt. Do the authors agree with this explanation? If not, could they provide another simple physical explanation? (ii) The authors attributed the promotional effect of Cl- to counterion-bridged interchain contacts, based on a single instance. There is another simple explanation, i.e., neutralization of the net charge on Caprin1. The authors should analyze their simulation results to distinguish net charge neutralization and interchain bridging; see MacAinsh et al.

      (4) The authors presented ATP-Mg both as a single ion and as two separate ions; there is no explanation of which of the two versions reflects reality. When presenting ATP-Mg as a single ion, it's as though it forms a salt with Na+. I assume NaCl, ATP, and MgCl2 were used in the experiment. Why is Cl- not considered? Related to this point, it looks like ATP is just another salt ion studied and much of the Results section is on NaCl, so the emphasis on ATP ("Diverse Roles of ATP" in the title) is somewhat misleading.

    3. Reviewer #2 (Public Review):

      Summary:

      In this paper, Lin and colleagues aim to understand the role of different salts on the phase behavior of a model protein of significant biological interest, Caprin1, and its phosphorylated variant, pY-Caprin1. To achieve this, the authors employed a variety of methods to complement experimental studies and obtain a molecular-level understanding of ion partitioning inside biomolecular condensates. A simple theory based on rG-RPA is shown to capture the different salt dependencies of Caprin1 and pY-Caprin1 phase separation, demonstrating excellent agreement with experimental results. The application of this theory to multivalent ions reveals many interesting features with the help of multicomponent phase diagrams. Additionally, the use of CG model-based MD simulations and FTS provides further clarity on how counterions can stabilize condensed phases.

      Strengths:

      The greatest strength of this study lies in the integration of various methods to obtain complementary information on thermodynamic phase diagrams and the molecular details of the phase separation process. The authors have also extended their previously proposed theoretical approaches, which should be of significant interest to other researchers. Some of the findings reported in this paper, such as bridging interactions, are likely to inspire new studies using higher-resolution atomistic MD simulations.

      Weaknesses:

      The paper does not have any major issues.

    4. Reviewer #3 (Public Review):

      The authors first use rG-RPA to reproduce two observed trends. Caprin1 does not phase separate at very low salt but then undergoes LLPS with added salt, while further addition of salt reduces its propensity to undergo LLPS. On the other hand, pY-Caprin1 exhibits a monotonic trend where the propensity to phase separate decreases with the addition of salt. This distinction is captured by a two-component model and also when salt ions are explicitly modeled as a separate species with a ternary phase diagram. The predicted ternary diagrams (when co- and counter-ions are explicitly accounted for) also predict the tendency of ions to co-condense with or be excluded from proteins in the dense phase. Predicted trends are generally in line with the measurements for Caprin1. Next, the authors seek to explain the observed difference in phase separation when Arginines are replaced by Lysines, creating different variants. In the current rG-RPA type models both Arginine (R) and Lysine (K) are treated equally since non-electrostatic effects are only modeled in a mean-field manner that can be fitted but not predicted. For this reason, coarse-grained MD simulation is suitable. Moreover, MD simulation affords structural features of the condensates. They used a force field that is capable of discriminating R and K. The MD-predicted degrees of LLPS of these variants are again consistent with the measurements. One additional insight emerges from MD simulations: a negative ion can form a bridge between two positively charged residues on the chain. These insights are not possible to derive from rG-RPA. Both rG-RPA and MD simulation become cumbersome when considering multiple types of ions such as Na, Cl, [ATP] and [ATP-Mg] all present at the same time. FTS is well suited to handle this complexity. FTS also provides insights into the co-localization of ions and proteins that are consistent with NMR. By using different combinations of ions they confirm the robustness of the prediction that Caprin1 shows salt-dependent reentrant behavior, adding further support that the differential behavior of Caprin1 and pY-Caprin1 is likely to be mediated by charge-charge interactions.

    1. eLife assessment

      This study provides a useful analysis of the changes in chromatin organization and gene expression that occur during the differentiation of two cell types (anterior endoderm and prechordal plate) from a common progenitor in zebrafish. Although the findings are consistent with previous work, the evidence presented in the study appears to be incomplete and would benefit from more rigorous interpretation of single-cell data, more in-depth lineage tracing, overexpression experiments with physiological levels of Ripply, and a clearer justification for using an explant system. With these modifications, this paper will be of interest to zebrafish developmental biologists investigating mechanisms underlying differentiation.

    2. Reviewer #1 (Public Review):

      Summary:

      During vertebrate gastrulation, mesendoderm cells are initially specified by morphogens (e.g. Nodal) and segregate into endoderm and mesoderm in part based on Nodal concentrations. Using zebrafish genetics, live imaging, and single-cell multi-omics, the manuscript by Cheng et al presents evidence to support a claim that anterior endoderm progenitors derive primarily from prechordal plate progenitors, with transcriptional regulators goosecoid (Gsc) and ripply1 playing key roles in this cell fate determination. Such a finding would represent a significant advance in our understanding of how anterior endoderm is specified in vertebrate embryos.

      Strengths:

      Live imaging-based tracking of PP and endo reporters (Figure 2) is well executed and convincing, though a larger number of individual cell tracks will be needed. Currently, only a single cell track (n=1) is provided.

      Weaknesses:

      (1) The central claim of the paper - that the anterior endoderm progenitors arise directly from prechordal plate progenitors - is not adequately supported by the evidence presented. This is a claim about cell lineage, which the authors are attempting to support with data from single-cell profiling and genetic manipulations in embryos and explants. The construction of gene expression (pseudo-time) trajectories, while a modern and powerful approach for hypothesis generation, should not be used as a substitute for bona fide lineage tracing methods. If the authors' central hypothesis is correct, a CRE-based lineage tracing experiment (e.g. driving CRE using a PP marker such as Gsc) should be able to label PP progenitor cells that ultimately contribute to anterior endoderm-derived tissues. Such an experiment would also allow the authors to quantify the relative contribution of PP (vs non-PP) cells to the anterior endoderm, which is not possible to estimate from the indirect data currently provided. Note: while the present version of the manuscript does describe a sox17:CRE lineage tracing experiment, this actually goes in the opposite direction to the one that would be informative (sox17:CRE-marked descendants will be a mixture of PP-derived and non-PP-derived cells, and the Gsc-based reporter does not allow for long-term tracking of the fates of these cells).

      (2) The authors' descriptions of gene expression patterns in the single-cell trajectory analyses do not always match the data. For example, it is stated that goosecoid expression marks progenitor cells that exist prior to a PP vs endo fate bifurcation (e.g. lines 124-130). Yet, in Figure 1C it appears that in fact goosecoid expression largely does not precede (but actually follows) the split and is predominantly expressed in cells that have already been specified into the PP branch. Likewise, most of the cells in the endo branch (or prior) appear to never express Gsc. While these trends do indeed appear to be more muddled in the explant data (Figure 1H), it still seems quite far-fetched to claim that Gsc expression is a hallmark of endoderm-PP progenitors.

      (3) The study seems to refer to "endoderm" and "anterior endoderm" somewhat interchangeably, and this is potentially problematic. Most single-cell-based analyses appearing in the study rely on global endoderm markers (sox17, sox32) which are expressed in endodermal precursors along the entire ventrolateral margin. Some of these cells are adjacent to the prechordal plate on the dorsal side of the gastrula, but many (most in fact) are quite some distance away. The microscopy-based evidence presented in Figure 2 and elsewhere, however, focuses on a small number of sox17-expressing cells that are directly adjacent to, or intermingled with, the prechordal plate. It, therefore, seems problematic for the authors to generalize potential overlaps with the PP lineage to the entire endoderm, which includes cells in ventral locations. It would be helpful if the authors could search for additional markers that might stratify and/or mark the anterior endoderm and perform their trajectory analysis specifically on these cells.

      (4) It is not clear that the use of the nodal explant system is allowing for rigorous assessment of endoderm specification. Why are the numbers of endoderm cells so vanishingly few in the nodal explant experiments (Figure 1H, 3H), especially when compared to the embryo itself (e.g. Figures 1C-D)? It seems difficult to perform a rigorous analysis of endoderm specification using this particular model which seems inherently more biased towards PP vs. endoderm than the embryo itself. Why not simply perform nodal pathway manipulations in embryos?

      (5) The authors should not claim that proximity in UMAP space is an indication of transcriptional similarity (lines 207-208), especially for well-separated clusters. This is a serious misrepresentation of the proper usage of the UMAP algorithm. The authors make a similar claim later on (lines 272-274).

    3. Reviewer #2 (Public Review):

      Summary:

      During vertebrate gastrulation, the mesoderm and endoderm arise from a common population of precursor cells and are specified by similar signaling events, raising questions as to how these two germ layers are distinguished. Here, Cheng and colleagues use zebrafish gastrulation as a model for mesoderm and endoderm segregation. By reanalyzing published single-cell sequencing data, they identify a common progenitor population for the anterior endoderm and the mesodermal prechordal plate (PP). They find that expression levels of PP genes Gsc and ripply are among the earliest differences between these populations and that their increased expression suppresses the expression of endoderm markers. Further analysis of chromatin accessibility and Ripply cut-and-tag is consistent with direct repression of endoderm by this PP marker. This study demonstrates the roles of Gsc and Ripply in suppressing anterior endoderm fate, but this role for Gsc was already known and the effect of Ripply is limited to a small population of anterior endoderm. The manuscript also focuses extensively on the function of Nodal in specifying and patterning the mesoderm and endoderm, a role that is already well known and to which the current analysis adds little new insight.

      Strengths:

      Integrated single-cell ATAC- and RNA-seq convincingly demonstrate changes in chromatin accessibility that may underlie the segregation of mesoderm and endoderm lineages, including Gsc and ripply. Identification of Ripply-occupied genomic regions augments this analysis. The genetic mutants for both genes provide strong evidence for their function in anterior mesendoderm development, although these phenotypes are subtle.

      Weaknesses:

      The use of zebrafish embryonic explants for cell fate trajectory analysis (rather than intact embryos) is not justified. In both transcriptomic comparisons between the two fate trajectories of interest and Ripply cut-and-tag analysis, the authors rely too heavily on gene ontology which adds little to our functional understanding. Much of the work is focused on the role of Nodal in the mesoderm/endoderm fate decision, but the results largely confirm previous studies and again provide few new insights. Some experiments were designed to test the relationship between the mesoderm and endoderm lineages and the role of epigenetic regulators therein, but these experiments were not properly controlled and therefore difficult to interpret.

    4. Reviewer #3 (Public Review):

      Summary:

      Cheng, Liu, Dong, et al. demonstrate that anterior endoderm cells can arise from prechordal plate progenitors, which is suggested by pseudo time reanalysis of published scRNAseq data, pseudo time analysis of new scRNAseq data generated from Nodal-stimulated explants, live imaging from sox17:DsRed and Gsc:eGFP transgenics, fluorescent in situ hybridization, and a Cre/Lox system. Early fate mapping studies already suggested that progenitors at the dorsal margin give rise to both of these cell types (Warga) and live imaging from the Heisenberg lab (Sako 2016, Barone 2017) also showed this quite convincingly. However, the data presented for this point are very nice, and the additional experiments in this manuscript further cement this result. Though better demonstrated by previous work (Alexander 1999, Gritsman 1999, Gritsman 2000, Sako 2016, Rogers 2017, others), the manuscript suggests that high Nodal signaling is required for both cell types, and shows preliminary data that suggests that FGF signaling may also be important in their segregation. The manuscript also presents new single-cell RNAseq data from Nodal-stimulated explants with increased (lft1 KO) or decreased (ndr1 KD) Nodal signaling and multi-omic ATAC+scRNAseq data from wild-type 6 hpf embryos but draws relatively few conclusions from these data. Lastly, the manuscript presents data that SWI/SNF remodelers and Ripply1 may be involved in the anterior endoderm - prechordal plate decision, but these data are less convincing. The SWI/SNF remodeler experiments are unconvincing because the demonstration that these factors are differentially expressed or active between the two cell types is weak. The Ripply1 gain-of-function experiments are unconvincing because they are based on incredibly high overexpression of ripply1 (500 pg or 1000 pg) that generates a phenotype that is not in line with previously demonstrated overexpression studies (with phenotypes from 10-20x lower expression). Similarly, the cut-and-tag data seem to be of low quality and do not appear to support direct binding of Ripply1 to these loci.

      In the end, this study provides new details that are likely important in the cell fate decision between the prechordal plate and anterior endoderm; however, it is unclear how Nodal signaling, FGF signaling, and elements of the gene regulatory network (including Gsc, possibly ripply1, and other factors) interact to make the decision. I suggest that this manuscript is of most interest to Nodal signaling or zebrafish germ layer patterning aficionados. While it provides new datasets and observations, it does not weave these into a convincing story to provide a major advance in our understanding of the specification of these cell types.

      Major issues:

      (1) UMAPs: There are several instances in the manuscript where UMAPs are used incorrectly as support for statements about how transcriptionally similar two populations are. UMAP is a stochastic, non-linear projection for visualization - distances in UMAP cannot be used to determine how transcriptionally similar or dissimilar two groups are. Drawing conclusions about how transcriptionally similar two populations are requires performing calculations either in the gene expression space or in a linear dimensional reduction space (e.g. PCA, keeping in mind that this will only consider the subset of genes used as input into the PCA). Please correct or remove these instances, which include (but are not limited to):
      p.4 107-110
      p.4 112
      p.8 207-208
      p.10 273-275
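
      As an illustration of the kind of calculation meant here, the following is a minimal sketch (not the authors' pipeline) that compares populations by the distance between their centroids in a linear PCA space rather than in UMAP coordinates. The function names, the toy expression matrix, and the cluster labels ("PP", "endo", "other") are all hypothetical and are introduced only for this example; real data would be a normalized cells-by-genes matrix.

```python
import numpy as np

def pca_scores(X, n_components=10):
    """Project cells (rows of X) onto the top principal components.

    X is assumed to be a normalized cells-by-genes matrix (hypothetical input).
    """
    Xc = X - X.mean(axis=0)                      # center each gene
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = min(n_components, Vt.shape[0])           # cannot exceed available PCs
    return Xc @ Vt[:k].T                         # cells-by-k score matrix

def centroid_distance(scores, labels, a, b):
    """Euclidean distance between the mean positions of populations a and b."""
    labels = np.asarray(labels)
    centroid_a = scores[labels == a].mean(axis=0)
    centroid_b = scores[labels == b].mean(axis=0)
    return float(np.linalg.norm(centroid_a - centroid_b))

# Toy data (invented): "endo" differs slightly from "PP", "other" differs a lot.
rng = np.random.default_rng(0)
n_cells, n_genes = 100, 50
shift_small = np.zeros(n_genes)
shift_small[:5] = 0.5
shift_large = np.zeros(n_genes)
shift_large[:5] = 5.0
X = np.vstack([
    rng.normal(size=(n_cells, n_genes)),                # "PP"
    rng.normal(size=(n_cells, n_genes)) + shift_small,  # "endo"
    rng.normal(size=(n_cells, n_genes)) + shift_large,  # "other"
])
labels = ["PP"] * n_cells + ["endo"] * n_cells + ["other"] * n_cells

scores = pca_scores(X, n_components=10)
d_pp_endo = centroid_distance(scores, labels, "PP", "endo")
d_pp_other = centroid_distance(scores, labels, "PP", "other")
print(f"PP vs endo:  {d_pp_endo:.2f}")
print(f"PP vs other: {d_pp_other:.2f}")
```

      Because PCA is a linear projection, distances between centroids in this space are a faithful (if truncated) summary of distances in gene expression space, which is not guaranteed for UMAP coordinates. As noted above, the result still depends on which genes are used as input.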

      (2) Nodal and lefty manipulations: The section "Nodal-Lefty regulatory loop is needed for PP and anterior Endo fate specification" and Figure 3 do not draw any significant conclusions. This section presents a LIANA analysis to determine the signals that might be important between prechordal plate and endoderm, but despite the fact that it suggests that BMP, Nodal, FGF, and Wnt signaling might be important, the manuscript just concludes that Nodal signaling is important. Perhaps this is because the conclusion that Nodal signaling is required for the specification of these cell types has been demonstrated in zebrafish in several other studies with more convincing experiments (Alexander 1999, Gritsman 1999, Gritsman 2000, Rogers 2017, Sako 2016). While FGF has recently been demonstrated to be a key player in the stochastic decision to adopt endodermal fate in lateral endoderm (Economou 2022), the idea that FGF signaling may be a key player in the differentiation of these two cell types has strangely been relegated to the discussion and supplement. Lastly, the manuscript does not make clear the advantage of performing experiments to explore the PP-Endo decision in Nodal-stimulated explants compared to data from intact embryos. What would be learned from this and not from an embryo? Since Nodal signaling stimulates the expression of Wnts and FGFs, these data do not test Nodal signaling independent of the other pathways. It is unclear why this artificial system that has some disadvantages is used since the manuscript does not make clear any advantages that it might have had.

      (3) ripply1 mRNA injection phenotype inconsistent with previous literature: The phenotype presented in this manuscript from overexpressing ripply1 mRNA (Fig S11) is inconsistent with previous observations. This study shows a much more dramatic phenotype, suggesting that the overexpression may be to a non-physiological level that makes it difficult to interpret the gain-of-function experiments. For instance, Kawamura et al 2005 perform this experiment but do not trigger loss of head and eye structures or loss of tail structures. Similarly, Kawamura et al 2008 repeat the experiment, triggering a mildly more dramatic shortening of the tail and complete removal of the notochord, but again no disturbance of head structures as displayed here. These previous studies injected 25 - 100 pg of ripply1 mRNA with dramatic phenotypes, whereas this study uses 500 - 1000 pg. The phenotype is so much more dramatic than previously presented that it suggests that the level of ripply1 overexpression is sufficiently high that it may no longer be regulating only its endogenous targets, making the results drawn from ripply1 overexpression difficult to trust.

      (4) Ripply1 binding to sox17 and sox32 regulatory regions not convincing: The Cut and Tag data presented in Fig 6J-K does not seem to be high quality and does not seem to provide strong support that Ripply 1 binds to the regulatory regions of these genes. The signal-to-noise ratio is very poor, and the 'binding' near sox17 that is identified seems to be even coverage over a 14 kb region, which is not consistent with site-specific recruitment of this factor, and the 'peaks' highlighted with yellow boxes do not appear to be peaks at all. To me, it seems this probably represents either: (1) overtagmentation of these samples or (2) an overexpression artifact from injection of too high concentration of ripply1-HA mRNA. In general, Cut and Tag is only recommended for histone modifications, and Cut and Run would be recommended for transcriptional regulators like these (see Epicypher's literature). Given this and the previous point about Ripply1 overexpression, I am not convinced that Ripply1 regulates endodermal genes. The existing data could be made somewhat more convincing by showing the tracks for other genes as positive and negative controls, given that Ripply1 has known muscle targets (how does its binding look at those targets in comparison) and there should be a number of Nodal target genes that Ripply1 does not bind to that could be used as negative controls. Overall this experiment doesn't seem to be of high enough quality to drive the conclusion that Ripply1 directly binds near sox17 and sox32 and from the data presented in the manuscript looks as if it failed technically.

      (5) "Cooperatively Gsc and ripply1 regulate": I suggest avoiding the term "cooperative" when describing the relationship between Ripply1 and Gsc regulation of PP and anterior endoderm - it evokes the concept of cooperative gene regulation, which implies that these factors interact with each other biochemically in order to bind to the DNA. This is not supported by the data in this manuscript, and is especially confusing since Ripply1 is thought to require cooperative binding with a T-box family transcription factor to direct its binding to the DNA.

      (6) SWI/SNF: The differential expression of srcap doesn't seem very remarkable. The dot plots in the supplement S7H don't help - they seem to show no expression at all in the endoderm, which is clearly a distortion of the data, since from the violin plots it's obviously expressed and the dot-size scale only ranges from ~30-38%. Please add to the figure information about fold-change and p-value for the differential expression. Publicly available scRNAseq databases show srcap is expressed throughout the entire early embryo, suggesting that it would be surprising for it to have differential activity in these two cell types and thereby contribute to their separate specification during development. It seems equally possible that this just mildly influences the level of Nodal or FGF signaling, which would create this effect.

      The multiome data seems like a valuable data set for researchers interested in this stage of zebrafish development. However, the presentation of the data doesn't make many conclusions, aside from identifying an element adjacent to ripply1 whose chromatin is open in prechordal plate cells and not endodermal cells and showing that there are a number of loci with differential accessibility between these cell types. That seems fairly expected since both cell types have several differentially expressed transcriptional regulators (for instance, ripply1 has previously been demonstrated in multiple studies to be specific to the prechordal plate during blastula stages). The manuscript implies that SWI/SNF remodeling by Srcap is responsible for the chromatin accessibility differences between these cell types, but that has not actually been tested. It seems more likely that the differences in chromatin accessibility observed are a result of transcription factors binding downstream of Nodal signaling.

      Minor issues:

      Figure 2 E-F: It's not clear which cells from E are quantitated in F. For instance, the dorsal forerunner cells (DFCs) are likely to behave very differently from other endodermal progenitors in this assay. It would be helpful to indicate which cells are analyzed in F with an outline or other indicator of some kind. Alternatively, if both DFCs and endodermal cells are included in F, different colors for their points would help indicate whether their fluorescence changes differently.

      Fig 3 J: Should the reference be Dubrulle et al 2015, rather than Julien et al?

      References:<br /> Alexander, J. & Stainier, D. Y. A molecular pathway leading to endoderm formation in zebrafish. Current biology : CB 9, 1147-1157 (1999).<br /> Barone, V. et al. An Effective Feedback Loop between Cell-Cell Contact Duration and Morphogen Signaling Determines Cell Fate. Dev. Cell 43, 198-211.e12 (2017).<br /> Economou, A. D., Guglielmi, L., East, P. & Hill, C. S. Nodal signaling establishes a competency window for stochastic cell fate switching. Dev. Cell 57, 2604-2622.e5 (2022).<br /> Gritsman, K. et al. The EGF-CFC protein one-eyed pinhead is essential for nodal signaling. Cell 97, 121-132 (1999).<br /> Gritsman, K., Talbot, W. S. & Schier, A. F. Nodal signaling patterns the organizer. Development (Cambridge, England) 127, 921-932 (2000).<br /> Kawamura, A. et al. Groucho-associated transcriptional repressor ripply1 is required for proper transition from the presomitic mesoderm to somites. Developmental cell 9, 735-744 (2005).<br /> Kawamura, A., Koshida, S. & Takada, S. Activator-to-repressor conversion of T-box transcription factors by the Ripply family of Groucho/TLE-associated mediators. Molecular and cellular biology 28, 3236-3244 (2008).<br /> Sako, K. et al. Optogenetic Control of Nodal Signaling Reveals a Temporal Pattern of Nodal Signaling Regulating Cell Fate Specification during Gastrulation. Cell Rep. 16, 866-877 (2016).<br /> Rogers, K. W. et al. Nodal patterning without Lefty inhibitory feedback is functional but fragile. eLife 6, e28785 (2017).<br /> Warga, R. M. & Nüsslein-Volhard, C. Origin and development of the zebrafish endoderm. Development 126, 827-838 (1999).

    1. eLife assessment

      This fundamental work advances our understanding of the regulation of corneal stem cell fate and differentiation, identifying Sox9 as a player in this process. The evidence supporting the conclusions is compelling, with rigorous genomic experiments and genetic mouse models that are state-of-the-art in the field. The work will be of broad interest to developmental, stem cell, and transcriptional biologists.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors wanted to identify genes that are critical for regulating the asymmetric fates of limbal stem cells and their transit amplified progeny in the central cornea. To this end, they utilized an in vivo cell cycle reporter to isolate proliferating basal cells from the anterior ocular surface epithelium and performed single-cell RNA-seq. This strategy revealed distinct basal cell identities with unique expression profiles of structural genes and transcription factors. The authors then focused on the Sox9 transcription factor implicated in stem cell regulation. It was differentially expressed between limbal stem cells and their progeny in the central cornea. Lineage tracing analysis confirmed that Sox9 marks long-lived limbal stem cells. Conditional deletion of Sox9 led to abnormal differentiation and squamous metaplasia in the central cornea. The authors suggest that Sox9 is required for the switch to asymmetric fate and commitment toward differentiation, as transit cells exit the limbal niche. By inhibiting the terminal differentiation of corneal progenitors and forcing them into continuous symmetric divisions, the Sox9 loss-of-function phenotype was replicated.

      Strengths:

      Thus, the paper shows the important role of Sox9 in the spatial regulation of asymmetric fate in the corneal epithelium and its proliferation and cell differentiation. The work is elegantly done using several models that converge on the main conclusions. It is very novel and delineates a new player in determining corneal epithelial cell fate. The experiments are well done, and the data are credible.

      Weaknesses:

      This reviewer has some minor concerns mostly related to data interpretation and the use of the LSC term.

    3. Reviewer #2 (Public Review):

      Summary, strengths, and weaknesses:

      This article by Rice et al focuses on the study of limbal epithelial stem cells (LESCs). To obtain high resolution of stem/progenitor cell populations by single-cell RNA sequencing (scRNA-seq), the authors enriched the stem/progenitors by capturing GFP+ epithelial cells undergoing mitosis using the Cyclin-B-GFP transgene. The key novelty in the paper is that they identified Sox9 as a new LSC gene that is important for the health of the cornea. They show that Sox9 is expressed by LSCs at the mRNA and protein level, and that the Sox9-GFP transgene is useful to identify LSCs in live animals. They performed lineage tracing of Sox9-Cre mice that clearly demonstrated that Sox9+ cells are true LSCs. Further, Sox9-knockouts display severe opacification of the cornea, accompanied by the transformation of the central cornea into hyperplastic epidermis - a phenotype similar to the previously described Notch1 conditional knockout (cKO). By studying the lifetimes of Notch dominant-negative transgenic individual basal corneal epithelial cells, they suggest that the thickening of the central zone in the Notch model is due to the loss of asymmetric division, and from that, they infer the phenotype of the Sox9-KO. In other words, it is suggested that the increased rate and type of division in the central Sox9-null cornea produce this dramatic epithelial thickening phenotype. This claim makes sense but suggests a role for Notch, not Sox9, and this role is relevant for the central corneal epithelial cells, not for LESCs. So I suggest considering revising the conclusions (title of the paper) on this aspect. This would not affect the impact of the paper.

      In general, I believe that this is a very interesting and novel study that will be of interest to a broad readership. The methodology and study design are robust and elegant.

      However, in some cases, typos, text modifications, additional controls, and new experiments are suggested.

    1. eLife assessment

      This manuscript addresses a mechanism by which dopamine (DA) regulates synaptic plasticity. The authors build upon their previous finding that DA applied after a timing pattern that ordinarily induces long-term depression (LTD) now induces long-term potentiation (LTP). The new findings that this "DA-dependent LTP" involves de novo protein synthesis, a cyclicAMP signalling pathway, and calcium-permeable AMPA receptors (CP-AMPARs) are of valuable significance. The conclusions are convincing and largely supported by the evidence provided.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Fuchsberger et al. present a set of experiments that ultimately identifies the de novo synthesis of GluA1-, but not GluA2-containing Ca2+ permeable AMPA receptors as a key driver of dopamine-dependent LTP (DA-LTP) during conventional post-before-pre spike-timing dependent (t-LTD) induction. The authors further identify adenylate cyclase 1/8, cAMP, and PKA as the crucial mediators of these actions. While some comments have been identified below, the experiments presented are thorough and address the aims of the manuscript, figures are presented clearly (with minor comments), and experimental sample sizes and statistical analyses are suitable. Suitable controls have been utilized to confirm the role of Ca2+ permeable AMPAR. This work provides a valuable step forward built on convincing data toward understanding the underlying mechanisms of spike-timing-dependent plasticity and dopamine.

      Strengths:

      Appropriate controls were used.

      The flow of data presented is logical and easy to follow.

      The quality of the data, except for a few minor issues, is solid.

      Weaknesses:

      The drug treatment duration of anisomycin is longer than the standard 30-45 minute duration typically used in the field, and the concentration is higher (500 µM vs 40 µM). Given the toxicity of these kinds of drugs long term, it's unclear why the authors used such a long and intense drug treatment.

      With some of the normalizations (such as those in S1) there are dramatic differences in the baseline "untreated" puromycin intensities - raising some questions about the overall health of slices used in the experiments.

    3. Reviewer #2 (Public Review):

      Summary:

      The aim was to identify the mechanisms that underlie a form of long-term potentiation (LTP) that requires the activation of dopamine (DA).

      Strengths:

      The authors have provided multiple lines of evidence that support their conclusions; namely that this pathway involves the activation of a cAMP / PKA pathway that leads to the insertion of calcium-permeable AMPA receptors.

      Weaknesses:

      Some of the experiments could have been conducted in a more convincing manner.

    4. Reviewer #3 (Public Review):

      The manuscript of Fuchsberger et al. investigates the cellular mechanisms underlying dopamine-dependent long-term potentiation (DA-LTP) in mouse hippocampal CA1 neurons. The authors conducted a series of experiments to measure the effect of dopamine on the protein synthesis rate in hippocampal neurons and its role in enabling DA-LTP. The key results indicate that protein synthesis is increased in response to dopamine and neuronal activity in the pyramidal neurons of the CA1 hippocampal area, mediated via the activation of adenylate cyclases subtypes 1 and 8 (AC1/8) and the cAMP-dependent protein kinase (PKA) pathway. Additionally, the authors show that postsynaptic DA-induced increases in protein synthesis are required to express DA-LTP, while not required for conventional t-LTP.

      The increased expression of the newly synthesized GluA1 receptor subunit in response to DA supports the formation of homomeric calcium-permeable AMPA receptors (CP-AMPARs). This evidence aligns well with data showing that DA-LTP expression requires the GluA1 AMPA subunit and CP-AMPARs, as DA-LTP is absent in the hippocampus of a GluA1 genetic knock-out mouse model. Overall, the study is solid, and the evidence provided is compelling. The authors clearly and concisely explain the research objectives, methodologies, and findings. The study is scientifically robust, and the writing is engaging. The authors' conclusions and interpretation of the results are insightful and align well with the literature. The discussion effectively places the findings in a meaningful context, highlighting a possible mechanism for dopamine's role in the modulation of protein-synthesis-dependent hippocampal synaptic plasticity and its implications for the field. Although the study expands on previous works from the same laboratory, the findings are novel and provide valuable insights into the dynamics governing hippocampal synaptic plasticity.

      The claim that GluA1 homomeric CP-AMPA receptors mediate the expression of DA-LTP is fascinating, and although the electrophysiology data on GluA1 knock-out mice are convincing, more evidence is needed to support this hypothesis. Western blotting provides useful information on the expression level of GluA1, which is not necessarily associated with cell surface expression of GluA1 and therefore CP-AMPARs. Validating this hypothesis by localizing the protein using immunofluorescence and confocal microscopy detection could strengthen the claim. The authors should briefly discuss the limitations of the study.

      Additional comments to address:

      (1) In Figure 2A, the representative image with PMY alone shows a very weak PMY signal. Consequently, the image with TTX alone seems to potentiate the PMY signal, suggesting a counterintuitive increase in protein synthesis.

      (2) In Figures 3A-B, the Western blotting representative images have poor quality, especially regarding GluA1 and α-actin in Figure 3A. The quantification graph (Figure 3B) raises some concerns about a potential outlier in both the DA alone and DA+CHX groups. The authors should consider running a statistical test to detect outlier data. Full blot images, including ladder lines, should be added to the supplementary data.

    1. eLife assessment

      This valuable study investigates how hearing impairment affects neural encoding of speech, in particular the encoding of hierarchical linguistic information. The current analysis provides incomplete evidence that hearing impairment affects speech processing at multiple levels, since the novel analysis based on HM-LSTM needs further justification. The advantage of this method should also be further explained. The study can also benefit from building a stronger link between neural and behavioral data.

    2. Reviewer #1 (Public Review):

      The authors are attempting to use the internal workings of a language hierarchy model, comprising phonemes, syllables, words, phrases, and sentences, as regressors to predict EEG recorded during listening to speech. They also use standard acoustic features as regressors, such as the overall envelope and the envelopes in log-spaced frequency bands. This is valuable and timely research, including the attempt to show differences between normal-hearing and hearing-impaired people in these regards.

      I will start with a couple of broader questions/points, and then focus my comments on three aspects of this study: The HM-LSTM language model and its usage, the time windows of relevant EEG analysis, and the usage of ridge regression.

      Firstly, as far as I can tell, the OSF repository of code, data, and stimuli is not accessible without requesting access. This needs to be changed so that reviewers and anybody who wants or needs to can access these materials.

      What is the quantification of model fit? Does it mean that you generate predicted EEG time series from deconvolved TRFs, and then give the R2 coefficient of determination between the actual EEG and predicted EEG constructed from the convolution of TRFs and regressors? Whether or not this is exactly right, it should be made more explicit.
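
      To make the question concrete, the following is a minimal sketch of the two candidate definitions of "model fit" asked about above: the predicted EEG is formed by convolving a regressor with a TRF, and is then compared to the recorded EEG using either a Pearson correlation or the coefficient of determination. All function names and the toy values are hypothetical illustrations, not the authors' pipeline.

```python
import math

def convolve_causal(regressor, trf):
    """Predicted response: causal convolution of a regressor with a TRF."""
    n = len(regressor)
    predicted = [0.0] * n
    for t in range(n):
        for lag, weight in enumerate(trf):
            if t - lag >= 0:
                predicted[t] += weight * regressor[t - lag]
    return predicted

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical toy data: two onset impulses and a 3-sample TRF.
regressor = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
trf = [0.5, 1.0, 0.25]
predicted_eeg = convolve_causal(regressor, trf)
# Stand-in for a recorded channel: the prediction plus a small fixed offset pattern.
actual_eeg = [p + d for p, d in zip(predicted_eeg, [0.1, -0.1, 0.0, 0.1, -0.1, 0.0, 0.1, -0.1])]
fit_r = pearson_r(actual_eeg, predicted_eeg)
fit_r2 = r_squared(actual_eeg, predicted_eeg)
```

      Stating which of these two quantities is plotted, and over which samples it is computed, would remove the ambiguity.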

      About the HM-LSTM:

      • In the Methods paragraph about the HM-LSTM, a lot more detail is necessary to understand how you are using this model. Firstly, what do you mean that you "extended" it, and what was that procedure? And generally, this is the model that produces most of the "features", or regressors, whichever word we like, for the TRF deconvolution and EEG prediction, correct? A lot more detail is necessary then, about what form these regressors take, and some example plots of the regressors alongside the sentences.<br /> • Generally, it is necessary to know what these regressors look like compared to other similar language-related TRF and EEG/MEG prediction studies. Usually, in the case of e.g. Lalor lab papers or Simon lab papers, these regressors take the form of single-sample event markers, surrounded by zeros elsewhere. For example, a phoneme regressor might have a sample up at the onset of each phoneme, and a word onset regressor might have a sample up at the onset of each word, with zeros elsewhere in the regressor. A phoneme surprisal regressor might have a sample up at each phoneme onset, with the value of that sample corresponding to the rarity of that phoneme in common speech. Etc. Are these regressors like that? Or do they code for these 5 linguistic levels in some other way? Either way, much more description and plotting is necessary in order to compare the results here to others in the literature.<br /> • You say that the 5 regressors that are taken from the trained model's hidden layers do not have much correlation with each other. However, the highest correlations are between syllable and sentence (0.22), and syllable and word (0.17). It is necessary to give some reason and interpretation of these numbers. One would think the highest correlation might be between syllable and phoneme, but this one is almost zero. 
Why would the syllable and sentence regressors have such a relatively high correlation with each other, and what form do those regressors take such that this is the case?<br /> • If these regressors are something like the time series of zeros along with single sample event markers as described above, with the event marker samples indicating the onset of the relevant thing, then one would think e.g. the syllable regressor would be a subset of the phoneme regressor because the onset of every syllable is a phoneme. And the onset of every word is a syllable, etc.
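
      For comparison, the single-sample event-marker regressors described above can be sketched as follows. This is only an illustration of the convention used in other TRF studies, under the assumption of a fixed sampling rate; the onset times, the sampling rate, and the function names are hypothetical and are not taken from the manuscript.

```python
def onset_regressor(onsets_s, duration_s, fs, values=None):
    """Zeros everywhere except one sample at each onset.

    onsets_s   : onset times in seconds
    duration_s : length of the stimulus in seconds
    fs         : sampling rate in Hz
    values     : optional per-event amplitudes (e.g. surprisal); defaults to 1.0
    """
    n = int(round(duration_s * fs))
    regressor = [0.0] * n
    for i, onset in enumerate(onsets_s):
        idx = int(round(onset * fs))
        if 0 <= idx < n:
            regressor[idx] = 1.0 if values is None else values[i]
    return regressor

def event_samples(regressor):
    """Indices of the non-zero samples of a regressor."""
    return {i for i, v in enumerate(regressor) if v != 0.0}

# Hypothetical onsets (seconds) for a one-second utterance sampled at 100 Hz.
fs = 100
phoneme_onsets = [0.0, 0.08, 0.20, 0.31, 0.45]
syllable_onsets = [0.0, 0.20, 0.45]
word_onsets = [0.0, 0.45]

phoneme_reg = onset_regressor(phoneme_onsets, 1.0, fs)
syllable_reg = onset_regressor(syllable_onsets, 1.0, fs)
word_reg = onset_regressor(word_onsets, 1.0, fs)

# With this coding, each higher-level regressor marks a subset of the lower-level events.
syllables_nested = event_samples(syllable_reg) <= event_samples(phoneme_reg)
words_nested = event_samples(word_reg) <= event_samples(syllable_reg)
```

      Under this coding the regressors are nested by construction, which is why near-zero correlation between the phoneme and syllable levels would be unexpected if the HM-LSTM regressors took this form.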

      For the time windows of analysis:

      • I am very confused, because sometimes the times are relative to "sentence onset", which would mean the beginning of sentences, and sometimes they are relative to "sentence offset", which would mean the end of sentences. It seems to vary which is mentioned. Did you use sentence onsets, offsets, or both, and what is the motivation?<br /> • If you used onsets, then the results at negative times would not seem to mean anything, because that would be during silence unless the stimulus sentences were all back to back with no gaps, which would also make that difficult to interpret.<br /> • If you used offsets, then the results at positive times would not seem to mean anything, because that would be during silence after the sentence is done. Unless you want to interpret those as important brain activity after the stimuli are done, in which case a detailed discussion of this is warranted.<br /> • For the plots in the figures where the time windows and their regression outcomes are shown, it needs to be explicitly stated every time whether those time windows are relative to sentence onset, offset, or something else.<br /> • Whether the running correlations are relative to sentence onset or offset, the fact that you can have numbers outside of the time of the sentence (negative times for onset, or positive times for offset) is highly confusing. Why would the regressors have values outside of the sentence, meaning before or after the sentence/utterance? In order to get the running correlations, you presumably had the regressor convolved with the TRF/impulse response to get the predicted EEG first. In order to get running correlation values outside the sentence to correlate with the EEG, you would have to have regressor values at those time points, correct? 
How does this work?<br /> • In general, it seems arbitrary to choose sentence onset or offset, especially if the comparison is the correlation between predicted and actual EEG over the course of a sentence, with each regressor. What is going on with these correlations during the middle of the sentences, for example? In ridge regression TRF techniques for EEG/MEG, the relevant measure is often the overall correlation between the predicted and actual, calculated over a longer period of time, maybe the entire experiment. Here, you have calculated a running comparison between predicted and actual, and thus the time windows you choose to actually analyze can seem highly cherry-picked, because this means that most of the data is not actually analyzed.<br /> • In figures 5 and 6, some of the time window portions that are highlighted as significant between the two lines have the lines intersecting. This looks like, even though you have found that the two lines are significantly different during that period of time, the difference between those lines is not of a constant sign, even during that short period. For instance, in figure 5, for the syllable feature, the period of 0 - 200 ms is significantly different between the two populations, correct? But between 0 and 50, normal-hearing are higher, between 50 and 150, hearing-impaired are higher, and between 150 and 200, normal-hearing are higher again, correct? But somehow they still end up significantly different overall between 0 and 200 ms. More explanation of occurrences like these is needed.
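
      The contrast between a whole-recording correlation and a windowed ("running") correlation can be illustrated with a small sketch. The signals below are hypothetical and constructed so that the two measures disagree; the function names and window parameters are assumptions for illustration only.

```python
import math

def pearson_r(x, y):
    """Pearson correlation; returns NaN if either input has zero variance."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0.0 or vy == 0.0:
        return float("nan")
    return cov / math.sqrt(vx * vy)

def running_correlation(actual, predicted, window, step):
    """Correlation in successive windows; returns (start_index, r) pairs."""
    results = []
    for start in range(0, len(actual) - window + 1, step):
        stop = start + window
        results.append((start, pearson_r(actual[start:stop], predicted[start:stop])))
    return results

# Hypothetical signals: the prediction matches the first half and is inverted in the second.
actual = [0.0, 1.0, 0.0, -1.0] * 4
predicted = actual[:8] + [-v for v in actual[8:]]

overall_r = pearson_r(actual, predicted)
windowed_r = running_correlation(actual, predicted, window=8, step=8)
```

      In this toy case the windowed values are strongly positive and strongly negative while the overall correlation is zero, which is why the choice of analysis window needs to be stated and justified.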

      Using ridge regression:

      • What software package(s) and procedure(s) were specifically done to accomplish this? If this is ridge regression and not just ordinary least squares, then there was at least one non-zero regularization parameter in the process. What was it, how did it figure in the modeling and analysis, etc.?<br /> • It sounds like the regressors are the hidden layer activations, which you reduced from 2,048 to 150 non-acoustic, or linguistic, regressors, per linguistic level, correct? So you have 150 regressors, for each of 5 linguistic levels. These regressors collectively contribute to the deconvolution and EEG prediction from the resulting TRFs, correct? This sounds like a lot of overfitting. How much correlation is there from one of these 150 regressors to the next? Elsewhere, it sounds like you end up with only one regressor for each of the 5 linguistic levels. So these aspects need to be clarified.<br /> • For these regressors, you are comparing the "regression outcomes" for different conditions; "regression outcomes" are the R2 between predicted and actual EEG, which is the coefficient of determination, correct? If this is R2, how is it that you have some negative numbers in some of the plots? R2 should be only positive, between 0 and 1.
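
      To clarify what is being asked about the regularization parameter, here is a minimal sketch of ridge-regularized TRF estimation for a single regressor, using the closed form w = (XᵀX + λI)⁻¹Xᵀy on a time-lagged design matrix. It is a plain-Python illustration under simplifying assumptions (one regressor, one channel, no cross-validation); the names `lagged_design`, `ridge_trf`, and `lam` are hypothetical and do not describe the authors' software.

```python
def lagged_design(regressor, n_lags):
    """Rows are time points; column k holds the regressor delayed by k samples."""
    n = len(regressor)
    return [[regressor[t - k] if t - k >= 0 else 0.0 for k in range(n_lags)]
            for t in range(n)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ridge_trf(regressor, eeg, n_lags, lam):
    """TRF weights from the ridge closed form (X^T X + lam * I) w = X^T y."""
    X = lagged_design(regressor, n_lags)
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(n_lags)] for i in range(n_lags)]
    xty = [sum(row[i] * y for row, y in zip(X, eeg)) for i in range(n_lags)]
    return solve(xtx, xty)

# Hypothetical noise-free data generated from a known 3-sample TRF.
regressor = [1.0, 0.0, 0.0, 0.0] * 3
eeg = [0.5, 1.0, 0.25, 0.0] * 3

weights_ols = ridge_trf(regressor, eeg, n_lags=3, lam=0.0)    # ordinary least squares
weights_ridge = ridge_trf(regressor, eeg, n_lags=3, lam=3.0)  # shrunk toward zero
```

      With λ = 0 the estimate reduces to ordinary least squares; a non-zero λ shrinks the weights, so the reported results depend on how λ was chosen. With 150 regressors per linguistic level and multiple lags, the number of columns grows accordingly, which is the source of the overfitting concern.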

    3. Reviewer #2 (Public Review):

      This study compares neural responses to speech in normal-hearing and hearing-impaired listeners, investigating how different levels of the linguistic hierarchy are impacted across the two cohorts, both in a single-talker and multi-talker listening scenario. It finds that, while normal-hearing listeners have a comparable cortical encoding of speech-in-quiet and attended speech from a multi-talker mixture, participants with hearing impairment instead show a reduced cortical encoding of speech when it is presented in a competing listening scenario. When looking across the different levels of the speech processing hierarchy in the multi-talker condition, normal-hearing participants show a greater cortical encoding of the attended compared to the unattended stream in all speech processing layers - from acoustics to sentence-level information. Hearing-impaired listeners, on the other hand, only have increased cortical responses to the attended stream for the word and phrase levels, while all other levels do not differ between attended and unattended streams.<br /> The methods for modelling the hierarchy of speech features (HM-LSTM) and the relationship between brain responses and specific speech features (ridge-regression) are appropriate for the research question, with some caveats on the experimental procedure. This work offers an interesting insight into the neural encoding of multi-talker speech in listeners with hearing impairment, and it represents a useful contribution towards understanding speech perception in cocktail-party scenarios across different hearing abilities. 
While the conclusions are overall supported by the data, there are limitations and certain aspects that require further clarification.<br /> (1) In the multi-talker section of the experiment, participants were instructed to selectively attend to the male or the female talker, and to rate the intelligibility, but they did not have to perform any behavioural task (e.g., comprehension questions, word detection or repetition), which could have demonstrated at least an attempt to comply with the task instructions. As such, it is difficult to determine whether the lack of increased cortical encoding of Attended vs. Unattended speech across many speech features in hearing-impaired listeners is due to a different attentional strategy, which might be more oriented at "getting the gist" of the story (as the increased tracking of only word and phrase levels might suggest), or instead it is due to hearing-impaired listeners completely disengaging from the task and tuning back in for selected key-words or word combinations. Especially the lack of Attended vs. Unattended cortical benefit at the level of acoustics is puzzling and might indicate difficulties in performing the task. I think this caveat is important and should be highlighted in the Discussion section.<br /> (2) In the EEG recording and preprocessing section, you state that the EEG was filtered between 0.1Hz and 45Hz. Why did you choose this very broadband frequency range? In the literature, speech responses are robustly identified between 0.5Hz/1Hz and 8Hz. Would these results emerge using a narrower and lower frequency band? Considering the goal of your study, it might also be interesting to run your analysis pipeline on conventional frequency bands, such as Delta and Theta, since you are looking into the processing of information at different temporal scales.<br /> (3) A paragraph with more information on the HM-LSTM would be useful to understand the model used without relying on the Chung et al. (2017) paper. 
In particular, I think the updating mechanism of the model should be clarified. It would also be interesting to modify the updating factor of the model, along the lines of Schmitt et al. (2021), to assess whether a HM-LSTM with faster or slower updates can better describe the neural activity of hearing-impaired listeners. That is, perhaps the difference between hearing-impaired and normal-hearing participants lies in the temporal dynamics, and not necessarily in a completely different attentional strategy (or disengagement from the stimuli, as I mentioned above).<br /> (4) When explaining how you extracted phoneme information, you mention that "the inputs to the model were the vector representations of the phonemes". It is not clear to me whether you extracted specific phonetic features (e.g., "p" sound vs. "b" sound), or simply the phoneme onsets. Could you clarify this point in the text, please?

    4. Reviewer #3 (Public Review):

      Summary:<br /> The authors aimed to investigate how the brain processes different linguistic units (from phonemes to sentences) in challenging listening conditions, such as multi-talker environments, and how this processing differs between individuals with normal hearing and those with hearing impairments. Using a hierarchical language model and EEG data, they sought to understand the neural underpinnings of speech comprehension at various temporal scales and identify specific challenges that hearing-impaired listeners face in noisy settings.

      Strengths:<br /> Overall, the combination of computational modeling, detailed EEG analysis, and comprehensive experimental design thoroughly investigates the neural mechanisms underlying speech comprehension in complex auditory environments.

      The use of a hierarchical language model (HM-LSTM) offers a data-driven approach to dissect and analyze linguistic information at multiple temporal scales (phoneme, syllable, word, phrase, and sentence). This model allows for a comprehensive neural encoding examination of how different levels of linguistic processing are represented in the brain.

      The study includes both single-talker and multi-talker conditions, as well as participants with normal hearing and those with hearing impairments. This design provides a robust framework for comparing neural processing across different listening scenarios and groups.

      Weaknesses:<br /> The analyses heavily rely on one specific computational model, which limits the robustness of the findings. The use of a single DNN-based hierarchical model to represent linguistic information, while innovative, may not capture the full range of neural coding present in different populations. A low-accuracy regression model-fit does not necessarily indicate the absence of neural coding for a specific type of information. The DNN model represents information in a manner constrained by its architecture and training objectives, which might fit one population better than another without proving the non-existence of such information in the other group. To address this limitation, the authors should consider evaluating alternative models and methods. For example, directly using spectrograms, discrete phoneme/syllable/word coding as features, and performing feature-based temporal response function (TRF) analysis could serve as valuable baseline models. This approach would provide a more comprehensive evaluation of the neural encoding of linguistic information.

      It is not entirely clear if the DNN model used in this study effectively serves the authors' goal of capturing different linguistic information at various layers. Specifically, the results presented in Figure 3C are somewhat confusing. While the phonemes are labeled, the syllables, words, phrases, and sentences are not, making it difficult to interpret how the model distinguishes between these levels of linguistic information. The claim that "Hidden-layer activity for same-vowel sentences exhibited much more similar distributions at the phoneme and syllable levels compared to those at the word, phrase and sentence levels" is not convincingly supported by the provided visualizations. To strengthen their argument, the authors should use more quantified metrics to demonstrate that the model indeed captures phrase, word, syllable, and phoneme information at different layers. This is a crucial prerequisite for the subsequent analyses and claims about the hierarchical processing of linguistic information in the brain. Quantitative measures such as mutual information, clustering metrics, or decoding accuracy for each linguistic level could provide clearer evidence of the model's effectiveness in this regard.

      The formulation of the regression analysis is somewhat unclear. The choice of sentence offsets as the anchor point for the temporal analysis, and the focus on the [-100ms, +300ms] interval, needs further justification. Since EEG measures underlying neural activity in near real-time, it is expected that lower-level acoustic information, which is relatively transient, such as phonemes and syllables, would be distributed throughout the time course of the entire sentence. It is not evident if this limited time window effectively captures the neural responses to the entire sentence, especially for lower-level linguistic features. A more comprehensive analysis covering the entire time course of the sentence, or at least a longer temporal window, would provide a clearer understanding of how different linguistic units are processed over time. Additionally, explaining the rationale behind choosing this specific time window and how it aligns with the temporal dynamics of speech processing would enhance the clarity and validity of the regression analysis.

    1. eLife assessment

      The study has identified a cell type in muscle that is characterized as an adipogenic progenitor cell that is capable of promoting regeneration through the action of BDNF, a prominent growth factor regulated by GDNF in Schwann cells. These results represent an important cellular explanation for nerve regeneration. The revised analysis is solid but the work remains incomplete due to a lack of evidence that BDNF is produced during the process through the action of GDNF.

    1. eLife assessment

      The study provides potentially fundamental insight into the function and evolution of daily rhythms. The authors investigate the function of the putative core circadian clock gene Clock in the cnidarian Nematostella vectensis. While in parts still incomplete, the evidence suggests that, in contrast to mice and fruit flies, Clock in this species is necessary for daily rhythms under constant conditions, but not under a rhythmic light/dark cycle, suggesting that the major role of the circadian oscillator in this species could be a stabilizing function under non-rhythmic environmental conditions.

    2. eLife assessment

      This fundamental study for the first time defines genetically the role of the Clock gene in basal metazoa, using the cnidarian Nematostella vectensis. With convincing evidence, the study provides insight into the early evolution of circadian clocks. Clock in this species is necessary for daily rhythms under constant conditions, but not under a rhythmic light/dark cycle, suggesting that the major role of the circadian oscillator in this species could be a stabilizing function under non-rhythmic environmental conditions.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Public Review):

      We thank the reviewer for his careful reading, which enabled us to improve the quality of this manuscript. We have addressed some major criticisms, and in particular, we have now included the characterization of the impact of BMP2 on other lines as well as the study of the impact of reversion of the H3.3K27M mutation (Figure 3 - figure supplement 1C-D). This control, judiciously proposed by the reviewer, seems more relevant than using mutant H3.1K27M / ACVR1 lines, given the possibility of BMP2 action via other receptors.


      The following is the authors’ response to the original reviews.

      Reviewer #1

      Summary:

      Mutational analysis of diffuse midline glioma (DMG) found that ACVR1 mutations, which up-regulate the BMP signaling pathway are found in most H3.1K27M, but not H3.3K27M DMG cases. In this manuscript, Huchede et al attempted to determine whether the BMP signaling pathway has any role in H3.3K27M DMG tumors. They found that the BMP signaling is activated to a similar level in H3.3K27M DMG cells with wild-type ACVR1 compared to ACVR1 DMG cells, likely due to the expression of BMP7 or BMP2. They went on to test whether cells treated with BMP7 or BMP2 treatments affected the gene expression and cell fitness of tumor cells with H3.3K27M mutation. They concluded that BMP2/7 synergizes with H3.3K27M to induce a transcriptomic rewiring associated with a quiescent but invasive cell state. The major issue for this conclusion is that the authors did not use the right models/controls to obtain results to support this conclusion as detailed below. Therefore, in order to strengthen the conclusion, the authors need to address the major concerns below.

      Strength:

      This paper addresses an important question in the DMG field.

      Major concerns/weakness:

      (1) All the results in Fig. 2 utilized two glioma lines SF188 and Res259. The authors should repeat all these experiments in a couple of H3.3K27M DMG lines by deleting the H3.3K27M mutation first.

      We thank the referee for his/her comments that have helped us to strengthen our conclusions. Although we were rather interested in studying how the BMP pathway can participate in installing a particular cell state at the time of expression of the K27M mutation, we have now included the characterization of the native H3.3K27M BT245 and SU-DIPGXIII cell lines, and their counterparts in which the mutation was reverted by CRISPR-Cas9 (Harutyunyan et al., 2019). As shown in Figure 3-figure supplement D, the growth arrest induced by BMP2 indeed seems to be specific to the K27M epigenetic context, which could also be required to establish a positive regulatory loop that activates the BMP pathway, as mentioned in the Discussion.

      (2) Fig. 3. The experiments of BMP2 treatment should be repeated in other H3.3K27M DMG lines using H3.1K27M ACVR1 mutant tumor lines as controls.

      The use of mutant ACVR1 lines is interesting, but their status as controls seems questionable, as the addition of BMPs could add to the effect of the mutation, notably by activating other receptors in the pathway. We have, however, now included 3 different cell lines (HSJD-DIPG-014, BT245 and SU-DIPGXIII), and observed a similar impact of BMP2 with growth arrest as a readout (Figure 3-figure supplement C-D).

      Minor concerns

      Fig.2A. BMP2 expression increased in H3.3K27M SF188 cells. Therefore, the statement "whereas BMP2 and BMP4 expressions are not significantly modified (Figure 2A and Figure 2-figure supplement A-B)" is not accurate.

      The referee is absolutely right, and we have corrected this statement.

      Reviewer #2 (Public Review):

      The manuscript by Huchede et al investigates the BMP pathway in H3K27M-mutant gliomas carrying or not activating mutations in ALK2 (ACVR1). Their results in cell lines and in datasets acquired from the literature on patient tumors indicate that the BMP signaling pathway is activated at similar levels between ACVR1 wild-type and mutant tumors. The group further identifies BMP2 and BMP7 as possibly the main activators of the pathway in cells. They then show that BMP2 and 7 crosstalk with the H3 mutation and synergize to induce transcriptomic rewiring leading to an invasive cell state.

      The paper is well-written and easy to follow with a robust experimental plan and datasets supporting the claims. While previous work (acknowledged by the authors) indicated activation of BMP in H3K27M tumors that are wild type for ACVR1, this paper is a nice addition and provides further mechanistic cues as to the importance of the BMP pathway and specific members in these deadly brain cancers. The effect of these BMPs in quiescence and invasion is of particular interest.

      We thank the referee for his/her supportive comments.

      A few suggestions to clarify the message are provided below.

      (1) In thalamic diffuse midline gliomas, the BMP pathway should not be activated as it is in the pons. The authors should identify thalamic tumors in the datasets they explored and patient-derived cell lines from thalamic tumors available to investigate whether this pathway is active across all H3.3K27M mutants in the brain midline or specifically in tumors from the pons.

      The inter-patient variability observed in the level of activation of the BMP pathway may indeed be due, at least in part, to different tumor locations. However, we failed to find this information in the publicly available datasets that we used. We however included this element in the Discussion part.

      (2) There are ~20% H3.3K27M tumors that carry an ACVR1 mutation and similar numbers of H3.1K27M that are wild type for this gene. Can the authors identify these outliers in their datasets and assess the activation of BMP2 and 7 or other BMP pathway members in this context?

      We have now included the outliers present in our datasets in the legends of Figure 1B and Figure 1-figure supplement B and F. From the few samples available to document these outliers in the cohorts that we used, we have not observed major differences regarding the expression levels of BMP2/7 or BMP pathway members and have discussed the fact that it may result from the establishment in all cases of a feedback loop of activation.

      In all this is an interesting paper that provides meaningful data to pursue clinical targeting of the BMP pathway, which would be a nice addition to the field.

      We thank the reviewer for his/her supportive comments.

    1. eLife assessment

      This study provides valuable insights into the regulation of metabolic flux between glycolysis and respiration in yeast, particularly focusing on the role of inorganic phosphate. The authors propose a novel mechanism involving Ubp3/Ubp10 that potentially mitigates the Crabtree effect, offering substantial, solid evidence through a variety of well-designed assays. This study could reshape our understanding of metabolic regulation with broad biological contexts.

    2. Reviewer #2 (Public Review):

      Summary:

      Cells cultured in high glucose tend to repress mitochondrial biogenesis and activity, a prevailing phenotype called the Crabtree effect that is observed in different cell types and in cancer. Many signaling pathways have been put forward to explain this effect. Vengayil et al. proposed a new mechanism involving Ubp3/Ubp10 and phosphate that controls the glucose repression of mitochondria. The central hypothesis is that ∆ubp3 shifts glycolysis toward trehalose synthesis, thereby increasing Pi availability in the cytosol; mitochondria then receive more Pi, and the glucose repression is therefore reduced.

      Strengths:

      The strength is that the authors used an array of different assays to test their hypothesis. Most assays were well-designed and controlled.

      Weaknesses:

      The author addressed my major concerns.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The study by Vengayil et al. presented a role for Ubp3 for mediating inorganic phosphate (Pi) compartmentalization in cytosol and mitochondria, which regulates metabolic flux between cytosolic glycolysis and mitochondrial processes. Although the exact function of increased Pi in mitochondria is not investigated, findings have valuable implications for understanding the metabolic interplay between glycolysis and respiration under glucose-rich conditions. They showed that UBP3 KO cells regulated decreased glycolytic flux by reducing the key Pi-dependent-glycolytic enzyme abundances, consequently increasing Pi compartmentalization to mitochondria. Increased mitochondria Pi increases oxygen consumption and mitochondrial membrane potential, indicative of increased oxidative phosphorylation. In conclusion, the authors reported that the Pi utilization by cytosolic glycolytic enzymes is a key process for mitochondrial repression under glucose conditions.

      Comments on revised version:

      This reviewer appreciates the author's responses addressing some of the concerns.

      (1) However, the concern of reproducibility and experimental methods applied to the study is still valid, particularly considering that many conclusions were drawn from western blot analysis. The authors used separate gel loading controls for western blot analysis, which is not a valid method. Considering loading and other errors/discrepancies during the transfer phase of the assay, the direct control should be analyzing the membrane after transfer or using an internal control antibody on the same membrane. None of the western blots are indicated with marker sizes, and it isn't very clear how many repeats there are and whether those repeats are biological or technical repeats.

      We thank the reviewer for raising this concern. It requires detailed clarification on two key points: first, the use of Coomassie-stained gels rather than internal ‘housekeeping gene’ antibodies, and second, the challenges in performing controls for western blots in the case of high-abundance proteins such as glycolytic enzymes.

      (1) We have used a Coomassie-stained gel as the loading control for all our western blots. This is performed by cutting the gel in half, using one half for transfer followed by blotting and the other half for Coomassie staining. That is, these are not two separate gels that are loaded, but the same gel. Practically, this is no different from cutting a membrane to blot with different antibodies. This is of course a valid method for normalizing western blot data, and is used by multiple studies, for the reasons mentioned below. The historical use of a ‘housekeeping’ gene as a loading control for western blotting assumes that the levels of these proteins do not change under different conditions. However, this approach has multiple, severe limitations (since what counts as a ‘housekeeping gene’ is entirely contextual), and it is therefore correct to use total protein as a loading control. This is indeed recommended by multiple studies (Collins et al., 2015). Coomassie staining for total protein is far more reliable than using housekeeping genes as a loading control in western blots (Welinder and Ekblad, 2011). A notable example is GAPDH itself, which is widely used as a loading control in many studies. As is clear from our data in this manuscript, GAPDH levels themselves decrease in ubp3Δ cells. Had we used GAPDH as a loading control, we wouldn’t have identified the decrease in glycolytic enzymes in ubp3Δ cells, and this story would have met with a tragic fate very early on in its inception. We have in fact been very careful with these quantitations: even before samples are loaded on gels, they are first normalized using a standard protein estimation assay (Bradford), followed by normalized loading, followed by cutting the gel into two parts, one for Coomassie staining and protein normalization, and the other for the western blot for the respective proteins. However, in point (2) below, we clarify why we sometimes have to load a separate gel with normalized protein, which should resolve this point.
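      The total-protein normalization described above can be illustrated with a minimal sketch. It is not the authors' quantification pipeline: the function names and the band and lane intensities are hypothetical, and it only assumes that each band signal is divided by its lane's Coomassie total-protein signal (scaled to the mean across lanes) before fold changes are computed.

```python
def normalize_to_total_protein(band_signal, lane_total_protein):
    """Scale each band by its lane's total-protein signal relative to the mean lane signal.

    Both arguments are lists of densitometry values, one entry per lane.
    (Hypothetical helper for illustration only.)
    """
    if len(band_signal) != len(lane_total_protein):
        raise ValueError("need one total-protein value per lane")
    mean_total = sum(lane_total_protein) / len(lane_total_protein)
    return [b * mean_total / t for b, t in zip(band_signal, lane_total_protein)]


def fold_change(normalized, reference_index=0):
    """Express normalized band signals relative to a reference lane (e.g. WT)."""
    ref = normalized[reference_index]
    return [v / ref for v in normalized]


# Hypothetical intensities: lane 0 = WT, lane 1 = mutant.
band = [1000.0, 450.0]      # target-protein band from the blotted half of the gel
total = [2000.0, 1800.0]    # whole-lane Coomassie signal from the stained half

normalized = normalize_to_total_protein(band, total)
print(fold_change(normalized))
```

      Because the correction uses the whole-lane signal, it does not depend on any single reference protein staying constant across conditions, which is the property the response argues for.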

      (2) Glycolytic enzymes are highly abundant proteins and to achieve a signal in the linear range of western blot, the protein extracts have to be diluted (up to 25 or 50 times). As discussed under point 1, an internal control ‘housekeeping gene’ antibody is not a reliable method to use as loading control. Even if we want to use an antibody for an internal protein as a control, there are not many proteins that are as abundant as metabolic enzymes and because of this simple reason, the sample dilution results in these proteins not getting detected in the western blot since the signal will be below the limit of detection. This leaves using a separate gel loading control as the only easy to perform, reliable option.

      We would like to further highlight the fact that the changes in metabolic enzymes and ETC proteins that we observe in the ubp3 mutant by western blot were also independently observed in a large-scale untargeted quantitative proteomics study (Isasa et al., 2015), which we cite extensively in this manuscript. Since an entirely independent study, using a completely different (untargeted) method, has also shown very similar changes in the proteins that we observe (mitochondrial and glycolytic enzymes), there should be no room for doubt regarding the altered glycolytic enzyme and ETC protein levels that we discover in this study.

      None of the western blots are indicated with marker sizes

      We have clearly indicated the marker sizes in all our western blots. Separately, raw images of the blots and Coomassie-stained gels have been provided with the manuscript raw data, and are therefore easily available to any interested reader.

      It isn't very clear how many repeats there are and whether those repeats are biological or technical repeats.

      We have already clearly indicated the details of each blot in the figure legends. For example “A representative blot (out of three biological replicates, n=3) and their quantifications are shown. Data represent mean ± SD.” We kindly request the reviewer to thoroughly go through the figure legends for details regarding the western blots, or any other data. We hope this addresses all the reviewer concerns regarding the credibility of our western blot results and the method of using Coomassie stained gels as loading controls in this study.

      (2) The concern regarding citing the Ouyang et al. paper is still valid. This paper has essential implications for phosphate metabolism and is directly related to some of the findings associated with mitochondrial function, along with conflicting results, which should be discussed in the discussion section. As a reviewer, I do not generally request that authors cite any particular paper; however, considering some of the conflicting results here, citing and discussing the paper from Ouyang et al. will improve the interpretation/value of their findings.

      As mentioned in detail in our previous response letter, we do not believe that the study from Ouyang et al. presents ‘conflicting results’ of any kind. Nevertheless, in response to the reviewer's suggestion, we have revised the discussion section of our manuscript and added a few points that incorporate the insights from Ouyang et al. These are in the discussion section (“It is important to highlight that our experiments, whether involving Pi supplementation or Pi limitations, maintain the cellular Pi concentration within the millimolar range and are conducted within a short timeframe (~ 1 hour). This differs significantly from Pi starvation studies, where cells are subjected to prolonged and complete Pi deprivation, triggering extensive metabolic adjustments to sustain available Pi pools, such as an increase in mitochondrial membrane potential, independent of respiration”). We trust that this modification will enhance the interested readers' understanding of our study's overarching conclusions.

      Reviewer #2 (Public Review):

      Summary:

      Cells cultured in high glucose tend to repress mitochondrial biogenesis and activity, a prevailing phenotype called the Crabtree effect that is observed in different cell types and in cancer. Many signaling pathways have been put forward to explain this effect. Vengayil et al. proposed a new mechanism involving Ubp3/Ubp10 and phosphate that controls the glucose repression of mitochondria. The central hypothesis is that ∆ubp3 shifts glycolysis toward trehalose synthesis, thereby increasing Pi availability in the cytosol; mitochondria then receive more Pi, and the glucose repression is therefore reduced.

      Strengths:

      The strength is that the authors used an array of different assays to test their hypothesis. Most assays were well designed and controlled.

      Weaknesses:

      I think the main conclusions are not strongly supported by the current dataset. Here are my comments on authors' response and model.

      (1) The authors addressed some of my concerns related to ∆ubp3. But based on the results they observed and discussed, the ∆ubp3 redirect some glycolytic flux to gluconeogenesis while the 0.1% glucose in WT does not. Similarly, the shift of glycolysis to trehalose synthesis is also not relevant to the WT cells cultured in low glucose situation. This should be discussed in the manuscript to make sure readers are not misled to think ∆ubp3 mimic low glucose. It is likely that ∆ubp3 induce proteostasis stress, which is known to activate respiration and trehalose synthesis.

      But based on the results they observed and discussed, the ∆ubp3 redirect some glycolytic flux to gluconeogenesis while the 0.1% glucose in WT does not. Similarly, the shift of glycolysis to trehalose synthesis is also not relevant to the WT cells cultured in low glucose situation.

      We would like to clarify that we do not observe a redirection of glycolytic flux to gluconeogenesis in the ubp3 mutant. What we observe is a rewiring of glycolytic flux into increased trehalose synthesis and PPP, and decreased glycolysis. Also, the shift of glycolysis to trehalose synthesis is relevant to WT cells cultured in low glucose. It is a well-known fact that trehalose synthesis increases as media glucose decreases. In the case of 0.1% glucose, this increase in trehalose is not due to an increase in gluconeogenesis (since the pathways utilizing alternate carbon sources still remain repressed in 0.1% glucose (Yin et al., 2003)), but to the increase in glycolytic flux towards trehalose. This is also supported by the increase in Tps2 protein levels upon decreasing glucose concentration (Shen et al., 2023). We will also note that there are very few studies that actually estimate gluconeogenic flux in cells (and they only rely on steady-state measurements). Estimating gluconeogenic flux appropriately is challenging in itself (e.g., see Niphadkar et al 2024).

      In the case of glucose concentrations lower than 0.1%, the shift to trehalose synthesis might not be as relevant. We observe that the glycolysis-defective mutant tdh2tdh3 cells do not show an increase in trehalose synthesis (Figure 3-figure supplement 1E). However, in this context, the decrease in the rate of the GAPDH-catalyzed reaction alone appears to be sufficient to increase the Pi levels (Figure 3F) even without an increase in trehalose. Therefore, there might be differences in the relative contributions of these two arms towards Pi balance, based on whether it is low glucose in the environment, or a mutant such as ubp3Δ that modulates glycolytic flux. In ubp3Δ cells, the combination of a low rate of the GAPDH-catalyzed reaction and high trehalose will occur (based on how glycolytic flux is modulated), vs only the low rate of the GAPDH-catalyzed reaction in tdh2tdh3 cells. As an end point the increase in Pi happens in both cases, but via slightly differing routes. Also note: in terms of free Pi sources a low-glucose condition (with low glycolytic rate) is very different from a no-glucose, respiratory condition (where cells perform very high gluconeogenesis, at a rate that is an order of magnitude higher than in low glucose). In respiration-reliant conditions such as in ethanol, cells switch to high gluconeogenesis, where there is a large increase in trehalose synthesis as a default (e.g., see Varahan et al 2019). In this condition, trehalose synthesis could become a major source for Pi (e.g., see Gupta 2021). This could also support the increased mitochondrial respiration. In an ethanol-only medium, the directionality of the GAPDH reaction is itself reversed (i.e. G-1,3-BP → G-3-P). Therefore, this reaction now becomes an added source of Pi, instead of a net consumer of Pi (see illustration in Figure 3G).
Therefore, a very reasonable inference is that a combination of increased trehalose and increased 1,3 BPG to G3P conversion can become a Pi source, supporting increased mitochondrial respiration in a non-glucose, respiratory medium.

      We have now clarified these points in the discussion section in the updated version of our manuscript. Lines xxx. We hope that this updated discussion section satisfies the reviewer’s concern regarding how relevant the increase in trehalose synthesis is for altered Pi balance and increased mitochondrial respiration in WT cells.

      It is likely that ∆ubp3 induce proteostasis stress, which is known to activate respiration and trehalose synthesis.

      Apart from some general changes in metabolism, there are no reports whatsoever that suggest that general proteostasis stress can result in an extensive, precise metabolic rewiring, in which there is an increase in respiration, mitochondrial de-repression, a precise decrease in two limiting glycolytic enzyme levels, and a precise reduction in glycolytic flux, as observed in the ubp3 mutant. If this were the case, deletion of any deubiquitinase should result in an increase in trehalose and respiration, which clearly does not happen (as is already clear from the large screen shown in Figure 1).

      However, in response to this query, we performed experiments to assess the extent of proteostasis stress in ubp3 mutants. For this, we have now estimated the changes in global ubiquitination in WT vs ubp3 mutant cells, and compared this with conditions of moderate proteostasis stress (mild heat shock at 42 °C for ~1 hr). These data are now included in the revised manuscript as Figure 1-figure supplement 1J. Notably, our analysis reveals only a very minor alteration in global ubiquitination levels in ubp3 mutants compared to WT cells. This is in stark contrast to limited heat stress, where a clear increase in global ubiquitination can be easily observed. Given these data, we can conclude that there is no significant general proteostatic stress in ubp3 mutants that could induce substantial metabolic rewiring of such a precise nature.

      (2) Pi flux: it is known that vacuole can compensate the reduction of Pi in the cytosol. The paper they cited in the response, especially the Van Heerden et al., 2014 showed that the pulse addition of glucose caused transient Pi reduction and then it came back to normal level after 10min or so. If the authors mean the transient change of glycolysis and respiration, they should point that out clearly in the abstract and introduction. If the authors are trying to put out a general model, then the model must be reconsidered.

      In Van Heerden et al., the pulse addition of glucose causes a transient Pi reduction due to rapid Pi consumption in glycolysis. The phosphate levels return to normal because the glucose flux into trehalose synthesis releases free Pi. This is the entire crux of that study, and it is the reason why tps2 mutants, which cannot synthesize trehalose, exhibit a growth defect and have decreased Pi levels. As explained in detail in our earlier response, the cellular Pi levels are maintained by a relative balance of reactions that consume and release Pi, and therefore a change in this balance can change Pi as well. Indeed, if this were not the case, the tps2 mutants would simply maintain Pi levels similar to those of WT cells by increasing Pi transport from the medium, which is clearly not the case (e.g., see Gupta 2021).

      The cytosol has ~50mM Pi (van Eunen et al., 2010 FEBSJ), while only 1-2mM of glycolysis metabolites, not sure why partial reduction of several glycolysis enzymes will cause significant changes in cytosolic Pi level and make Pi the limiting factor for mitochondrial respiration. In response to this comment, the authors explained the metabolic flux that the rapid, continuous glycolysis will drain the Pi pool even each glycolytic metabolite is only 1-2mM. However, the metabolic flux both consume and release Pi, that's why there is such measurement of overall free Pi concentration amid the active metabolism. One possibility is that the observed cytosolic Pi level changes was caused by the measurement fluctuation.

      The measurement fluctuations that we mentioned in our previous response letter were in the case of cells grown in high and low glucose, where there are multiple factors, such as mitochondrial amount, that complicate the Pi measurements. In the case of ubp3 mutants, which have a similar amount of total mitochondria to WT cells, there is minimal fluctuation in the Pi measurement. We have done extensive standardization of mitochondrial isolation and Pi measurement in the isolated mitochondria (as explained in detail in the manuscript) to minimize any such fluctuations.

      However, the metabolic flux both consume and release Pi, that's why there is such measurement of overall free Pi concentration amid the active metabolism

      The reviewer is correct in pointing out that metabolic flux both consumes and releases Pi. However, in glucose-grown yeast cells, the rate of glycolysis, which is a Pi-consuming pathway, is higher than that of any other metabolic pathway. In fact, the glycolytic rate in glucose-grown S. cerevisiae is one of the highest ever observed in any living system. A decrease in glycolysis and an increase in trehalose therefore shifts the balance in Pi utilization and results in increased free Pi in ubp3 cells. For a more detailed theoretical reasoning on the consumption and production of Pi, see Gupta 2021.

      Importantly, the authors measured Pi inside mito for ethanol and glucose, but not the cytosolic Pi, which is the key hypothesis in their model. The model here is that the glycolysis competes with mito for free cytosolic Pi, so it needs to inhibit glycolysis to free up cytosolic Pi for mitochondrial import to increase respiration. I don't see measurement of cytosolic Pi upon different conditions, only the total Pi or mito Pi. The fact is that in Fig.3C they saw WT+Pi in the medium increase total free Pi more than the ∆ubc3, while WT decrease mito Pi compared to WT control and ∆ubc3 and therefore decrease basal OCR upon Pi supplement. A simple math of Pitotal = Pi cyto + Pi mito tells us that if WT has more Pitotal (Fig.3C) but less Pi mito (fig.5 supp 1C), then it has higher Pi cyto. This is contradictory to what the authors tried to rationalize. Furthermore, as I pointed out previously, the isolated mitochondria can import more Pi when supplemented, so if there is indeed higher Picyto, then the mito in WT should import more Pi. So, to address these contradictory points, the authors must measure Pi in the cytosol, which is a critical experiment not done for their model. For example, they hypothesized that adding 2-DG, or ∆ubp3, suppress glycolysis and thus increase the supply of cytosolic Pi for mito to import, but no cytosolic Pi was measured (need absolute value, not the relative fold changes). It is also important to specific how the experiments are done, was the measurement done shortly after adding 2-DG. Given that the cells response to glucose changes/pulses differently in transient vs stable state, the authors are encouraged to specify that.

      (1) Importantly, the authors measured Pi inside mito for ethanol and glucose, but not the cytosolic Pi, which is the key hypothesis in their model. The model here is that the glycolysis competes with mito for free cytosolic Pi, so it needs to inhibit glycolysis to free up cytosolic Pi for mitochondrial import to increase respiration. I don't see measurement of cytosolic Pi upon different conditions, only the total Pi or mito Pi.

      As clearly described in the manuscript, the key hypothesis that emerges is the role of the availability/accessibility of Pi for the mitochondria, in the context of activity. As discussed in detail in the discussion section, this can come from a combination of available Pi pools in the cytosol and increased transport of this Pi to the mitochondria. While it is true that the decreased glycolysis in ubp3 mutants frees up available Pi pools in the cytosol, measurement of cytosolic Pi in these mutants growing in log phase might not necessarily show an increased cytosolic Pi if the Pi is being actively transported to the mitochondria at a rate higher than in WT, as indicated by the ~6-fold increase in mitochondrial Pi in ubp3 cells. Resolving this would require tools such as intracellular fluorescence-based Pi sensors that could accurately capture temporal changes in cytosolic and mitochondrial Pi following glycolytic inhibition. However, such tools are not available to date for use in yeast, and measuring cytosolic Pi over time following glycolytic inhibition using colorimetric Pi assays is extremely difficult.

      However, the reviewer does correctly state that we had not included measurement of cytosolic Pi. Since the mitochondrial Pi estimate was itself a very challenging (and critical) experiment, we had originally thought that those data were sufficient. We have therefore now performed a series of new experiments, in which we first enriched the cytosolic fraction (without mitochondrial contamination), and then estimated cytosolic Pi amounts in WT and ubp3 cells. Our Pi measurements indicate a cytosolic Pi concentration in the range of ~35 mM, which is similar to the earlier reported values in yeast. We further observe that the cytosolic Pi is ~25% lower in ubp3 mutants (~25-27 mM) compared to WT cells (Figure 4B). As mentioned earlier, this would be consistent with higher transport of Pi from the cytosol to the mitochondria in these cells. Effectively, ubp3 cells have a total increase in cellular Pi, with a Pi pool distribution such that there is increased Pi availability in mitochondria (Figure 4B). This further substantiates the hypothesis of an increased Pi allocation to mitochondria in ubp3 mutants. The reason for the increased rate of Pi transport to mitochondria is not immediately clear, but could also come from changes in cytosolic pH, a possibility that we suggest in our discussion and that is discussed in a later section of this response letter as well.

      (2) The fact is that in Fig.3C they saw WT+Pi in the medium increase total free Pi more than the ∆ubc3, while WT decrease mito Pi compared to WT control and ∆ubc3 and therefore decrease basal OCR upon Pi supplement. A simple math of Pitotal = Pi cyto + Pi mito tells us that if WT has more Pitotal (Fig.3C) but less Pi mito (fig.5 supp 1C), then it has higher Pi cyto. This is contradictory to what the authors tried to rationalize. Furthermore, as I pointed out previously, the isolated mitochondria can import more Pi when supplemented, so if there is indeed higher Picyto, then the mito in WT should import more Pi.

      a) “The fact is that in Fig.3C they saw WT+Pi in the medium increase total free Pi more than the ∆ubc3, while WT decrease mito Pi compared to WT control and ∆ubc3 and therefore decrease basal OCR upon Pi supplement. A simple math of Pitotal = Pi cyto + Pi mito tells us that if WT has more Pitotal (Fig.3C) but less Pi mito (fig.5 supp 1C), then it has higher Pi cyto.”

      In WT cells supplemented with external Pi (WT+Pi), there is an increased total Pi, but a decreased mitochondrial Pi. As discussed in the discussion section of the manuscript, this could be due to the supplemented Pi not being transported to mitochondria. The reviewer is correct in pointing out that, by simple arithmetic, this should mean that the cytosolic Pi in WT+Pi is high. We have now assessed cytosolic Pi upon external Pi supplementation, and this is exactly what we observe in our cytosolic Pi measurements now included in the revised manuscript (Figure 5-figure supplement 5C). There is a higher cytosolic Pi in WT+Pi (~52 mM) compared to WT cells (~35 mM) and ubp3 cells (~27 mM). We have now pointed this out in the discussion section of the revised manuscript: “Notably, this increased respiration does not happen upon direct Pi supplementation to highly glycolytic WT cells, where the Pi accumulates in cytosol, without increasing mitochondrial Pi (Figure 5-figure supplement 1C).” We hope that these new data completely address the reviewer’s concern regarding the Pi allocations in the case of WT+Pi cells.

      b) This is contradictory to what the authors tried to rationalize. Furthermore, as I pointed out previously, the isolated mitochondria can import more Pi when supplemented, so if there is indeed higher Picyto, then the mito in WT should import more Pi.

      We would like to clarify that the Pi measurements in WT+Pi absolutely do not contradict our hypothesis. Furthermore, nowhere do we claim that an increase in cytosolic Pi will increase mitochondrial Pi. On the contrary, we explain in detail that supplementing Pi to WT cells (which increases cytosolic Pi) will not increase respiration if the increased Pi is not being transported to mitochondria. This is exactly what happens in WT+Pi, where Pi accumulates in the cytosol but does not result in increased mitochondrial Pi. The reviewer argues that if there is higher cytosolic Pi, mitochondria should import more Pi. This is true for transport via diffusion, where the external concentration dictates the direction of metabolite transport, but it is fundamentally wrong for metabolites whose transport involves active transporters and additional regulators. This is the entire basis of the idea of metabolic compartmentalisation, where cells maintain pools of metabolites in different organelles which regulate the cellular metabolic state. A well-studied example is pyruvate, whose cytosolic concentration is high in glycolytic cells, but whose transport to mitochondria is reduced during glycolysis to maintain cytosolic fermentation. As discussed in the manuscript, a logical explanation for Pi supplementation not increasing respiration and mitochondrial Pi is that there might be mechanisms in highly glycolytic cells that restrict the transport of Pi to mitochondria, thereby compartmentalizing Pi in the cytosol. One such possible mechanism is pH (discussed in a later section), and it is possible that other mechanisms are involved.

      In the case of isolated mitochondria, Pi supplementation results in increased respiration simply because it is an in vitro setup in which we supplement metabolites such as pyruvate, malate and ADP along with phosphate to ensure that the mitochondria are actively respiring; in this case Pi will be consumed since it is being used for ATP synthesis. This is entirely different from an in vivo scenario where cells are glycolytic, and mechanisms to prevent mitochondrial transport of metabolites such as pyruvate and phosphate are active.

      c) It is also important to specific how the experiments are done, was the measurement done shortly after adding 2-DG?

      Cells were treated with 2-DG for one hour and respiration was measured. We have mentioned these details clearly in the figure legends and methods.  

      d) The most likely model to me is that, which is also the consensus in the field, is that no matter 2-DG or ∆ubp3, the cells re-wiring metabolism in both cytosol and mitochondria, and it is the total network shift that cause the mitochondrial respiration increase, which requires the increase of mito import of Pi, ADP, O2, and substrates, but not caused/controlled by the Pi that singled out by the authors in their model.

      The aim of our study is only to highlight the importance of mitochondrial Pi availability as a critical factor in controlling mitochondrial respiration. Of course, this would require sufficient amounts of other factors such as ADP, substrates and oxygen. It cannot be otherwise. However, as we point out in the discussion, a major limiting factor might be Pi availability. While the altered glycolysis in ubp3 mutants might control the availability of other factors such as pyruvate and ADP, this is not the focus of our study. We would also like to point out that prior studies show that even though cytosolic ADP decreases in the presence of glucose, this does not limit mitochondrial ADP uptake, or decrease respiration, due to the very high affinity of the mitochondrial ADP transporter. This is discussed in our discussion section as well. Further, we show that the levels of ETC proteins can be altered by changing Pi levels, which places Pi as a major regulator of respiration. We would like to point out once again that studies in other systems have also highlighted a major role of mitochondrial Pi availability in controlling respiration. These references are included in our manuscript (Scheibye-Knudsen et al., 2009, Seifer et al., 2015). This includes a recent study in T cells that clearly shows increased mitochondrial respiration upon overexpressing the mitochondrial Pi transporter SLC25A3 alone (Wu et al., 2023). Our manuscript now in fact provides a contextual explanation of these diverse observations from other cellular systems where mitochondrial Pi transport appears to regulate respiration.

      (3) The explanation that cytosolic pH reduction upon glucose depletion/2DG is a mistake. There are a lot of data in the literature showing the opposite. If the authors do think this is true, then need to show the data. Again, it is important to distinguish transient vs stable state for pH changes.

      We observe that directly supplementing Pi to WT cells growing in high glucose does not result in higher mitochondrial Pi or increased respiration. However, supplementing Pi to WT cells increases mitochondrial respiration in the presence of the glycolytic inhibitor 2-DG. We therefore merely suggest that cytosolic pH could be an additional regulator of mitochondrial Pi transport, since this would be consistent with the differences in mitochondrial Pi transport between highly glycolytic cells and cells with decreased glycolysis (such as upon 2-DG addition or in the ubp3 mutant). This is because in mitochondria, Pi is co-transported along with protons. Therefore, changes in cytosolic pH (which change the proton gradient) will control mitochondrial Pi transport (Hamel et al., 2004). The glycolytic rate is itself a major factor that controls cytosolic pH. The cytosolic pH in highly glycolytic cells is maintained at ~7, and decreasing glycolysis results in cytosolic acidification (Orij et al., 2011). Therefore, under conditions of decreased glycolysis (such as loss of Ubp3), cytosolic pH becomes acidic. Since mitochondrial Pi transport depends on the proton gradient, a low cytosolic pH would favour mitochondrial Pi transport. Therefore, under conditions of decreased glycolysis (2-DG treatment, or loss of Ubp3), where cytosolic pH would be acidic, increasing cytosolic Pi might indirectly increase mitochondrial Pi transport, thereby leading to increased respiration. We certainly remain open to alternative interpretations. These are all exciting future directions, and this study provides a context in which to interpret them.

      The explanation that cytosolic pH reduction upon glucose depletion/2DG is a mistake.

      We have cited two independent studies which suggest that cytosolic pH decreases upon a decrease in glycolysis (Orij et al., 2011; Dechant et al., 2010). This control of cytosolic pH by the glycolytic rate has been extensively shown using glycolytic mutants, cells in low glucose and cells grown in the presence of glycolytic inhibitors. According to the reviewer, this is a mistake and

      there are a lot of data in the literature showing the opposite.

      In our literature review we did not come across any relevant studies that actually show the opposite. If the reviewer still thinks this is a mistake, the reviewer is welcome to include in the comments some of the relevant literature that clearly shows the opposite, with actual measurements of cytosolic pH. Additionally, the possible role of cytosolic pH in this context does not affect the conclusions of our study, and we only include this as a possibility in the discussion. Therefore, this is well beyond the scope of the experiments in our current study, and considering the extensive data from multiple studies showing that cytosolic pH decreases under low glycolysis, there is little value in including experiments to address the same point in this study. We leave this as a point for an interested reader to think about, and it certainly can nucleate new directions of future study.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Summary of the changes

      Changes in the manuscript were made to clarify some ambiguities raised by the reviewers and to improve the report following their recommendations. A summary of the main changes is listed below:

      - The title was changed to better reflect the results of this study.

      - Re-training the model on log-transformed FACS scores.

      - Testing the specificity of the FEPS to facial expression of pain within this experimental setup by comparing it to the activation maps obtained from the Warm stimulation condition.

      - Testing for sensitization/habituation of the behavioral measures (FACS scores and pain ratings).

      - Adding a section in the discussion to better address the limitations of this study and provide potential directions for future studies.

      Other changes target areas where the original manuscript may have been ambiguous or lacked precision. To address these concerns, additional details have been incorporated, and certain terms have been revised to ensure a more precise and transparent presentation of the information.

      Public Reviews:

      Reviewer #1 (Public Review):

      Picard et al. report a novel neural signature of facial expressions of pain. In other words, they provide evidence that a specific set of brain activations, as measured by means of functional magnetic resonance imaging (fMRI), can tell us when someone is expressing pain via a concerted activation of distinctive facial muscles. They demonstrate that this signature provides a better characterization of this pain behaviour when compared with other signatures of pain reported by past research. The Facial Expression of Pain Signature (FEPS) thus enriches this collection and, if further validated, may allow scientists to identify the neural structures subserving important non-verbal pain behaviour. I have, however, some reservations about the strength of the evidence, relating to insufficient characterization of the underlying processes involved.

      We are thankful for the summary of our work. We are hopeful that the modifications made in the latest version effectively address these concerns. The changes are outlined in the summary above, and detailed in the following point-by-point response.

      Strengths:

      The study relies on a robust machine-learning approach, able to capitalise on the multivariate nature of the fMRI data, an approach pioneered in the field of pain by one of the authors (Dr. Tor Wager). This paper extends Wager's and other colleagues' work attempting to identify specific combinations of brain structures subserving different aspects of the pain experience while examining the extent of similarity/dissimilarity with the other signatures. In doing so, the study provides further methodological insight into fine-grained network characterization that may inspire future work beyond this specific field.

      We are thankful for the positive comments.

      Weaknesses:

      The main weakness concerns the lack of a targeted experimental design aimed to dissect the shared variance explained by activations both specific to facial expressions and to pain reports. In particular, I believe that two elements would have significantly increased the robustness of the findings:

      (1) Control conditions for both the facial expressions and the sensory input. An efficient signature should not be predictive of neutral and emotional facial expressions (e.g., disgust) other than pain expressions, as well as it should not be predictive of sensations originating from innocuous warm stimulation or other unpleasant but non-painful stimulation.

      We do recognize the lack of specificity testing for the FEPS, especially towards negative emotional facial expressions. This would be relevant to test given the behavioural overlap between the facial expressions of pain and disgust, fear, anger, and sadness (Kunz et al., 2013; Williams, 2003). The experimental design used in this study did not include other negative states. However, we fully support the necessity of collecting data throughout those conditions, and we believe that the present study highlights the importance of such a demonstration. Future research should involve recording facial expressions while exposing participants to stimuli that elicit a range of negative emotions but, to our knowledge, such combination of fMRI and behavioural data is currently unavailable. As raised by the reviewer, this approach would allow us to assess the specificity of the FEPS to the facial expression evoked by pain compared to different affective states. We would like to emphasise that specificity and generalizability testing is a massive amount of work, requiring multiple studies to address comprehensively. A Limitations paragraph addressing this research direction has been added to the Discussion. A conclusion was added to the abstract as follows: “Future studies should explore other pain-relevant manifestations and assess the specificity of the FEPS against other types of aversive or emotional states.”

      (2) Graded intensity of the sensory stimulation: different intensities of the thermal stimulation would have caused a graded facial expression (from neutral to pain) and graded verbal reports (from no pain to strong pain), thus offering a sensitive characterisation of the signal associated with this condition (and the warm control condition).

      However, these conditions are missing from the current design, and therefore we cannot make a strong conclusion about the generalisability of the signature (regardless of whether it can predict better than other signatures - which may/may not suffer from similar or other methodological issues - another potential interesting scientific question!). The authors seem to work on the assumption that the trials where warm stimulation was delivered are of no use. I beg to disagree. As per my previous comment, warm trials (and associated neutral expressions) could be incorporated into the statistical model to increase the classification sensitivity and precision of the FEPS decoding.

      The experience of pain can fluctuate for a fixed intensity or after controlling statistically for the intensity of the stimulation (Woo et al., 2017). Consistent with this, the current study focused on spontaneous facial expression in response to noxious thermal stimuli delivered at a constant intensity that produced moderate to strong pain in every participant. As the reviewer points out, this does not allow us to characterise and compare the stimulus-response function of facial expression and pain ratings. The advantage of the approach adopted is to maximise the number of trials where facial expression is more likely to occur, while ensuring that changes in facial expression and pain ratings are not confounded with changes in stimulus intensity. The manuscript has been revised to clarify that point. However, we do agree that it would be interesting to conduct more studies focusing on facial expression in response to a range of stimulus intensities. This discussion has been added to the Limitations paragraph.

      Furthermore, following the reviewer’s suggestion, we performed complementary analyses on the warm trials in the proposed revisions. The dot product (FEPS scores) between the FEPS and the activation maps associated with the warm condition was computed. A linear mixed model was conducted to investigate the association between FEPS scores and the experimental condition (warm vs pain). The trials in the pain condition were divided into two conditions: null FACS scores (painful trials with no facial response; FACS scores = 0) and non-null FACS scores (painful trials with a facial response; FACS > 0). The details of this analysis have been added to the manuscript (see Response of the FEPS to pain and warm section in the Methods; lines 427 to 439) as well as the corresponding results (see Results and Discussion; lines 138 to 158). The FEPS scores were larger in the pain condition where a facial response was expressed, compared to both the pain condition without facial expression and the warm condition. These results confirmed the sensitivity of the FEPS to facial expression of pain.
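
      The pattern expression score described above is simply a dot product between the signature weights and a trial's activation map. The following minimal sketch shows that computation and the three-way split of trials; the weights, activation values and function names are hypothetical, and the linear mixed model itself is not reproduced here.

```python
# Illustrative sketch only: toy numbers, not data from the study.


def pattern_expression(weights, activation_map):
    """Dot product between a signature weight map and one activation map."""
    if len(weights) != len(activation_map):
        raise ValueError("weight map and activation map must have the same length")
    return sum(w * a for w, a in zip(weights, activation_map))


def condition_label(condition, facs_score):
    """Assign a trial to one of the three conditions compared in the analysis."""
    if condition == "warm":
        return "warm"
    return "pain_facs_nonnull" if facs_score > 0 else "pain_facs_null"


# Hypothetical signature weights over four voxels.
feps_weights = [0.5, -0.25, 0.0, 1.0]

# Hypothetical trials: (condition, FACS score, activation map).
trials = [
    ("warm", 0, [0.2, 0.4, 0.1, 0.0]),
    ("pain", 0, [0.4, 0.4, 0.3, 0.2]),
    ("pain", 3, [1.0, 0.0, 0.5, 1.5]),
]

scores_by_condition = {}
for condition, facs, activation in trials:
    label = condition_label(condition, facs)
    score = pattern_expression(feps_weights, activation)
    scores_by_condition.setdefault(label, []).append(score)

for label, scores in scores_by_condition.items():
    print(label, scores)
```

      In an actual analysis, the per-trial scores grouped in this way would then be entered into a mixed model with participant as a grouping factor.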

      Reviewer #2 (Public Review):

      Summary:

      The objective of this study was to further our understanding of the brain mechanisms associated with facial expressions of pain. To achieve this, participants' facial expressions and brain activity were recorded while they received noxious heat stimulation. The authors then used a decoding approach to predict facial expressions from functional magnetic resonance imaging (fMRI) data. They found a distinctive brain signature for pain facial expressions. This signature had minimal overlap with brain signatures reflecting other components of pain phenomenology, such as signatures reflecting subjective pain intensity or negative effects.

      We appreciate this concise and accurate summary of our study.

      Strength:

      The manuscript is clearly written. The authors used a rigorous approach involving multivariate brain decoding to predict the occurrence and intensity of pain facial expressions during noxious heat stimulation. The analyses seem solid and well-conducted. I think that this is an important study of fundamental and clinical relevance.

      Weaknesses:

      Despite those major strengths, I felt that the authors did not sufficiently explain their own interpretation of the significance of the findings. What does it mean, according to them, that the brain signature associated with facial expressions of pain shows a minimal overlap with other pain-related brain signatures?

      We express our sincere gratitude for the valuable insights and constructive comments on the strengths and weaknesses of the current study. We thank reviewer 2 for the encouragement to reinforce our interpretation of the significance of the findings, while acknowledging the limitations raised by the three reviewers.

      A few questions also arose during my reading.

      Question 1: Is the FEPS really specific to pain expressions? Is it possible that the signature includes a facial expression signal that would be shared with facial expressions of other emotions, especially since it involves socio-affective regulation processes? Perhaps this question should be discussed as a limit of the study?

      We acknowledge this limitation as outlined in response to Reviewer #1. We have incorporated a Limitations paragraph to provide a more in-depth discussion of this limitation and to explore potential future avenues (lines 225 to 268). Again, please note that the demonstration of specificity is an incremental process that requires a systematic comparison with other conditions where facial expressions are produced without pain. A concluding sentence was added to the abstract to encourage specificity testing in future studies, as indicated above.

      Question 2: All AUs are combined together in a composite score for the regression. Given that the authors have other work showing that different AUs may be associated with different components of pain (affective vs. sensory), is it possible that combining all AUs together has decreased the correlation with other pain signatures? Or that the FEPS actually reflects multiple independent signatures?

      The question raised is consistent with the work of Kunz, Lautenbacher, LeBlanc and Rainville (2012), and Kunz, Chen and Rainville (2020). In the current study, the pain-relevant action units were combined in order to increase the number of trials where a facial response to pain was expressed, thus enhancing the robustness of our analyses. Given the limited sample size, our current dataset is unfortunately insufficient to perform such analysis as there would not be enough trials to look at the action units separately or in subgroups. While the approach of combining the different AUs has proven to be valid and useful, we recognize the value of investigating potential independent signatures associated with the different AUs within the FEPS, and examining whether those signatures can lead to more similar patterns compared to previously developed pain signatures. This discussion has been included in the Limitations paragraph in the Discussion (lines 225 to 268).

      Question 3: Is facial expressivity constant throughout the experiment? Is it possible that the expressivity changes between the beginning and the end of the experiment? For instance, if there is a habituation, or if the participant is less surprised by the pain, or in contrast if they get tired by the end of the experiment and do not inhibit their expression as much as they did at the beginning. If facial expressivity changes, this could perhaps affect the correlation with the pain ratings and/or with the brain signatures; perhaps time (trial number) could be added as one of the variables in the model to address this question.

      The concern raised by the reviewer is legitimate. We conducted a mixed-effects model to assess the impact of successive trials and runs on facial expressivity. Results indicate that the FACS scores did not change significantly throughout the experiment, suggesting no notable effect of habituation or sensitization on the facial expressivity in our study. Details about the analysis and the results have been added to the Facial Expression section in the Methods (lines 335 to 346).

      Reviewer #3 (Public Review):

      In this manuscript, Picard et al. propose a Facial Expression Pain Signature (FEPS) as a distinctive marker of pain processing in the brain. Specifically, they attempt to use functional magnetic resonance imaging (fMRI) data to predict facial expressions associated with painful heat stimulation. The main strengths of the manuscript are that it is built on an extensive foundation of work from the research group, and that experience can be observed in the analysis of fMRI data and the development of the machine learning model. Additionally, it provides a comparative account of the similarities of the FEPS with other proposed pain signatures. The main weaknesses of the manuscript are the absence of a proper control condition to assess the specificity of the facial pain expressions, a few relevant omissions in the methodology regarding the original analysis of the data and its purpose, and a biased interpretation of the results.

      I believe that the authors partially succeed in their aims, as described in the introduction, which are to assess the association between pain facial expression and existing pain-relevant brain signatures, and to develop a predictive brain activation model of the facial responses to painful thermal stimulation. However, I believe that there is a clear difference between those aims and the claim of the title, and that the interpretation of the results needs to be more rigorous.

      We wish to express our appreciation for the insightful and constructive critique provided. The limitation pertaining to the absence of specificity testing has been addressed in response to Reviewer #1, and it has been incorporated into the manuscript (lines 251 to 258).

      The commentary made by Reviewer #3 has drawn our attention to a critical concern, namely the potential misalignment between the study findings and our original title. Consequently, we have changed the title to “A distributed brain response predicting the facial expression of acute nociceptive pain”. We also revised the interpretation of the results in the discussion section and we have added a section on limitations.

      Recommendations for the Authors:

      Reviewer #1 (Recommendations For The Authors):

      I hope the following comments will be useful to improve the manuscript.

      Abstract

      I felt the abstract could be more clear in terms of experimental or scientific questions, hypotheses/expectations, and findings. I also feel the abstract should briefly support the conclusive claim ("is better than...": how better? Or according to what criterion? This may be more relevant than the final conclusive general sentence that does not specifically address the significance of the findings).

      The abstract was revised to reinforce the functional perspective adopted to interpret brain activity produced by noxious stimuli and predicting various pain-relevant manifestations. We also mention explicitly the other pain-relevant signatures against which the FEPS is compared in this report, and we added a concluding sentence highlighting the importance of assessing the specificity of the FEPS in future studies.

      Introduction - background and rationale

      I would postpone the discussion around pain signature and anticipate the one about the brain mechanisms of facial expressions of pain. This will allow you to reinforce the logical flow of rationale, literature gap/question, why the problem is important, and study aims. Only then go for a review of relevant literature on signatures before providing a more specific final paragraph about the study-specific questions, expectations, and implementation. At the moment this is limited to a single very descriptive short paragraph at the end of the intro.

      The introduction was structured to guide the readers through a comprehensive understanding of different pain neurosignatures. The introduction aimed to establish a robust rationale for the subsequent analyses detailed in the results section. Indeed, the presentation of that literature ensured that the discussion around pain signatures is contextualised within a broader continuous framework. We acknowledge the reviewer’s comment on the limited description of the brain mechanisms of facial expression of pain. However, this was addressed in several previous reports of our laboratory (Kunz et al. 2011; Vachon-Presseau et al. 2016; Kunz, Chen, and Rainville 2020). We have added some more details about the brain mechanisms of facial expression, and highlighted those references in the first paragraph of the introduction.

      Methods and Results

      (1) Was there any indication of power based on the previous work or the other signature papers? If yes, how that would inform the present analysis?

      The NPS was trained on 20 participants that experienced 12 trials at each of four different intensities. The assessment of the effect sizes was performed on the Neurological Pain Signature in Han et al. (2022). That study revealed a moderate effect size for predicting between-subject pain reports, and a large one for predicting within-subject pain reports. We trained our model on 34 participants that underwent 16 trials. We expected our results to show a smaller effect size as the current experimental design only allowed us to examine spontaneous changes in the facial expression, as noted in the comments made by Reviewer #1. However, the best way to calculate the unbiased effect size of the results presented in the current study would be to test the unchanged model on new independent datasets (see Reddan, Lindquist, and Wager, 2017). Unfortunately, such datasets do not currently exist.

      (2) I would clarify to the reader what is meant by normal range of thermal pain and why is this relevant. Also, I did not find data about this assessment nor about the assessment of facial expressiveness (or reference to where it can be found).

      We changed this formulation to “All participants included in this study had normal thermal pain sensitivity” and we added a few references. By targeting a healthy population with normal thermal pain sensitivity, our study sought to identify a predictive brain pattern related to facial expression evoked by typical responses to pain that could eventually be generalised to other individuals from the same population. Details about the assessment of facial expressiveness have been added in the appropriate section in the Methods.

      (3) That pain ratings are only weakly associated with facial responses is, in its own right, an interesting finding, as a naïve reader would expect the two to be highly positively correlated. I'd suggest discussing this aspect (in reference to previous research) as it is interesting on both theoretical and empirical grounds.

      The likelihood and the strength of pain facial expression generally increase with pain ratings in response to acute noxious stimuli of increasing physical intensities, thereby leading to a positive association between the two responses that is driven by the stimulus. However, the poor correlation or the dissociation between facial pain expression and pain rating is a very well known phenomenon that can be demonstrated easily using experimental methods where the stimulus intensity is held constant and spontaneous fluctuations are observed in both facial expression and pain ratings. This result was not discussed in the current manuscript as it was already addressed in the work of Kunz et al. (2011) and Kunz, Karos and Vervoot (2018). We added the references to these studies in the revised manuscript (lines 330 to 334).

      (4) It may be worth having CIs throughout the whole set of analyses.

      Thank you for the suggestion; this was an oversight. The confidence intervals have been added in the manuscript where applicable.

      (5) I would clarify if there are two measures of the brain signature: dot-product and activation map. Relatedly, I cannot find where the authors explained what "FEPS pattern expression scores". Can the authors please clarify?

      The clarification has been added in the manuscript (lines 413 to 414).

      (6) There seems to be the assumption that the relationship between pain-relevant brain signatures and facial expressions of pain would be parametric and linear. However, this might not hold true. Did the authors test these assumptions?

      We indeed decided to use a linear regression technique (i.e., LASSO regression) to model the association between brain activity and the facial expression of pain. The choice of algorithm was mainly based on the simplicity and interpretability of that approach, and on our limited number of observations. The choice was also consistent with previous studies in the domain (e.g. Wager et al., 2011; Wager et al., 2013; Krishnan et al. 2016; Woo et al., 2017). Using a linear model, we were able to predict the facial expression evoked by pain from the fMRI activation at above-chance level. However, it is legitimate to think that more complex nonlinear models could better capture the brain patterns predictive of that behavioural manifestation of pain.
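
      As a rough illustration of why an L1-penalised linear model is easy to interpret, the following minimal sketch fits a LASSO by coordinate descent on toy data. It is not the pipeline used in the study: the function names, the toy design matrix, the penalty value and the absence of an intercept (the data are assumed to be centred) are all assumptions made for illustration.

```python
# Illustrative sketch only: a tiny LASSO solver on toy data, not the study's model.
# Objective minimised: (1 / (2n)) * sum((y - Xw)^2) + alpha * sum(|w|)


def soft_threshold(value, threshold):
    """Shrink a value towards zero by `threshold`, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Cyclic coordinate descent for the LASSO (no intercept, centred data assumed)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            w[j] = soft_threshold(rho, alpha) / z if z > 0 else 0.0
    return w


# Hypothetical design: three orthogonal "voxel" features, four "trials".
X = [
    [1.0, 1.0, 1.0],
    [1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
# The toy outcome depends on the first feature only.
y = [2.0, 2.0, -2.0, -2.0]

weights = lasso_coordinate_descent(X, y, alpha=0.1)
print(weights)
```

      In this toy case the penalty shrinks the weight of the informative feature slightly and sets the two uninformative weights to exactly zero, which is the sparsity property that makes the resulting weight map readable.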

      (7) Did the authors assess whether the FACS were better to be transformed/normalised? More generally, I would report any data assessment/transformation that has not been reported.

      Thank you for this highly relevant suggestion. FACS scores were indeed not normally distributed, and the analyses were conducted again to predict the log-transformed FACS scores. This transformation was effective in normalizing the distribution (skewness = 0.75, kurtosis = -0.84). The predictive model was confirmed on the transformed data.
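
      The effect of such a transformation can be checked with a few lines of code. The sketch below is illustrative only: the scores are invented, and the use of log(1 + x) to accommodate zero scores is an assumption, since the response does not state the exact form of the log transformation.

```python
import math


def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Population skewness: third central moment over variance^(3/2)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Population excess kurtosis: fourth central moment over variance^2, minus 3."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical right-skewed composite scores (many low values, one extreme value).
raw_scores = [0, 0, 1, 1, 2, 3, 5, 8, 13, 40]

# Assumption: log(1 + x) so that zero scores remain defined.
log_scores = [math.log1p(s) for s in raw_scores]

print("skewness raw:", round(skewness(raw_scores), 2))
print("skewness log:", round(skewness(log_scores), 2))
print("excess kurtosis log:", round(excess_kurtosis(log_scores), 2))
```

      On data of this shape the transformed scores are markedly less skewed than the raw scores, which is the property the reported skewness and kurtosis values are meant to document.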

      (8) Page 12: I am not clear on whether all the signatures are included in the same model (like a multiple regression) or if separate regressions are calculated per signature. The authors seem to imply that several regressions have been computed (possibly one per comparison with each signature?).

      The correlation between the FACS scores and the pain-related signatures was computed separately for each signature. This information has been clarified.

      (9) MVPA: See my main comment about warm trials and experimental/statistical design. For example, the LASSO regression model for the pain trials could be compared with a model using warm trials besides (or instead of) the unfitted model. Otherwise, add the warm trials as another predictor or within the subject level in a dummy fixed factor comprising pain and warm trials.

      The inclusion of warm trials in the model training would be inconsistent with the goal of the main analysis, which is to predict the facial expression of pain when a noxious pain stimulus is presented. Secondary analyses were conducted to compare the response of the FEPS to warm trials with its response to noxious pain trials. The dot product between the FEPS and the activation maps (FEPS scores) associated with the warm condition was computed. A linear mixed model was fitted to investigate the association between FEPS scores and the experimental condition (warm vs pain). Additional contrasts compared the warm trials with the pain trials with and without pain facial expression. The details of this analysis have been added to the manuscript (see Response of the FEPS to pain and warm in the Methods) as well as the corresponding results (see Results and Discussion).

      (10) I would clarify for the reader why the separate M1 analysis has been run. Although obvious, I feel the reader would benefit from the specific hypothesis about this control analysis being spelled out together with the other statistical hypotheses within the statistical design in a more streamlined manner.

      We extended the discussion on the rationale of that analysis and its interpretation taking into account the most recent results using the log transformed FACS scores (lines 125 to 133).

      (11) The mixed model aimed to assess the relationship between pain ratings, FEPS scores, and facial scores is a crucial finding. I believe it speaks to the importance of a more complete design, which I already highlighted. I have a couple of technical questions: did the authors assess random slopes too? And, what was the strategy used to determine the random effects structure?

      The linear mixed model considered the participants as a random effect, with random intercepts, considering the grouping structure in our data (i.e., each participant completed multiple trials). The results reported in the original manuscript assumed fixed slopes. However, following the reviewer’s comment, we re-computed the linear mixed models allowing the slopes to vary according to the intensity ratings. The results were changed in the manuscript to represent the output of those models.

      (12) The text from lines 63 to 67 could go in the methods.

      We decided to keep those lines within the Results and Discussion section to give the reader more detail about the FACS scores, as this term is referenced in the following part of the Results and Discussion section. We are concerned that putting this information only in the Methods section would disrupt the reading.

      Reviewer #2 (Recommendations For The Authors):

      p. 4-5. When you report the positive weight clusters, you follow up with a sentence specifying which cognitive processes those brain regions are typically associated with. However, when you report the negative weight clusters, you do not specify the cognitive processes typically associated with those brain areas. I think that providing that information would be helpful to the readers.

      Thanks for noticing this omission. The information has been added in the most recent version of the manuscript (lines 119 to 121).

      p. 9. You specify that the degree of expressiveness of participants was evaluated. How did you evaluate expressiveness? Did you use this variable in your analyses? Were participants excluded based on their degree of expressiveness?

      Details about the assessment of facial expressiveness have been added in the appropriate section in the Methods (lines 285 to 289).

      p. 10. You explain that two certified FACS-coders evaluated the video recordings to rate the frequency of AUs. Could you please provide more details about the frequency measure? I think that there are different ways in which this could have been done. For instance, were the videos decomposed into frames, and then the frequency measured by summing the number of frames in which the AU occurred? Or was it "expression-based", so one occurrence of an AU (frequency of 1) would correspond to the whole period between its activation onset and offset? Both ways have pros and cons. For example, if the frequency represents the number of frames, then it controls for the total duration of the AU activation within a trial (pro); but if there were multiple activations/deactivations of the AU within one trial, this will not be controlled for (con). And vice-versa with the second way of calculating frequency.

      Details about the frequency scores have been added to the manuscript (lines 315 to 319).

      p. 11. When you explained how you calculated the association between the facial expression of pain and pain-related brain signatures, I felt that there was some information missing. Did you use the thresholded maps (available in the published articles), or did you somehow have access to the complete, voxel-by-voxel, raw regression coefficient maps?

      The unthresholded maps were used. The information has been clarified in the latest version of the manuscript, as well as the details about the availability of the maps (see Data Availability section at the end of the manuscript).

      Reviewer #3 (Recommendations For The Authors):

      Format

      The authors will notice that many observations about the manuscript are related to missing information and a lack of graphical representations. I believe the topic and the content of the manuscript are too complex to condense into a short report.

      Title

      The claim of the title is simply not substantiated by the content of the manuscript. Demonstrating that the FEPS is a distinctive (i.e., specific) marker of pain processing requires a substantially different experimental design, with more rigorous controls and a broader set of painful stimulations. The manuscript would benefit from a more accurate title.

      We agree that the title could better align with our findings. We modified the title accordingly: “A distributed brain response predicting the facial expression of acute nociceptive pain”.

      Abstract

      I find it puzzling that the authors claim that there is limited knowledge of the neural correlates of facial expression of pain given what they describe in the first paragraph of the introduction. Besides, they propose to reanalyze a dataset that has been extensively described in Kunz et al. (2011), which is unlikely to provide any new significant information.

      We respectfully disagree with that comment. We considered that three articles (i.e., Kunz et al., 2011; Vachon-Presseau et al., 2016; Kunz, Chen and Rainville, 2020) on the topic do constitute limited knowledge, especially if we compare it to the very large body of literature on the neural correlates associated with pain ratings. Except for these three studies, all the other citations pertain to behavioral studies on facial expression of pain, and do not examine the brain activity related to it. Furthermore, we believe that the complementary nature of the analyses performed in Kunz et al. (2011) and in this manuscript offers new insights into our understanding of facial expression in the context of pain. Indeed, the multivariate approach used in this study addresses some limitations of the univariate analyses in Kunz et al. (2011), mainly in that it provides a quantifiable way to compare the similarity between different predictive patterns (Reddan and Wager, 2017). We submit that the assessment of the FEPS against several other pain-relevant signatures provides new and important information.

      Furthermore, the abstract does not clearly state the aim, and the first line of the results does not match what the authors claim in the preceding line. The take-home message (last sentence) introduces the concept of a biomarker, which, as stated before, cannot be validated with the current data/experimental design. To put it in plain words, a given facial expression (or a composite score derived from a combination of expressions) cannot be a specific biomarker for pain, because a person can always mimic the same expression without feeling pain. Whether a given facial expression can be predicted from brain activity is a different issue, and whether that prediction can differentiate between painful and non-painful origins of the facial expression is another different issue. Unfortunately, neither of those issues can be tested with the current data/experimental design. The abstract would improve if the authors would circumscribe to what they actually tested, which is accurately described in the last sentence of the Introduction.

      The abstract was revised accordingly. The term ‘biomarker’ was used in accordance with preceding studies in the field (see Reddan and Wager, 2017; Lee et al., 2021). Please note that we applied the same reasoning to fluctuations in pain expression as previous studies have applied to pain ratings. Of course, we cannot dismiss the possibility of someone mimicking facial expressions. Similar reasoning applies to subjective reports, as individuals can intentionally overestimate their pain experience conveyed through verbal reports. This is another case of specificity testing that cannot be addressed in the present study (see new conclusion of the abstract and discussion of limitations). The challenge of pain assessment is a classical problem within both the scientific and the clinical literature. Here, we suggest that the consideration of multiple manifestations of pain is necessary to address this challenge and will provide a more comprehensive portrait of pain-related brain function.

      Introduction

      I believe that the Introduction would benefit from a strict definition of what is a marker/biomarker/neuromarkers (all those terms are used in the manuscript) and what are its desirable features (validity, reliability, specificity, etc.). I also believe that the Introduction (and the rest of the text) would benefit from a critical assessment of the term "signature". The Introduction describes four existing "signatures", all of them differing in the experimental condition in which acute nociceptive pain is studied, and proposes a fifth one. Keeping with the analogy, I'm wondering whether they should be called (pain) "signatures" if there is a different one for each experimental acute pain condition, and they are so dissimilar between them when they are tested on the same condition (this dataset).

      The last part of that comment raises potential fundamental methodological limitations that should be addressed in more depth in another article; that point goes beyond the scope of this research article. Regarding the stability aspect of the signatures, most of the signatures have not been studied extensively. It is thus difficult to currently assess their reliability. However, Han et al. (2022) showed high within-individual test-retest reliability for the NPS across eight different studies. Given that pain is a multidimensional experience, it is not surprising to find different patterns of activation predictive of different aspects or dimensions of the pain experience (see Čeko et al., 2022 for a similar discussion applied to negative affect).

      The authors state that "As an automatic behavioral manifestation, pain facial expression might be an indicator of activity in nociceptive systems, perceptual and evaluative processes, or general negative affect." Doesn't it reflect all three of them? (and instead of or?) Why "might"?

      The original sentence has been modified as follows: “As an automatic behavioral manifestation, pain facial expression is considered to be an indicator of activity in nociceptive systems, and to reflect perceptual and affective-evaluative processes” (lines 65 to 67).

      Methods

      The pain scale should be described. Kunz et al. used a 0-100 scale, where 50 was the pain threshold. This is crucial to interpret the 75-80/100 score for the painful thermal intensity.

      The description of the pain scale has been added to the manuscript (lines 299 to 300).

      Ratings for warm and painful temperatures should be reported (ideally plotted with individual-trial/subject data). In the same line of reasoning, FACS scores should be reported as well (ideally plotted with individual-trial/subject data). It would be interesting to explore the across-trial variability of pain ratings and FACS scores. That is, do people keep giving the same ratings and making the same facial expression after 16 trials? How much variability is between trials and between subjects?

      The point raised in that comment was already addressed in response to a comment made by Reviewer #1 (also see the new Figures S2 and S4; see also lines 335 to 346).

      How come only painful trials are analyzed? What if the FEPS signature was the same for warm and painful stimulation, thus reflecting the settings (fMRI experiment, stimulation, etc.) rather than the brain response to the stimuli?

      The point raised in that comment was already addressed in response to a comment made by Reviewer #1. There was no pain expression in the warm trials and the FEPS shows no response to warm trials. This is now illustrated in the new Figure S4B (see also lines 138 to 158).

      The authors propose to predict the trial-by-trial FACS composite score from the pain ratings using a LMM. However, it is interesting that they aim for an almost constant within- and between-subject pain score (75-80/100) as stated in the Methods. This should theoretically render the linear model invalid since its first (and main) assumption would be that FACS should vary linearly with the pain score. Even if patients were not aware that the temperatures were constant across trials, the variation in pain scores should be explained by random noise for a constant stimulation intensity.

      Reviewer #3 raises an important point that we need to clarify. Contrary to the expectation that FACS responses should be strongly correlated to pain ratings, we posited that these response channels depend at least in part on separate brain networks that may be differentially sensitive to a variety of modulatory mechanisms (attention, emotion, expectancy, motor priming, social context, etc.). This implies that part of the variance in FACS is independent from pain ratings. We, therefore, consider what Reviewer #3 refers to as random noise to be relevant and meaningful fluctuations reflecting endogenous processes influencing one’s experience of pain and differentially affecting various output responses.

      I noticed that fMRI data was analyzed with SPM5 in the original paper (Kunz et al., 2011) and with SPM8 in this manuscript. Was fMRI data re-processed for this manuscript? Were there any differences between the original analysis and this one that might induce changes in the interpretation of results?

      The data were indeed re-processed using SPM8, which was the most recent version available when we started the analyses reported here. We used trial-by-trial activation maps for MVPA, which differs from what was used in the previous study (contrast maps at the level of the conditions, not the trials). We have no reason to believe that the different versions will change the message of this manuscript since those versions do not differ significantly in terms of the fMRI preprocessing pipeline (see SPM8 release notes; https://www.fil.ion.ucl.ac.uk/spm/software/spm8/). Furthermore, the aim of this present study is not to compare the different analysis parameters implemented in SPM5 vs SPM8.

      What is the rationale for including PVP in the comparison among signatures? The experimental settings in which it was devised are distant from those described here.

      The inclusion of the PVP was aimed at enhancing our comparative analysis with the FEPS, as we sought to investigate the potential functional meaning of the FEPS. The PVP was developed to capture the aversive value of pain, a dimension that is conceptually proximal to the interpretation of the facial expression as a manifestation of the affective response to nociceptive pain.

      The LASSO-PCR approach is, in my opinion, not a procedure for (brain) decoding in this context. It is accurately described in the section title as a method for multivariate pattern analysis, or as a variable selection and regularization method for a prediction model. Here, brain activity in specific areas related to pain processing can hardly be described as "encoded", and the method just helps select those activations relevant for explaining a certain outcome (in this case, facial expressions).

      We understand the point made by reviewer #3. The term brain decoding was changed for multivariate pattern analysis in the latest version of the manuscript.

      Details are missing with regards to the dataset split into training, validation, and testing.

      Details about the training and testing procedure were added in the manuscript (lines 383 to 385).

      This might just be ignorance from me, so I apologize in advance, but what are "contrast" fMRI images? They are mentioned three times in the text but not really described. Are they the "Pain > Warm" contrasts from the original paper?

      We apologize for any confusion caused by the use of the term “contrast images” which suggests a direct comparison between two experimental conditions. We have replaced “contrast images” with “activation maps” to provide a more accurate description of the nature of the data used in the multivariate pattern analysis (lines 388 to 389).

      In the "Facial expression" section, the authors run an LMM to test the association between pain ratings (response variable) and facial responses (explanatory variable). If I understand correctly, in the "Multivariate pattern analysis" section they test the association between facial composite scores (response variable) and pain ratings (explanatory variable), but they obtain different results.

      The analyses were recomputed on the log transformed data, as mentioned previously in the response to reviewers 1-2. The first model (in the “Facial expression” section) used the log transformed FACS scores as a dependent variable, the pain ratings as the fixed effect, and the participants as the random effect. The results of that analysis suggested that the transformed facial expression scores were not significantly associated with the pain ratings (p = .07). The second model uses both the FEPS pattern expression scores and pain ratings as fixed effects to predict facial responses. This analysis showed the significant contribution of the FEPS to the prediction of FACS scores (p < .001) and no significant effect of the pain ratings. However, a significant interaction was found (p = .03) suggesting that the prediction of the pain facial expression by the FEPS may vary with pain ratings (i.e. moderator effect). Those results have been clarified in the “Multivariate pattern analysis” section in the Methods (lines 416 to 426).

      In this same section, what are "FEPS pattern expression scores"? They are used three times in the text, but I could not find their description.

      The FEPS pattern expression scores correspond to the dot product between the trial-by-trial activation maps and the unthresholded FEPS signature. This information has been added to the manuscript (lines 413 to 414).
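
      The definition above amounts to one weighted sum per trial. The following sketch shows that computation on toy vectors; the four-voxel maps and weights are invented for illustration, and a real signature would span all voxels of the brain mask.

```python
def pattern_expression(activation_map, signature_weights):
    """Dot product between one trial's activation map and the signature weights."""
    if len(activation_map) != len(signature_weights):
        raise ValueError("activation map and signature must cover the same voxels")
    return sum(a * w for a, w in zip(activation_map, signature_weights))


# Hypothetical unthresholded signature over four voxels.
signature = [0.5, -0.25, 0.0, 1.0]

# Hypothetical trial-by-trial activation maps (one list per trial).
trial_maps = [
    [1.0, 2.0, 3.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]

# One scalar score per trial.
scores = [pattern_expression(trial_map, signature) for trial_map in trial_maps]
```

      Each trial is reduced to a single scalar, which is what allows the scores to enter a mixed model as a predictor alongside the pain ratings.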

      It would not be far-fetched to hypothesize that FACS scores could be predicted using solely activity from the motor cortex. The authors attempted to do this, but only with information from M1. Why did they not use the entire motor cortex, or better, regions of the motor cortex directly linked with the AUs described in the manuscript?

      The selection of the primary motor area (M1) was based on the results found in Kunz et al. (2011). In this study, M1 showed the strongest correlation with facial expression of pain. There are numerous possibilities of combinations of multiple brain regions considering a variety of criteria based on distributed networks involved in motor, affective, or pain-related processes. We limited our exploration to the region with the strongest hypothesis due to practical feasibility concerns.

      Results and Discussion

      As a general recommendation, results should present individual data whenever possible. For example, the association between signatures and facial expression should be plotted using scatterplots.

      We have added figures showing individual data when it was applicable (Figure S2; Figure S4).

      The authors state that the LASSO-PCR model accounts for the facial responses to pain. I believe this is an overstatement, considering:

      - A Pearson's r of 0.49 is usually considered low/weak correlation (moderate at best). In the same line, an R2 of 0.17 means that only 17% of the variance is explained by the model.

      More nuanced interpretation of the results has been added to the discussion. A section has been added to highlight the limitations of the study.

      - Figure 1 needs to display individual subject data and the ideal regression line.

      The model was trained using a k-fold cross-validation procedure. The regression lines thus represent the model’s prediction for each one of the 10 folds (i.e. each fold is trained and tested on a different subset of the data). A scatter plot including the ideal regression line computed across all trials and subjects was added in supplementary material to illustrate the relation between the FACS scores and the FEPS pattern expression scores (Figure S4).
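
      For readers unfamiliar with the procedure, the sketch below shows how a k-fold split partitions observations so that each one is used for testing exactly once. It is a generic illustration with hypothetical names and sample counts; whether the authors' folds were grouped by participant or shuffled is not stated in this response, so neither is assumed here.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, test_indices) pairs; test folds are contiguous blocks."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // n_folds] * n_folds
    for i in range(n_samples % n_folds):
        fold_sizes[i] += 1

    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical example: 25 observations split into 10 folds.
folds = list(k_fold_indices(n_samples=25, n_folds=10))

for fold_number, (train, test) in enumerate(folds, start=1):
    print(fold_number, "train size:", len(train), "test indices:", test)
```

      Because a separate model is fitted on each training subset, a plot of cross-validated predictions shows one regression line per fold rather than a single line.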

      - Looking at Figure 1, it is clear that the model has an intercept different from zero. This means that when the FACS score was zero (i.e., volunteers did not make any distinguishable facial expression), the model predicted a score larger than zero. This is not discussed in the manuscript, and in simple terms, it means that there are brain activation patterns when no discernible facial expression is being made by the volunteers. In the original paper by Kunz et al., two groups of subjects were categorized, and one of them was a facially low- or non-expressive group (n=13). This fact is not even mentioned in the manuscript.

      The categorization in the previous report (Kunz et al., 2012) was based on a pre-experimental session. All subjects were included in the current analysis. This is now indicated in the Methods (lines 287 to 289).

      - On the other end of the range in Figure 1, differences between the FACS scores near the maximum range (40) are underestimated by 23 to 33 points! I guess that the RMSE is smaller (6-7 points), because many FACS scores are concentrated on the low end of the scale.

      This is a very interesting comment. A section discussing the limits of the model to predict the lower and higher FACS scores has been added in the manuscript (lines 232 to 250).
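
      The points raised in this list about correlation, explained variance, and underestimation of high scores are related, and a toy calculation can make the link concrete. The sketch below uses invented numbers and assumes that R2 is computed as 1 - SS_res / SS_tot on the predictions themselves, which may not match how the manuscript's value was obtained.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical scores: predictions follow the observed order perfectly
# but are compressed toward the middle of the range (pred = 6 + 0.2 * obs).
observed = [0.0, 5.0, 10.0, 20.0, 40.0]
predicted = [6.0, 7.0, 8.0, 10.0, 14.0]

print("Pearson r:", pearson_r(observed, predicted))
print("R squared:", r_squared(observed, predicted))
```

      In this constructed case the predictions are perfectly correlated with the observations, yet the intercept above zero and the compressed range leave R2 far below the squared correlation, with the highest observed score underestimated the most.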

      It is of course acceptable to interpret the low similarity between signatures as a sign that each signature describes a different mechanism related to pain processing. However, I believe that a complete discussion should contemplate other competing hypotheses. Considering that all signatures were developed using a similar painful thermal stimulation protocol, it is reasonable to expect larger similarities between signatures. The fact that they are so dissimilar could be a reflection of model overfit, i.e., all these signatures are just fitted to these particular experimental protocols and data, and do not generalize to brain mechanisms of pain processing.

      We appreciate the pertinent observation. We have included a limitations section in which we discussed, among other considerations, the possible overfitting of models and the necessity of pursuing generalizability studies (lines 225 to 268).

    2. eLife assessment

      Picard et al. propose a Facial Expression Pain Signature (FEPS) derived from functional magnetic resonance imaging (fMRI) data to predict facial expressions associated with painful heat stimulation. This important work advances our understanding of the brain mechanisms associated with facial expressions of pain. It provides solid evidence that facial expressions of pain contain information that is complementary to other pain-related brain processes. The work will be of broad interest to researchers from varied fields ranging from neurosciences to psychology and affective sciences.

    3. Reviewer #2 (Public Review):

      Summary.

      The objective of this study was to further our understanding of the brain mechanisms associated with facial expressions of pain. To achieve this, participants' facial expressions and brain activity were recorded while they received noxious heat stimulation. The authors then used a decoding approach to predict facial expressions from functional magnetic resonance imaging (fMRI) data. They found a distinctive brain signature for pain facial expressions (FEPS). This signature had minimal overlap with brain signatures reflecting other components of pain phenomenology, such as signatures reflecting subjective pain intensity or negative affect.

      Strength.

      The authors used a rigorous approach involving multivariate brain decoding to predict the occurrence and intensity of pain facial expressions during noxious heat stimulation. The analyses are solid and well-conducted. This is an important study of fundamental and clinical relevance.

      Weakness.

      Despite those major strengths, the main weakness of the study is that the design and analyses do not allow us to know if the FEPS is really specific to pain expressions. Based on the analysis, it is possible to conclude that this brain signature is present when a participant is in a state of pain and displays a facial expression. However, it is possible that it would also be present when a participant experiences (another) negative state and displays (another) facial expression. It will be important, in future work, to investigate the specificity of this brain signature.

    4. Reviewer #3 (Public Review):

      In this manuscript, Picard et al. propose a Facial Expression Pain Signature (FEPS) as a distinctive marker of pain processing in the brain. Specifically, they attempt to use functional magnetic resonance imaging (fMRI) data to predict facial expressions associated with painful heat stimulation.

      The main strengths of the manuscript are that it is built on an extensive foundation of work from the research group, and that experience can be observed in the analysis of fMRI data and the development of the machine learning model. Additionally, it provides a comparative account of the similarities of the FEPS with other proposed pain signatures. The main weaknesses of the manuscript are the absence of a proper control condition to assess the specificity of the facial pain expressions, as well as several limitations in the experimental setup.

      I believe that the authors partially succeed in their aims, as described in the introduction, which are to assess the association between pain facial expression and existing pain-relevant brain signatures, and to develop a predictive brain activation model of the facial responses to painful thermal stimulation. However, they list several limitations in the study that should be addressed in future research in order to establish whether FEPS truly conveys distinctive information about the brain response to nociceptive stimuli.

    1. eLife assessment

      This paper provides important insights into the role of rice OsNF-YB7, an ortholog of Arabidopsis LEC1, in chlorophyll biosynthesis, uncovering the genetic and molecular basis for negative regulation of chlorophyll production in the rice embryo. Mutational analysis, gene expression profiles and protein interaction combine for convincing evidence that OsNF-YB7 represses chlorophyll biosynthesis.

    2. Reviewer #1 (Public Review):

      Summary:

      This manuscript investigates the regulation of chlorophyll biosynthesis in rice embryos, focusing on the role of OsNF-YB7. The rigorous experimental approach, combining genetic, biochemical, and molecular analyses, provides a robust foundation for these findings. The research achieves its objectives, offering new insights into chlorophyll biosynthesis regulation, with the results convincingly supporting the authors' conclusions.

      Strengths:

      The major strengths include the detailed experimental design and the findings regarding OsNF-YB7's inhibitory role.

      Weaknesses:

      However, the manuscript's discussion on the practical implications for agriculture and the evolutionary analysis of regulatory mechanisms could be expanded.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors set out to establish the role of the rice LEC1 homolog OsNF-YB7 in embryo development, especially as it pertains to the development of photosynthetic capacity, with chlorophyll production as a primary focus.

      Strengths:

      The results are well supported, and the approaches used complement one another. There are no major questions left unanswered and the central hypothesis is addressed in every figure.

      Weaknesses:

      There are a handful of sections which could use clarifying for readers, but overall this is a solidly composed manuscript.

      The authors clearly achieved their aims; the results compellingly establish a disparity between how this system operates in rice and Arabidopsis. Conclusions are thoroughly supported by the provided data and interpretations. This work will force a reconsideration of the value of Arabidopsis as a model organism for embryo chlorophyll biosynthesis and possibly photosynthesis during embryo maturation more broadly, as rice is a major crop organism and it very clearly does not follow the Arabidopsis model. It will thus be useful to carry out similar tests in other organisms rather than relying on Arabidopsis and attempt to more fully establish the regulatory mechanism in rice.

    4. Reviewer #3 (Public Review):

      Summary:

      In this study, the authors set out to understand the mechanisms behind chlorophyll biosynthesis in rice, focusing in particular on the role of OsNF-YB7, an ortholog of Arabidopsis LEC1, which is a positive regulator of chlorophyll (Chl) biosynthesis in Arabidopsis. They showed that OsNF-YB7 loss-of-function mutants in rice have chlorophyll-rich embryos, in contrast to Arabidopsis LEC1 loss-of-function mutants. This contrasting phenotype led the authors to carry out extensive molecular studies on OsNF-YB7, including in vitro and in vivo protein interaction studies, gene expression profiling and protein-DNA interaction assays. The evidence provided supports the core arguments of the authors well, emphasising that OsNF-YB7 is a negative regulator of Chl biosynthesis in rice embryos by mediating the expression of OsGLK1, a transcription factor that regulates downstream Chl biosynthesis genes. In addition, they showed that OsNF-YB7 interacts with OsGLK1 to negatively regulate the expression of OsGLK1, demonstrating the broad involvement of OsNF-YB7 in rice Chl biosynthetic pathways.

      Strengths:

      This study clearly demonstrated how OsNF-YB7 regulates its downstream pathways using several in vitro and in vivo approaches. For example, gene expression analysis of OsNF-YB7 loss-of-function and gain-of-function mutants revealed the expression of selected downstream chl biosynthetic genes. This was further validated by EMSA on the gel. The authors also confirmed this using luciferase assays in rice protoplasts. These approaches were used again to show how the interaction of OsNF-YB7 and OsGLK1 regulates downstream genes. The main idea of this study is very well supported by the results and data.

      Weaknesses:

      It would be interesting to see how two similar genes have come to play opposite roles in Arabidopsis and rice. Interspecies complementation might help to understand this point.