  1. Jun 2024
    1. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors intended to prove that gut GLP-1 expression and secretion can be regulated by Piezo1, and hence by mechanical/stretch regulation. For this purpose, they assessed Piezo1 expression in the STC-1 cell line (a mouse GLP-1-producing cell line) and mouse gut, showing the correlation between Piezo1 levels and Gcg levels (Figure S1). They then aimed to generate gut L cell-specific Piezo1 KO mice, and claimed that the mice show impaired glucose tolerance and GLP-1 production, which can be mitigated by Ex-4 treatment (Figures 1-2). Pharmacological agents (Yoda1 and GsMTx4) and mechanical activation (intestinal bead implantation) were then used to prove the existence of ileal Piezo1-regulated GLP-1 synthesis (Figure 3). This was followed by testing this mechanism in a limited number of primary L cells and mainly in the STC-1 cell line (Figures 4-7).

      While the novelty of the study is somewhat appreciable, the biomedical significance is not well demonstrated in the manuscript. The authors stated (lines 78-83) a number of potential side effects of GLP-1 analogs; how can the mechanistic study of GLP-1 production on its own be essential for the development of new drug targets for the treatment of diabetes? Furthermore, the study does not provide clear mechanistic insight into how the claimed CaMKKbeta/CaMKIV-mTORC1 signaling pathway upregulates both GLP-1 production and secretion. This reviewer also has concerns about the experimental design and data presented in the current manuscript, including the issue of how proglucagon expression can be assessed by Western blotting.

      Strengths:

      The novelty of the concept.

      Weaknesses:

      Experimental design and key experiment information.

    2. Reviewer #2 (Public Review):

      Summary:

      The study by Huang and colleagues focuses on GLP-1 producing entero-endocrine (EEC) L-cells and their regulation of GLP-1 production by a mechano-gated ion channel Piezo1. The study describes Piezo1 expression by L-cells and uses an exciting intersectional mouse model (villin to target epithelium and Gcg to target GLP-1-producing cells and others like glucagon-producing pancreatic endocrine cells), which allows L-cell specific Piezo1 knockout. Using this model, they find an impairment of glucose tolerance, increased body weight, reduced GLP-1 content, and changes to the CaMKKbeta-CaMKIV-mTORC1 signaling pathway using a normal diet and then high-fat diet. Piezo1 chemical agonist and intestinal bead implantation reversed these changes and improved the disrupted phenotype. Using primary sorted L-cells and cell model STC-1, they found that stretch and Piezo1 activation increased GLP-1 and altered the molecular changes described above.

      Strengths:

      This is an interesting study testing a novel hypothesis that may have important mechanistic and translational implications. The authors generated an important intersectional genetics mouse model that allowed them to target Piezo1 L-cells specifically, and the surprising result of impaired metabolism is intriguing.

      Weaknesses:

      However, there are several critical limitations that require resolution before making the conclusions that the authors make.

      (1) A potential explanation for the data, and one that is consistent with existing literature [see for example, PMC5334365, PMC4593481], is that epithelial Piezo1, which is broadly expressed by the GI epithelium, impacts epithelial cell density and survival, and as such, if Piezo1 is involved in L-cell physiology, it may be through regulation of cell density. Thus, it is critical to determine L-cell densities and epithelial integrity in controls and Piezo1 knockouts systematically across the length of the gut, since the authors do not make it clear which gut region contributes to the phenotype they see. Current immunohistochemistry data are not convincing.

      (2) Calcium signaling in L-cells is implicated in their typical role of being gut chemo-sensors, and Piezo1 is a calcium channel, so it is not clear whether any calcium-related signaling mechanism would phenocopy these results.

      (3) Intestinal bead implantation, while intriguing, does not have clear mechanisms - and is likely to provide a point of intestinal obstruction and dysmotility.

      (4) Previous studies, some that are very important, but not cited, contradict the presented results (e.g., epithelial Piezo1 role in insulin secretion) and require reconciliation.

      Overall, this study makes an interesting observation but the data are not currently strong enough to support the conclusions.

    3. Author response:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors intended to prove that gut GLP-1 expression and secretion can be regulated by Piezo1, and hence by mechanical/stretch regulation. For this purpose, they assessed Piezo1 expression in the STC-1 cell line (a mouse GLP-1-producing cell line) and mouse gut, showing the correlation between Piezo1 levels and Gcg levels (Figure S1). They then aimed to generate gut L cell-specific Piezo1 KO mice, and claimed that the mice show impaired glucose tolerance and GLP-1 production, which can be mitigated by Ex-4 treatment (Figures 1-2). Pharmacological agents (Yoda1 and GsMTx4) and mechanical activation (intestinal bead implantation) were then used to prove the existence of ileal Piezo1-regulated GLP-1 synthesis (Figure 3). This was followed by testing this mechanism in a limited number of primary L cells and mainly in the STC-1 cell line (Figures 4-7).

      While the novelty of the study is somewhat appreciable, the biomedical significance is not well demonstrated in the manuscript. The authors stated (lines 78-83) a number of potential side effects of GLP-1 analogs; how can the mechanistic study of GLP-1 production on its own be essential for the development of new drug targets for the treatment of diabetes? Furthermore, the study does not provide clear mechanistic insight into how the claimed CaMKKbeta/CaMKIV-mTORC1 signaling pathway upregulates both GLP-1 production and secretion. This reviewer also has concerns about the experimental design and data presented in the current manuscript, including the issue of how proglucagon expression can be assessed by Western blotting.

      Strengths:

      The novelty of the concept.

      Weaknesses:

      Experimental design and key experiment information.

      Current GLP-1-based therapies for diabetes use GLP-1 agonists/analogs. Although these are generally safe, they carry some side effects and risks. We agree with the reviewer that a mechanistic study of the regulation of GLP-1 production will not directly lead to the development of new drug targets for the treatment of diabetes. However, understanding the mechanism of GLP-1 production may shed light on alternative treatment strategies for diabetes that target the production of GLP-1. In our previous studies, we elucidated the role of the mTOR/S6K pathway in regulating GLP-1 production in L cells. Using the STC-1 cell line and different mouse models, including Neurog3-Tsc1−/− mice and rapamycin or L-leucine treatment to modulate mTOR activity, we demonstrated that mTOR stimulates proglucagon gene expression and thus GLP-1 production (Diabetologia 2015;58(8):1887-97; Mol Cell Endocrinol. 2015 Nov 15:416:9-18.). Building on these studies, we found in the present study that Piezo1 regulates the mTOR/S6K pathway, and thus proglucagon expression and GLP-1 production, through Ca2+/CaMKKbeta/CaMKIV. Although we cannot exclude the involvement of other signaling pathways downstream of Piezo1 in regulating the cleavage of proglucagon, granule maturation and the final release of GLP-1, our present study provides evidence to support the involvement of the Ca2+/CaMKKbeta/CaMKIV/mTOR pathway in mediating the role of Piezo1 in proglucagon expression and GLP-1 production. The reviewer also expressed concerns about the use of western blotting to detect proglucagon expression. In fact, western blotting is often used to detect proglucagon. Here are some examples from other researchers: Diabetes. 2013 Mar;62(3):789-800. Gastroenterology. 2011 May;140(5):1564-74. 2004 Jul 23;279(30):31068-75. The proglucagon antibody we used in our study was purchased from abcam (Cat#ab23468), which can detect proglucagon of 21 kDa.

      Reviewer #2 (Public Review):

      Summary:

      The study by Huang and colleagues focuses on GLP-1 producing entero-endocrine (EEC) L-cells and their regulation of GLP-1 production by a mechano-gated ion channel Piezo1. The study describes Piezo1 expression by L-cells and uses an exciting intersectional mouse model (villin to target epithelium and Gcg to target GLP-1-producing cells and others like glucagon-producing pancreatic endocrine cells), which allows L-cell specific Piezo1 knockout. Using this model, they find an impairment of glucose tolerance, increased body weight, reduced GLP-1 content, and changes to the CaMKKbeta-CaMKIV-mTORC1 signaling pathway using a normal diet and then high-fat diet. Piezo1 chemical agonist and intestinal bead implantation reversed these changes and improved the disrupted phenotype. Using primary sorted L-cells and cell model STC-1, they found that stretch and Piezo1 activation increased GLP-1 and altered the molecular changes described above.

      Strengths:

      This is an interesting study testing a novel hypothesis that may have important mechanistic and translational implications. The authors generated an important intersectional genetics mouse model that allowed them to target Piezo1 L-cells specifically, and the surprising result of impaired metabolism is intriguing.

      Weaknesses:

      However, there are several critical limitations that require resolution before making the conclusions that the authors make.

      (1) A potential explanation for the data, and one that is consistent with existing literature [see for example, PMC5334365, PMC4593481], is that epithelial Piezo1, which is broadly expressed by the GI epithelium, impacts epithelial cell density and survival, and as such, if Piezo1 is involved in L-cell physiology, it may be through regulation of cell density. Thus, it is critical to determine L-cell densities and epithelial integrity in controls and Piezo1 knockouts systematically across the length of the gut, since the authors do not make it clear which gut region contributes to the phenotype they see. Current immunohistochemistry data are not convincing.

      We appreciate the reviewer’s comment. We agree that Piezo1 may affect L-cell density and epithelial integrity. We will quantify L-cell density, and we will test epithelial integrity by examining the expression of tight junction proteins (ZO-1 and Occludin) and measuring the transepithelial resistance in different regions of the gut.

      (2) Calcium signaling in L-cells is implicated in their typical role of being gut chemo-sensors, and Piezo1 is a calcium channel, so it is not clear whether any calcium-related signaling mechanism would phenocopy these results.

      We will examine whether other calcium-related signaling mechanisms also contribute to the phenotype seen in the IntL-Piezo1-/- mice.

      (3) Intestinal bead implantation, while intriguing, does not have clear mechanisms - and is likely to provide a point of intestinal obstruction and dysmotility.

      To ascertain if intestinal bead implantation led to intestinal obstruction and dysmotility, we conducted a bowel transit time test. The results revealed no difference in bowel transit time between the sham-operated mice and those implanted with beads.

      (4) Previous studies, some that are very important, but not cited, contradict the presented results (e.g., epithelial Piezo1 role in insulin secretion) and require reconciliation.

      Overall, this study makes an interesting observation but the data are not currently strong enough to support the conclusions.

      We will cite more previous studies on GLP-1 production and discuss the discrepancy between our study and others’ studies. The lack of changes in blood glucose seen in Villin-Piezo1-/- mice reported by Sugisawa et al. is not surprising (Cell. 2020 Aug 6;182(3):609-624.e21.). In fact, in another recent study from our group, we found similar results when the Villin-Piezo1-/- mice and Piezo1fl/fl control mice were fed a normal chow diet. Since Villin-1 is expressed in all the epithelial cells of the gut, including enterocytes and various types of endocrine cells, the effect of L-cell Piezo1 loss may be masked by other cell types under normal conditions. However, impaired glucose tolerance was seen in Villin-Piezo1-/- mice compared to the Piezo1fl/fl control mice after 8 weeks on a high-fat diet. We further found that Piezo1 in enterocytes exerted a negative effect on glucose and lipid absorption. Loss of Piezo1 in enterocytes led to over-absorption of nutrients under a high-fat diet (Tian Tao, Qing Shu, Yawen Zhao, Wenying Guo, Jinting Wang, Yuhao Shi, Shiqi Jia, Hening Zhai, Hui Chen, Cunchuan Wang*, Geyang Xu*, Mechanical regulation of lipid and sugar absorption by Piezo1 in enterocytes, Acta Pharmaceutica Sinica B, Accepted, 2024, https://doi.org/10.1016/j.apsb.2024.04.016).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Your editorial guidance, reviews, and suggestions have led us to make substantial changes to our manuscript. While we detail point-by-point responses in typical fashion below, I wanted to outline, at a high level, what we’ve done.

      (1) Methods. Your suggestions led us to rethink our presentation of our methods, which are now described more cohesively in a new methods section in the main text.

      (2) Model Validation & Robustness. Reviewers suggested various validations and checks to ensure that our findings were not, for instance, the consequence of a particular choice of parameter. These can be found in the supplementary materials.

      (3) Data Cleaning & Inclusion/Exclusion. Finally, based on feedback, our new methods section fully describes the process by which we cleaned our original data, and on what grounds we included/excluded individual faculty records from analysis.

      eLife assessment

      Efforts to increase the representation of women in academia have focussed on efforts to recruit more women and to reduce the attrition of women. This study - which is based on analyses of data on more than 250,000 tenured and tenure-track faculty from the period 2011-2020, and the predictions of counterfactual models - shows that hiring more women has a bigger impact than reducing attrition. The study is an important contribution to work on gender representation in academia, and while the evidence in support of the findings is solid, the description of the methods used is in need of improvement.

      Reviewer #1 (Public Review):

      Summary and strengths

      This is an interesting paper that concludes that hiring more women will do more to improve the gender balance of (US) academia than improving the attrition rates of women (which are usually higher than men's). Other groups have reported similar findings but this study uses a larger than usual dataset that spans many fields and institutions, so it is a good contribution to the field.

      We thank the reviewer for their positive assessment of the contributions of our work.

      Weaknesses

      The paper uses a mixture of mathematical models (basically Leslie matrices, though that term isn't mentioned here) parameterised using statistical models fitted to data. However, the description of the methods needs to be improved significantly. The author should consider citing Matrix Population Models by Caswell (Second Edition; 2006; OUP) as a general introduction to these methods, and consider citing some or all of the following as examples of similar studies performed with these models:

      Shaw and Stanton. 2012. Proc Roy Soc B 279:3736-3741

      Brower and James. 2020. PLOS One 15:e0226392

      James and Brower. 2022. Royal Society Open Science 9:220785

      Lawrence and Chen. 2015. [http://128.97.186.17/index.php/pwp/article/view/PWP-CCPR-2015-008]

      Danell and Hjerm. 2013. Scientometrics 94:999-1006

      We have expanded the description of methods in a new methods section of the paper which we hope will address the reviewer’s concerns.

      We agree that our model of faculty hiring and attrition resembles Leslie matrices. In results section B, we now mention Leslie matrices and cite Matrix Population Models by Caswell, noting a few key differences between Leslie matrices and the model of hiring and attrition presented in this work. Most notably, in the hiring and attrition model presented, the number of new hires is not based on per-capita fertility constants. Instead, population sizes are predetermined fixed values for each year, precluding exponential population growth or decay towards 0 that is commonly observed in the asymptotic behavior of linear Leslie Matrix models.
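
      To make the contrast with a Leslie matrix concrete, the following is a minimal sketch of one annual update in which hires fill the gap to a predetermined population size instead of being generated by per-capita fertility rates. It is an illustration only, not the authors' code: the function name `step`, its arguments, and the assumption that the oldest career-age class always retires are all hypothetical.

```python
def step(counts, attrition, total_size, hire_age_dist):
    """Advance an age-structured faculty population by one year.

    All names here are hypothetical and chosen for illustration.
    counts[a]        -- expected number of faculty at career age a
    attrition[a]     -- probability that someone at career age a leaves this year
    total_size       -- predetermined population size for the next year
    hire_age_dist[a] -- share of new hires entering at career age a (sums to 1)
    """
    n_ages = len(counts)
    # Survivors move up one career-age class; the oldest class is assumed to retire.
    survivors = [0.0] * n_ages
    for a in range(n_ages - 1):
        survivors[a + 1] = counts[a] * (1.0 - attrition[a])
    # Key difference from a Leslie matrix: the number of hires is whatever is
    # needed to reach the fixed target size, not sum(fertility[a] * counts[a]).
    n_hires = max(total_size - sum(survivors), 0.0)
    return [s + n_hires * h for s, h in zip(survivors, hire_age_dist)]
```

      Because `total_size` is supplied from outside the model each year, repeated calls to `step` cannot produce the exponential growth or decay that a linear Leslie matrix model shows asymptotically.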

      We have additionally revised the main text to cite the listed examples of similar studies (we had already cited James and Brower, 2022). We thank the reviewer for bringing these relevant works to our attention.

      The analysis also runs the risk of conflating the fraction of women in a field with gender diversity! In female-dominated fields (e.g. Nursing, Education) increasing the proportion of women in the field will lead to reduced gender diversity. This does not seem to be accounted for in the analysis. It would also be helpful to state the number of men and women in each of the 111 fields in the study.

      We have carefully examined the manuscript and revised the text to correctly differentiate between gender diversity and women’s representation.

      We have additionally added a table to the supplemental materials (Tab. S3) that reports the estimated number of men and women in each of the 111 fields.

      Reviewer #2 (Public Review):

      Summary:

      This important study by LaBerge and co-authors seeks to understand the causal drivers of faculty gender demographics by quantifying the relative importance of faculty hiring and attrition across fields. They leverage historical data to describe past trends and develop models that project future scenarios that test the efficacy of targeted interventions. Overall, I found this study to be a compelling and important analysis of gendered hiring and attrition in US institutions, and one that has wide-reaching policy implications for the academy. The authors have also suggested a number of fruitful future avenues for research that will allow for additional clarity in understanding the gendered, racial, and socioeconomic disparities present in US hiring and attrition, and potential strategies for mitigating or eliminating these disparities.

      We thank the reviewer for their positive assessment of the contributions of our work.

      Strengths:

      In this study, LaBerge et al use data from over 268,000 tenured and tenure-track faculty from over 100 fields at more than 12,000 PhD-granting institutions in the US. The period they examine covers 2011-2020. Their analysis provides a large-scale overview of demographics across fields, a unique strength that allows the authors to find statistically significant effects for gendered attrition and hiring across broad areas (STEM, non-STEM, and topical domains).

      LaBerge et al. find gendered disparities in attrition (using both empirical data and their counterfactual model) that account for the loss of 1378 women faculty across all fields between 2011 and 2020. It is true that "this number is both a small portion of academia... and a staggering number of individual careers," as this loss of women faculty is comparable to losing more than 70 entire departments. I appreciate the authors' discussion about these losses; they note that each of these is likely unnecessary, as women often report feeling that they were pushed out of academic jobs.

      LaBerge et al. also find, by developing a number of model scenarios testing the impacts of hiring, attrition, or both, that hiring has a greater impact on women's representation in the majority of academic fields in spite of higher attrition rates for women faculty relative to men at every career stage. Unlike many other studies of historical trends in gender diversity, which have often been limited to institution-specific analyses, they provide an analysis that spans over 100 fields and includes nearly all US PhD-granting institutions. They are able to project the impacts of strategies focusing on hiring or retention using models that project the impact of altering attrition risk or hiring success for women. With this approach, they show that even relatively modest annual changes in hiring accumulate over time to help improve the diversity of a given field. They also demonstrate that, across the model scenarios they employ, changes to hiring drive the largest improvement in the long-term gender diversity of a field.

      Future work will hopefully - as the authors point out - include intersectional analyses to determine whether a disproportionate share of lost gender diversity is due to the loss of women of color from the professoriate. I appreciate the authors' discussion of the racial demographics of women in the professoriate, and their note that "the majority of women faculty in the US are white" and thus that the patterns observed in this study are predominantly driven by this demographic. I also highly appreciate their final note that "equal representation is not equivalent to equal or fair treatment," and that diversifying hiring without mitigating the underlying cause of inequity will continue to contribute to higher losses of women faculty.

      Weaknesses

      First, and perhaps most importantly, it would be beneficial to include a distinct methods section. While the authors have woven the methods into the results section, I found that I needed to dig to find the answers to my questions about methods. I would also have appreciated additional information within the main text on the source of the data, specifics about its collection, inclusion and exclusion criteria for the present study, and other information on how the final dataset was produced. This - and additional information as the authors and editor see fit - would be helpful to readers hoping to understand some of the nuance behind the collection, curation, and analysis of this important dataset.

      We have expanded upon the description of methods in a new methods section of the paper.

      We have also added a detailed description of the data cleaning steps taken to produce the dataset used in these analyses, including the inclusion/exclusion criteria applied. This detailed description is at the beginning of the methods section. This addition has substantially enhanced the transparency of our data cleaning methods, so we thank the reviewer for this suggestion.

      I would also encourage the authors to include a note about binary gender classifications in the discussion section. In particular, I encourage them to include an explicit acknowledgement that the trends assessed in the present study are focused solely on two binary genders - and do not include an analysis of nonbinary, genderqueer, or other "third gender" individuals. While this is likely because of the limitations of the dataset utilized, the focus of this study on binary genders means that it does not reflect the true diversity of gender identities represented within the professoriate.

      In a similar vein, additional context on how gender was assigned on the basis of names should be added to the methods section.

      We use a free, open-source, and open-data python package called nomquamgender (Van Buskirk et al, 2023) to estimate the strengths of (culturally constructed) name-gender associations. For sufficiently strong associations with a binary gender, we apply those labels to the names in our data. We have updated the main text to make this approach more apparent.

      We have also added language to the main text which explicitly acknowledges that our approach only assigns binary (woman/man) labels to faculty. We point out that this is a compromise due to the technical limitations of name-based gender methodologies and is not intended to reinforce a gender binary.

      I do think that some care might be warranted regarding the statement that "eliminating gendered attrition leads to only modest changes in field-level diversity" (Page 6). While I do not think that this is untrue, I do think that the model scenarios where hiring is "radical" and attrition is unchanged from present (equal representation of women and men among hires (ER) + observed attrition (OA)) show that a sole focus on hiring dampens the gains that can otherwise be addressed via even modest interventions (see, e.g., gender-neutral attrition (GNA) + increasing representation of women among hires (IR)). I am curious as to why the authors did not include an additional scenario where hiring rates are equal and attrition is equalized (i.e., GNA + ER). The importance of including this additional model is highlighted in the discussion, where, on Page 7, the authors write: "In our forecasting analysis, we find that eliminating the gendered attrition gap, in isolation, would not substantially increase representation of women faculty in academia. Rather, progress towards gender parity depends far more heavily on increasing women's representation among new faculty hires, with the greatest change occurring if hiring is close to gender parity." I believe that this statement would be greatly strengthened if the authors can also include a comparison to a scenario where both hiring and attrition are addressed with "radical" interventions.

      Our rationale for omitting the GNA + ER scenario in the presented analysis is that we can reason about the outcomes of this scenario without the need for computation; if a field has equal inputs of women and men faculty (on average) and equal retention rates between women and men (on average), then, no matter the field’s initial age and gender distribution of faculty, the expected value for the percentage of women faculty after all of the prior faculty have retired (which may take 40+ years) is exactly 50%. We have updated the main text to discuss this point.
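
      The argument above can also be checked numerically. The sketch below is a hypothetical toy model, not the authors' implementation: it tracks expected counts of women and men by career age, applies the same attrition probabilities to both groups (GNA), splits hires evenly (ER), and holds the total population fixed. The function name, the four career-age classes in the usage note, and the retirement rule for the oldest class are assumptions made for illustration.

```python
def project_share_of_women(women, men, attrition, years):
    """Return the share of women after each projected year under GNA + ER.

    women[a], men[a] -- expected counts at career age a (hypothetical inputs)
    attrition[a]     -- attrition probability at career age a, shared by both groups
    The oldest career-age class is assumed to retire every year.
    """
    n_ages = len(women)
    total = sum(women) + sum(men)  # population size is held fixed
    shares = []
    for _ in range(years):
        new_w = [0.0] * n_ages
        new_m = [0.0] * n_ages
        for a in range(n_ages - 1):
            keep = 1.0 - attrition[a]  # identical retention for women and men
            new_w[a + 1] = women[a] * keep
            new_m[a + 1] = men[a] * keep
        hires = total - sum(new_w) - sum(new_m)
        new_w[0] = hires / 2.0  # equal representation among hires
        new_m[0] = hires / 2.0
        women, men = new_w, new_m
        shares.append(sum(women) / total)
    return shares
```

      For example, starting from `women = [2.0] * 4` and `men = [8.0] * 4` (20% women), the share rises each year and reaches one half once every initial cohort has aged out, which with four career-age classes takes four steps. The starting composition only affects how the share moves before that point.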

      Reviewer #3 (Public Review):

      This manuscript investigates the roles of faculty hiring and attrition in influencing gender representation in US academia. It uses a comprehensive dataset covering tenured and tenure-track faculty across various fields from 2011 to 2020. The study employs a counterfactual model to assess the impact of hypothetical gender-neutral attrition and projects future gender representation under different policy scenarios. The analysis reveals that hiring has a more significant impact on women's representation than attrition in most fields and highlights the need for sustained changes in hiring practices to achieve gender parity.

      Strengths:

      Overall, the manuscript offers significant contributions to understanding gender diversity in academia through its rigorous data analysis and innovative methodology.

      The methodology is robust, employing extensive data covering a wide range of academic fields and institutions.

      Weaknesses:

      The primary weakness of the study lies in its focus on US academia, which may limit the generalizability of its findings to other cultural and academic contexts.

      We agree that the U.S. focus of this study limits the generalizability of our findings. The findings that we present in this work will only generalize to other populations–whether it be to an alternate industry, e.g., tech workers, or to faculty in different countries–to the extent that these other populations share similar hiring patterns, retention patterns, and current demographic representation. We have added a discussion of this limitation to the manuscript.

      Additionally, the counterfactual model's reliance on specific assumptions about gender-neutral attrition could affect the accuracy of its projections.

      Our projection analysis is intended to illustrate the potential gender representation outcomes of several possible counterfactual scenarios, with each projection being conditioned on transparent and simple assumptions. In this way, the projection analysis is not intended to predict or forecast the future.

      To resolve this point for our readers, we now introduce our projections in the context of the related terms of prediction and forecast, noting that they have distinct meanings as terms of art: On one hand, prediction and forecasting involve anticipating a specific outcome based on available information and analysis, and typically rely on patterns, trends, or historical data to make educated guesses about what will happen. Projections are based on assumptions and are often presented in a panel of possible future scenarios. While predictions and forecasts aim for precision, projections (which we make in our analysis) are more generalized and may involve a range of potential outcomes.

      Additionally, the study assumes that anyone who disappears from the dataset has left academia. In reality, some of these apparent attritions could be researchers who moved to another country or to an institution that is not included in the AARC (Academic Analytics Research Centre) dataset.

      In our revision, we have elevated this important point, and clarified it in the context of the various ways in which we count hires and attritions. We now explicitly state that “We define faculty hiring and faculty attrition to include all cases in which faculty join or leave a field or domain within our dataset.” Then, we enumerate the number of situations that could be counted as hires and attritions, including the reviewer’s example of faculty who move to another country.

      Reviewer #1 (Recommendations For The Authors):

      Section B: The authors use an age structured Leslie matrix model (see Caswell for a good reference to these) to test the effect of making the attrition rates or hiring rates equal for men and women. My main concern here is the fitting techniques for the parameters. These are described (a little too!) briefly in section S1B. Some specific questions that are left hanging include:

      A 5th order polynomial is an interesting choice. Some statistical evidence as to why it was the best fit would be useful. What other candidate models were compared? What was the "best fit" judgement made with: AIC, r^2? What are the estimates for how good this fit is? How many data points were fitted to? Was it the best fit choice for all of the 111 fields for men and women?

      We use a logistic regression model for each field to infer faculty attrition probabilities across career ages and time, and we include the career age predictor up to its fifth power to capture the career-age correlations observed in Spoon et al., Science Advances, 2023. For ease of reference, we reproduce the attrition risk curves in Fig S4.

      We note that faculty attrition rates start low, reach a peak around 5-7 years after earning a PhD, and then decline until around 15-20 years post-PhD, after which attrition rates increase as faculty approach retirement.

      This function shape starts low and ends high, and includes at least one local minimum, which indicates that career age should be odd-ordered in the model and at least order-3, but only including career age up to its 3rd order term tended to miss some of the observed career-age/attrition correlations. We evaluated the fit using 5-fold cross validation with a Brier score loss metric, and among options of polynomials of degree 1, 3, 5, or 7, we found that 5th order performed well on average over all fields (even if it was not the best for every field), without overfitting in fields with fewer data. Example fits, reminiscent of the figure from Spoon et al., are now provided in Figs S4 and S5.

      While the model fit with fifth order terms may not be the best fit for all 111 fields (e.g., 7th order fits better in some cases), we wanted to avoid field-specific curves that might be overfitted to the field-specific data, especially due to low sample size (and thus larger fluctuations) on the high career age side of the function. Our main text and supplement now include justifications for our choice to include career age up to its fifth order terms.
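
      The model comparison described above can be sketched in a few lines. The following is a minimal, self-contained illustration in pure Python, not the code used for the paper: it fits polynomial logistic regressions by plain gradient descent and scores each candidate degree by 5-fold cross-validated Brier score. The function names, the rescaling of career age by 40, the step count and learning rate, and the synthetic records are all assumptions made for illustration.

```python
import math
import random


def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(outcomes)


def poly_features(age, degree, scale=40.0):
    """Powers 0..degree of career age, rescaled (our choice) to roughly [0, 1]."""
    x = age / scale
    return [x ** k for k in range(degree + 1)]


def predict(weights, age):
    """Predicted attrition probability for one career age."""
    feats = poly_features(age, len(weights) - 1)
    z = sum(w * f for w, f in zip(weights, feats))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(ages, left, degree, steps=200, lr=0.5):
    """Fit polynomial logistic regression by gradient descent on the log loss."""
    rows = [poly_features(a, degree) for a in ages]
    weights = [0.0] * (degree + 1)
    for _ in range(steps):
        grad = [0.0] * (degree + 1)
        for row, y in zip(rows, left):
            z = sum(w * f for w, f in zip(weights, row))
            p = 1.0 / (1.0 + math.exp(-z))
            for k, f in enumerate(row):
                grad[k] += (p - y) * f
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def cv_brier(ages, left, degree, n_folds=5, seed=0):
    """Average held-out Brier score over n_folds folds."""
    idx = list(range(len(ages)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        w = fit_logistic([ages[i] for i in train], [left[i] for i in train], degree)
        preds = [predict(w, ages[i]) for i in held_out]
        scores.append(brier_score(preds, [left[i] for i in held_out]))
    return sum(scores) / n_folds


# Synthetic records (career age, left-this-year flag), for illustration only.
rng = random.Random(1)
ages = [rng.randrange(0, 41) for _ in range(200)]
left = [1 if rng.random() < 0.1 else 0 for _ in ages]
for degree in (1, 3, 5, 7):
    print(degree, round(cv_brier(ages, left, degree), 4))
```

      A lower cross-validated Brier score indicates better out-of-sample calibration. A real analysis would likely replace the hand-written gradient descent with an established optimizer, since a few hundred plain gradient steps may not fully converge for the higher-degree fits.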

      You used the 5th order logistic regression (bottom of page 11) to model attrition at different ages. The data in [24] shows that attrition increases sharply, then drops, then increases again with career age. A fifth order polynomial on its own could plausibly do this, but I associate logistic regression models like this with being monotonically increasing (or decreasing!). Again, more details as to how this worked would be useful.

      Our first submission did not explain this point well, but we hope that Supplementary Figures S4 and S5 provide clarity. In short, we agree of course that typical logistic regression assumes a linear relationship between the predictor variables and the log odds of the outcome variable. This means that the relationship between the predictor variables and the probability of the outcome variable follows a sigmoidal (S-shaped) curve. However, the relationship between the predictor variables and the outcome variable may not be linear.

      To capture more complex relationships, like the increasing, decreasing and then increasing attrition rates as a function of career age, higher-order terms can be added to the logistic regression model. These higher-order terms allow the model to capture nonlinear relationships between the predictor variables and the outcome variable — namely the non-monotonic relationship between rates of attrition and career age — while staying within a logistic regression framework.
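
      A small numerical sketch may make this concrete. The coefficients below are hypothetical, chosen only so that the log-odds polynomial has a local maximum at career age 6 and a local minimum at career age 18; they are not estimates from the data, and the function names are invented for illustration. With a linear log-odds term the predicted probabilities are monotonic, whereas with a cubic term they rise, fall, and rise again.

```python
import math


def attrition_probability(career_age, coefs):
    """Logistic model whose log-odds are a polynomial in career_age / 10.

    coefs[k] multiplies (career_age / 10) ** k. All coefficient values used
    in this example are hypothetical and serve only to show the curve shape.
    """
    x = career_age / 10.0
    log_odds = sum(c * x ** k for k, c in enumerate(coefs))
    return 1.0 / (1.0 + math.exp(-log_odds))


def is_monotonic(values):
    """True if the sequence never changes direction."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)


LINEAR = [-3.5, 0.5]                # log-odds linear in career age
CUBIC = [-3.5, 0.648, -0.72, 0.2]   # log-odds with a peak and a dip

ages = range(0, 41)
linear_probs = [attrition_probability(a, LINEAR) for a in ages]
cubic_probs = [attrition_probability(a, CUBIC) for a in ages]

print("linear log-odds monotonic:", is_monotonic(linear_probs))
print("cubic log-odds monotonic:", is_monotonic(cubic_probs))
```

      The logistic link only guarantees that probabilities are monotonic in the log-odds. Once the log-odds are themselves a non-monotonic polynomial of career age, the predicted probabilities inherit that shape.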

      "The career age of new hires follows the average career age distribution of hires" did you use the empirical distribution here or did you fit a standard statistical distribution e.g. Gamma?

      We used the empirical distribution. This information has been added to the updated methods section in the main text.

      How did you account for institution (presumably available)? Your own work has shown that institution type plays a role, which could be contributing to these results.

      See below.

      What other confounding variables could be at play here, what is available as part of the data and what happens if you do/don't account for them?

      A number of variables included in our data have been shown to correlate with faculty attrition, including PhD prestige, current institution prestige, PhD country, and whether or not an individual is a “self-hire,” i.e., trained and hired at the same institution (Wapman et al., Nature, 2022). Additional factors that faculty self-report as reasons for leaving academia include issues of work-life balance, workplace climate, and professional reasons, and in some cases to varying degrees between men and women faculty (Spoon et al., Sci. Adv., 2023).

      Our counterfactual analysis aims to address a specific question: how would women’s representation among faculty be different today if men and women were subjected to the same attrition patterns over the past decade? To answer this question, it is important to account for faculty career age, which we accept as a variable that will always correlate strongly with faculty attrition rates, as long as the tenure filter remains in place and faculty continue to naturally progress towards retirement age. On the other hand, it is less clear why PhD country, self-hire status, or any of the other mentioned variables should necessarily correlate with attrition rates and with gendered differences in attrition rates more specifically. While some or all of these variables may underlie the causal roots of gendered attrition rates, our analysis does not seek to answer causal questions about why faculty leave their jobs (e.g., by testing the impact of accounting for these variables in simulations per the reviewer’s suggestion). This is because we do not believe the data used in this analysis is sufficient to answer such questions, lacking comprehensive data on faculty stress (Spoon et al., Sci. Adv., 2023), parenthood status, etc.

      What career age range did the model use?

      The career age range observed in model outcomes is a function of the empirically derived attrition rates for faculty across academic fields. The highest career age observed in the AARC data was 80, and the faculty career ages that result from our model simulations and projections do not exceed 80.

      We have also added the distribution of faculty across career ages for the projection scenario model outputs in the supplemental materials Fig. S3 (see response to your later comment regarding career age for further details). Looking at these distributions, it is observed that very few faculty have career age > 60, both in observation and in our simulations.

      What was the initial condition for the model?

      Empirical 2011 faculty rosters are used as the initial conditions for the counterfactual analysis, and 2020 faculty rosters are used as the initial conditions for the projections analysis. This information has been added to the descriptions of methods in the main text.

      Starting the model in 2011 how well does it fit the available data up to 2020?

      Thank you for this suggestion. We ran this analysis for each field starting in 2011, and found that model outcomes were statistically indistinguishable from the observed 2020 faculty gender compositions for all 111 academic fields. This finding is not surprising, because the model is fit to the observed data, but it serves to validate the methods that we used to extract the model's parameters. We have added these results to the supplement (Fig. S2).

      What are the sensitivity analysis results for the model? If you had made different fitting decisions, how much would the results change? All this applies to both the hiring and attrition parameter estimates.

      We model attrition and hiring using logistic regression, with career age included as an exogenous variable up to its fifth power. A natural question follows: what if we used a model with career age only to its first or third power? Or to higher powers? We performed this sensitivity analysis, and added three new figures to the supplement to present these findings:

      First, we show the observed attrition probabilities at each career age, and four model fits to attrition data (Supplementary Figs S4 and S5). The first model includes career age only to its first power, and this model clearly does not capture the full career age / attrition correlation structure. The second model includes career age to its third power, which does a better job of fitting to the observed patterns. The third model includes career age up to its fifth power, which appears to very modestly improve upon the former model. The fourth model includes career age up to its seventh power, and the patterns captured by this model are largely the same as the 5th-power model up to career age 50, beyond which there are some notable differences in the inferred attrition probabilities. These differences would have relatively little impact on model outcomes because the vast majority of faculty have a career age below 50.

      Second, we show the observed probability that hires are women, conditional on the career age of the hire. Once again, we fit four models to the data, and find that career age should be included at least up to its fifth order in order to capture the correlation structures between career age and the gender of new hires. However, limited differences result from including career age up to the 7th degree in the model (relative to the 5th degree).

      As a final sensitivity analysis, we reproduce Fig. 2, but rather than including career age as an exogenous variable up to its fifth power in our models for hiring and attrition, we include career age up to its third power. Findings under this parameterization are qualitatively very similar to those presented in Fig. 2, indicating that the results are robust to modest changes to model parameterization (shown in supplement Fig. S6).

      Far more detail in this and some interim results from each stage of the analysis would make the paper far more convincing. It currently has an air of "black box" too much of the analysis which would easily allow an unconvinced reader to discard the results.

      We have added more detailed descriptions of the methods to the main text. We hope that the changes made will address these concerns.

      Section C: You use the Leslie model to predict the future population. As the model is linear the population will either grow exponentially (most likely) or dwindle to zero. You mention you dealt with this by scaling the average value of H to keep the population at 2020 levels? This would change the ratio of hiring to attrition. How did this affect the timescale of the results? If a field had very minimal attrition (and hence grew massively over the time period of the dataset) the hiring rate would have to be very small too, so there would be very little change in the gender balance. Did you consider running the model to steady state instead?

      We chose the 40 year window (2020-2060) for this projection analysis because 40 years is roughly the timespan of a full-length faculty career. In other words, it will take around 40 years for most of the pre-existing faculty from 2020 to retire, such that the new, simulated faculty will have almost entirely replaced all former faculty by 2060.

      For three out of five of our projection scenarios (OA, GNA, OA+ER), the point at which observed faculty are replaced by simulated faculty represents steady state. One way to check this intuition is to observe the asymptotic behavior of the trajectories in Fig. 3B; the slopes for these 3 scenarios nearly level out within 40 years.

      The other two scenarios (OA + IR, GNA+IR) represent situations where women’s representation among new hires is increasing each year. These scenarios will not reach steady state until women represent 100% of faculty. Accordingly, the steady state outcomes for these scenarios would yield uninteresting results; instead, we argue that it is the relative timescales that are interesting.
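
      The replacement logic behind these projections can be illustrated with a deterministic, expected-value sketch. It is a simplification under stated assumptions rather than the stochastic simulation used for the paper: all hires enter at career age 0 (the paper uses the empirical career-age distribution of hires), attrition rates are hypothetical constants, and the function and variable names are invented for illustration.

```python
def project(women, men, attrition_w, attrition_m, hire_share_women, years):
    """Project women's share of faculty while holding total headcount fixed.

    women, men: headcounts indexed by career age (floats are allowed).
    attrition_w, attrition_m: annual attrition probability per career age.
    hire_share_women: fraction of each year's hires who are women.
    Returns the share of women at the end of each simulated year.
    """
    total = sum(women) + sum(men)
    shares = []
    for _ in range(years):
        # Survivors advance one career-age step; the oldest class exits.
        women = [0.0] + [n * (1 - attrition_w[a]) for a, n in enumerate(women[:-1])]
        men = [0.0] + [n * (1 - attrition_m[a]) for a, n in enumerate(men[:-1])]
        # Hire exactly enough people to restore the starting headcount.
        hires = total - sum(women) - sum(men)
        women[0] += hires * hire_share_women
        men[0] += hires * (1 - hire_share_women)
        shares.append(sum(women) / total)
    return shares


# Hypothetical field: 41 career-age classes, 30% women, gender-neutral attrition.
women0 = [3.0] * 41
men0 = [7.0] * 41
rates = [0.05] * 41
shares = project(women0, men0, rates, rates, hire_share_women=0.4, years=41)
print(round(shares[0], 3), round(shares[-1], 3))
```

      With gender-neutral attrition and a fixed share of women among hires, women's representation in this sketch equals the hiring share once every member of the initial roster has aged out, which mirrors the steady-state argument above. If the hiring share instead increased every year, no such plateau would be reached within the window.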

      What did you do to check that your predictions at least felt realistic under the fitted parameters? (see above for presenting the goodness of fit over the 10 years of the data).

      We ran the analysis suggested in a prior comment (Starting the model in 2011 how well does it fit the available data up to 2020?) and found that model outcomes were statistically indistinguishable from the observed 2020 faculty gender compositions for all 111 academic fields, plus the “All STEM” and “All non-STEM” aggregations.

      You only present the final proportion of women for each scenario. As mentioned earlier, models of this type have a tendency to lead to strange population distributions with wild age predictions and huge (or zero populations). Presenting more results here would assuage any worries the reader had about these problems. What is the predicted age distribution of men and women in the long term scenarios? Would a different method of keeping the total population in check have yielded different results? Interim results, especially from a model as complex as this one, rather than just presenting a final single number answer are a convincing validation that your model is a good one! Again, presenting this result will go a long way to convincing readers that your results are sound and rigorous.

      Thank you for this suggestion. We now include a figure that presents faculty age distributions for each projection scenario at 2060 against the observed faculty age distribution in 2020 (Fig. S3 in the supplementary materials). We find that the projected age distributions are very similar to the observed distributions for natural sciences (shown) and for the additional academic domains. We hope this additional validation will inspire confidence in our model of faculty hiring and attrition for the reviewer, and for future readers.

      In Fig S3, line widths for the simulated scenarios span the central 95% of simulations.

      Other people have reached almost identical conclusions (albeit with smaller data sets) that hiring is more important than attrition. It would be good to compare your conclusions with their work in the Discussion.

      We have revised the main text to cite the listed examples of similar studies. We thank the reviewer for bringing these relevant works to our attention.

      General comments:

      What thoughts have you given to non-binary individuals?

      Be careful how you use the term "gender diversity"! In many countries "Gender diverse" is a term used in data collection for non-binary individuals, i.e. Male, female, gender diverse. The phrase "hiring more gender diverse faculty" can be read in different ways! If you are only considering men and women then gender balance may be a better framework to use.

      We have added language to the main text which explicitly acknowledges that our analysis focuses on men and women due to limitations in our name-based gender tool, which only assigns binary (woman/man) labels to faculty. We point out that this is a compromise due to the technical limitations of name-based gender methodologies and is not intended to reinforce a gender binary.

      We have also taken additional care with referring to “gender diversity,” per reviewer 1’s point in their public review.

      Reviewer #2 (Recommendations For The Authors):

      Data availability: I did not see an indication that the dataset used here is publicly available, either in its raw format or as a summary dataset. Perhaps this is due to the sensitive nature of the data, but regardless of the underlying reason, the authors should include a note on data availability in the paper.

      The dataset used for these analyses was obtained under a data use agreement with the Academic Analytics Research Center (AARC). While these data are not publicly available, researchers may apply for data access here: https://aarcresearch.com/access-our-data.

      We also added a table to the supplemental materials (Tab. S3) that reports the estimated number of men and women in each of the 111 fields.

      Additionally, a variety of summary statistics based on this dataset are available online, here: https://github.com/LarremoreLab/us-faculty-hiring-networks/tree/main

      Gender classification: Was an existing package used to classify gender from names in the dataset, or did the authors develop custom code to do so? Either way, this code should be cited. I would also be curious to know what the error rate of these classifications is, and suggest that additional information on potential biases that might result from automated classifications be included in the discussion, under the section describing data limitations. The reliability of name-based gender classification is particularly of interest, as external gender classifications, such as those applied on the basis of an individual's name, may not reflect the gender with which an individual self-identifies. In other words, while for many people their names may reflect their true genders, for others those names may only reflect their gender assigned at birth and not their self-perceived or lived gender identity. Nonbinary faculty are in particular invisibilized here (and through any analysis that assigns binary gender on the basis of name). While these considerations do not detract from the main focus of the study - which was to utilize an existing dataset classified only on the basis of binary gender to assess trends for women faculty - these limitations should be addressed as they provide additional context for the interpretation of the results and suggest avenues for future research.

      We use a free, open-source, and open-data python package called nomquamgender (Van Buskirk et al, 2023) to estimate the strengths of (culturally constructed) name-gender associations. For sufficiently strong associations with a binary gender, we apply those labels to the names in our data. We have updated the main text to make this approach more apparent.

      We have also added language to the main text which explicitly acknowledges that our approach only assigns binary (woman/man) labels to faculty. We point out that this is a compromise due to the technical limitations of name-based gender methodologies and is not intended to reinforce a gender binary.

      As we mentioned in response to the public review, we use a free and open source python package called nomquamgender to estimate the strengths of name-gender associations, and we apply gender labels to the names with sufficiently strong associations with a binary gender. This package is based on a paper by Van Buskirk et al. 2023, “An open-source cultural consensus approach to name-based gender classification,” which documents error rates and potential biases.

      Page 1: The sentence beginning "A trend towards greater women's representation could be caused..." is missing a conjunction. It should likely read: "A trend towards greater women's representation could be caused entirely by attrition, e.g., if relatively more men than women leave a field, OR entirely by hiring..."

      We have edited the paragraph to remove the sentence in question.

      Pages 1-2: The sentence beginning "Although both types of strategy..." and ending with "may ultimately achieve gender parity" is a bit of a run-on; perhaps it would be best to split this into multiple sentences for ease of reading.

      We have revised this run-on sentence.

      Page 2: See comments in the public review about a methods section, the addition of which may help to improve clarity for the readers. Within the existing descriptions of what I consider to be methods (i.e., the first three paragraphs currently under "results"), some minor corrections could be added here. First, consider citing the source of the dataset in the line where it is first described (in the sentence "For these analyses, we exploit a census-level dataset of employment and education records for tenured and tenure-track faculty in 12,112 PhD-granting departments in the United States from 2011-2020.") It also may be helpful to include context here (or above, in the discussion about institutional analyses) about how "departments" can be interpreted. For example, how many institutions are represented across these departments? More information on how the authors eliminated the gendered aspect of patterns in their counterfactual model would be helpful as well; this is currently hinted at on page 4, but could instead be included in the methods section with a call-out to the relevant supplemental information section (S2B).

      We have added a citation to Academic Analytics Research Center’s (AARC) list of available data elements to the data’s introduction sentence. We hope this will allow readers to familiarize themselves with the data used in our analysis.

      Faculty department membership was determined by AARC based on online faculty rosters. 392 institutions are represented across the 12,112 departments present in our dataset. We have updated the main text to include this information.

      Finally, we have added a methods section to the main text, which includes information on how the gendered aspect of attrition patterns was eliminated in the counterfactual model.

      Page 2: Perhaps some indication of how many transitions from an out-of-sample institution might be helpful to readers hoping to understand "edge cases."

      In our analysis, we consider all transitions from out-of-sample institutions to in-sample institutions as hires, and all transitions away from in-sample institutions–whether it be to an out of sample institution, or out of academia entirely–as attritions. We choose to restrict our analysis of hiring and attrition to PhD granting institutions in the U.S. in this way because our data do not support an analysis of other, out-of-sample institutions.

      I also would have liked additional information on how many faculty switched institutions but remained "in-sample and in the same field" - and the gender breakdowns of these institutional changes, as this might be an interesting future direction for studies of gender parity. (For example, readers may be spurred to ask: if the majority of those who move institutions are women, what are the implications for tenure and promotion for these individuals?)

      While these mid-career moves are not counted as attritions in the present analysis, a study of faculty who switch institutions but remain (in-sample) as faculty could shed light on issues of gendered faculty retention at the level of institutions. We share the reviewer’s interest in a more in depth study of mid-career moves and how these moves impact faculty careers, and we now discuss the potential value of such a study towards the end of the paper. In fact, this subject is the topic of a current investigation by the authors!

      Page 3: I was confused by the statement that "of the three types of stable points, only the first point represents an equitable steady-state, in which men and women faculty have equal average career lengths and are hired in unchanging proportions." Here, for example, computer science appears to be close to the origin on Figure 1, suggesting that hiring has occurred in "unchanging proportions" over the study interval. However, upon analysis of Table S2, it appears that changes in hiring in Computer Science (+2.26 pp) are relatively large over the study interval compared to other fields. Perhaps I am reading too literally into the phrase that "men and women faculty are hired in unchanging proportions" - but I (and likely others) would benefit from additional clarity here.

      We had created an arrow along with the computer science label in Fig. 1, but it was difficult to see, which is likely the source of this confusion. This was our fault, and we have moved the “Comp. Sci.” label and its corresponding arrow to be more visible in Figure 1.

      The change in women’s representation in Computer Science due to hiring over 2011-2020 was +2.26 pp, as the reviewer points out, but, consulting Fig. 1 and the corresponding table in the supplement, we observe that this is a relatively small amount of change compared to most fields.

      Page 3: If possible it may be helpful to cite a study (or multiple) that shows that "changes in women's representation across academic fields have been mostly positive." What does "positive" mean here, particularly when the changes the authors observe are modest? Perhaps by "positive" you mean "perceived as positive"?

      We used the term positive in the mathematical sense, to mean greater than zero. We have reworded the sentence to read “women's representation across academic fields has been mostly increasing…” We hope this change clarifies our meaning to future readers.

      Page 3: The sentence that ends with "even though men are more likely to be at or near retirement age than women faculty due to historical demographic trends" may benefit from a citation (of either Figure S3 or another source).

      We now cite the corresponding figure in this sentence.

      Page 4: The two sentences that begin with "The empirical probability that a person leaves their academic career" would benefit from an added citation.

      We have added a citation to the sentences.

      Figure 3: Which 10 academic domains are represented in Panel 3B? The colors appear to correspond to the legend in Panel 3A, but no indication of which fields are represented is provided. If possible, please do so - it would be interesting and informative to be able to make these comparisons.

      This was not clear in the initial version of Fig. 3B, so we now label each domain. For reference, the domains represented in 3B are (from top to bottom):

      ● Health

      ● Education

      ● Journalism, Media, Communication

      ● Humanities

      ● Social Sciences

      ● Public Administration and Policy

      ● Medicine

      ● Business

      ● Natural Sciences

      ● Mathematics and Computing

      ● Engineering

      Page 6: Consider citing relevant figure(s) earlier up in paragraph 2 of the discussion. For example, the first sentence could refer to Figure 1 (rather than waiting until the bottom of the paragraph to cite it).

      Thank you for this suggestion. We now cite Fig. 1 earlier in this discussion paragraph.

      Page 10: A minor comment on the fraction of women faculty in any given year - the authors assume that the proportion of women in a field can be calculated from knowing the number of women in a field and the number of men. This is, again, true if assuming binary genders but not true if additional gender diversity is included. It is likely that the number of nonbinary faculty is quite low, and as such would not cause a large change in the overall proportions calculated here, but additional context within the first paragraph of S1 might be helpful for readers.

      We have added additional context in the first paragraph of S1, explaining that an additional term could be added to the equation to account for nonbinary faculty representation if our data included nonbinary gender annotations. Thank you for making this point.

      Page 10: Please include a range of values for the residual terms of the decomposition of hiring and attrition in the sentence that reads "In Figure S1 we show that the residual terms are small, and thus the decomposition is a good approximation of the total change in women's representation."

      These residual terms range from -0.51pp to 1.14pp (median = 0.2pp). We have added this information to the sentence in question.

      Page 12: It may be helpful to readers to include a description of the information contained in Table S2 in the supplemental text under section S3.

      We refer to table S2 twice in the main text (once in the observational findings, and once for the counterfactual analysis), and the contents of table S2 are described thoroughly in the table caption.

      Reviewer #3 (Recommendations For The Authors):

      (1) There is a potential limitation in the generalizability of the findings, as the study focuses exclusively on US academia. Including international perspectives could have provided a more global understanding of the issues at hand.

      The U.S. focus of this study limits the generalizability of our findings, as non-U.S. faculty may exhibit differences in hiring patterns, retention patterns, and current demographic representations. We have added a discussion of this limitation to the manuscript. Unfortunately, our data do not support international analyses of hiring and attrition.

      (2) I am not sure that everyone who disappeared from the AARC dataset could be counted as "attrition" from academia. Indeed, some might have left academia completely once they disappeared from the AARC dataset. Yet there is also the possibility that some professors left for academic positions in countries outside of the US, or at US institutions that are not included in the AARC dataset. These individuals didn't leave academia. Furthermore, it is also possible that moves to an institution outside of the US or not indexed by AARC are gender-specific. Therefore, the analyses in this study should find a way to test whether the assumption that anyone who disappeared from AARC left academia is indeed valid. If not, how will this potentially challenge the current conclusions?

      The reviewer makes an important point: faculty who move to faculty positions in other countries and faculty who move to non-PhD granting institutions, or to institutions that are otherwise not included in the AARC data are all counted as attritions in our analysis. We intentionally define hiring and attrition broadly to include all cases in which faculty join or leave a field or domain within our dataset.

      The types of transitions that faculty make out of the tenure track system at PhD granting institutions in the U.S. may correlate with faculty attributes, like gender. For example, women or men may be more likely to transition to tenure track positions at non-U.S. institutions. Nevertheless, these types of career transition represent an attrition for the system of study, and a hire for another system. Following this same logic, faculty who transition from one field to another field in our analysis are treated as an attrition from the first field and a hire into the new field.

      By focusing on “all-cause” attrition in this way, we are able to make robust insights for the specific systems we consider (e.g., STEM and non-STEM faculty at U.S. PhD granting institutions), without being roadblocked by the task of annotating faculty departures and arbitrating which should constitute “valid” attritions.

      (3) It would be very interesting to know how much of the attrition was due to tenure failure. Previous studies have suggested that women are less likely to be granted tenure, which makes me wonder about the role that tenure plays in the gendered patterns of attrition in academia.

      We note that faculty attrition rates start low, reach a peak around 5-7 years after earning a PhD, and then decline until around 15-20 years post-PhD, after which attrition rates increase as faculty approach retirement. The first local maximum appears to coincide roughly with the tenure clock timing, but we can only speculate that these attritions are tenure related. Our dataset is unfortunately not equipped to determine the causal mechanisms driving attrition.

      We reproduce the attrition risk curve in the supplementary materials, Fig. S4.

      (4) The dataset used doesn't fully capture the complexities of academic environments, particularly smaller or less research-intensive institutions (regional universities, historically black colleges and universities, and minority-serving institutions). This could be potentially added to the manuscript for discussions.

      We have added this point to the description of this study’s limitations in the discussion.

    2. eLife assessment

      Efforts to increase the representation of women in academia have focussed on efforts to recruit more women and to reduce the attrition of women. This study - which is based on analyses of data on more than 250,000 tenured and tenure-track faculty from the period 2011-2020, and the predictions of counterfactual models - shows that hiring more women has a bigger impact than reducing attrition. The study is an important contribution to work on gender representation in academia, and the evidence in support of the findings is convincing.

    3. Reviewer #1 (Public Review):

      Summary:

      This is an interesting paper that concludes that hiring more women will do more to improve the gender balance of (US) academia than improving the attrition rates of women (which are usually higher than men's). Other groups have reported similar findings, i.e. that improving hiring rates does more for women's representation than reducing attrition, but this study uses a larger than usual dataset that spans many fields and institutions so it is a good contribution to the field.

      The paper is much improved and far more convincing as a result of the revisions made by the authors.

      Strengths:

      A large data set with many individuals, many institutions and fields of research.

      A good sensitivity analysis to test for potential model weaknesses.

      Weaknesses:

      Only a single country with a very specific culture and academic system.

      Complex model fitting with many steps and possible places for model bias.

    4. Reviewer #3 (Public Review):

      Summary:

      This study investigates the roles of faculty hiring and attrition in influencing gender representation in U.S. academia. It uses a comprehensive dataset covering tenured and tenure-track faculty across various fields from 2011 to 2020. The study employs a counterfactual model to assess the impact of hypothetical gender-neutral attrition and projects future gender representation under different policy scenarios. The analysis reveals that hiring has a more significant impact on women's representation than attrition in most fields and highlights the need for sustained changes in hiring practices to achieve gender parity.

      The revisions made by the authors have improved the paper.

      Strengths:

      Overall, the manuscript offers significant contributions to understanding gender diversity in academia through its rigorous data analysis and innovative methodology.

      The methodology is robust, employing extensive data covering a wide range of academic fields and institutions.

      Weaknesses:

      The primary weakness of the study lies in its focus on U.S. academia, which may limit the generalizability of its findings to other cultural and academic contexts. Additionally, the counterfactual model's reliance on specific assumptions about gender-neutral attrition could affect the accuracy of its projections.

      Additionally, the study assumes that anyone who disappears from the dataset has left academia. In reality, some of these apparent attritions could be researchers who moved to another country or to an institution that is not indexed by AA.

    1. eLife assessment

      This valuable study describes mice with a knockout of the IQ motif-containing H (IQCH) gene, to model a human loss-of-function mutation in IQCH associated with male sterility. The infertility is reproduced in the mouse, making it a compelling model, but the mechanistic experiments provide only incomplete evidence for interaction between IQCH and potential RNA binding proteins, which are prominently mentioned in the title. The paper, which has undergone multiple rounds of review, could be of interest to cell biologists and male reproductive biologists working on the sperm flagellar cytoskeleton and mitochondrial structure.

    2. Reviewer #3 (Public Review):

      In this study, Ruan et al. investigate the role of the IQCH gene in spermatogenesis, focusing on its interaction with calmodulin and its regulation of RNA-binding proteins. The authors examined sperm from a male infertility patient with an inherited IQCH mutation as well as Iqch CRISPR knockout mice. The authors found that both human and mouse sperm exhibited structural and morphogenetic defects in multiple structures, leading to reduced fertility in Iqch-knockout male mice. Molecular analyses such as mass spectrometry and immunoprecipitation indicated that RNA-binding proteins are likely targets of IQCH, with the authors focusing on the RNA-binding protein HNRPAB as a critical regulator of testicular mRNAs. The authors used in vitro cell culture models to demonstrate an interaction between IQCH and calmodulin, in addition to showing that this interaction via the IQ motif of IQCH is required for IQCH's function in promoting HNRPAB expression. In sum, the authors concluded that IQCH promotes male fertility by binding to calmodulin and controlling HNRPAB expression to regulate the expression of essential mRNAs for spermatogenesis. These findings provide new insight into molecular mechanisms underlying spermatogenesis and how important factors for sperm morphogenesis and function are regulated.

      The strengths of the study include the use of mouse and human samples, which demonstrate a likely relevance of the mouse model to humans; the use of multiple biochemical techniques to address the molecular mechanisms involved; the development of a new CRISPR mouse model; ample controls; and clearly displayed results. Assays are done rigorously and in a quantitative manner. Overall, the claims made by the authors in this manuscript are well-supported by the data provided.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      By identifying a loss-of-function mutation of IQCH in an infertile patient and generating an Iqch knockout mouse model, Ruan et al. show that IQCH is essential for spermiogenesis. Similar to the infertile patient with the IQCH mutation, Iqch knockout mice are characterized by a cracked flagellar axoneme and abnormal mitochondrial structure. Mechanistically, IQCH regulates the expression of RNA-binding proteins (especially HNRPAB), which are indispensable for spermatogenesis.

      Although this manuscript contains a potentially interesting piece of work that delineates a mechanism of IQCH that associates with spermatogenesis, this reviewer feels that a number of issues require clarification and re-evaluation for a better understanding of the role of IQCH in spermatogenesis.

      Line 251 - 253, "To elucidate the molecular mechanism by which IQCH regulates male fertility, we performed liquid chromatography tandem mass spectrometry (LC‒MS/MS) analysis using mouse sperm lysates and detected 288 interactors of IQCH (Figure 5-source data 1)."

      The reviewer had already raised significant concerns regarding the text above, noting that "LC‒MS/MS analysis using mouse sperm lysates" would not identify interactors of IQCH. However, this issue was not addressed in the revised manuscript. In the Methods section detailing LC-MS/MS, the authors stated that it was conducted on "eluates obtained from IP". However, there was no explanation provided on how IP for LC-MS/MS was performed. Additionally, it was unclear whether LC-MS or LC-MS/MS was utilized. The primary concern is that if LC‒MS/MS was conducted for the IP of IQCH, IQCH itself should have been detected in the results; however, as indicated by Figure 5-source data 1, IQCH was not listed.

      We thank the reviewer for these comments. Additional details regarding the IP protocol for LC-MS/MS analysis have been included in the Methods section of the revised manuscript. Furthermore, we apologize for the previous inconsistencies in the terminology used for LC-MS/MS and have now ensured its consistent usage throughout the document. Regarding the primary concern about the absence of IQCH in Figure 5-source data 1, that file lists only the proteins identified as interacting with IQCH, not IQCH itself. IQCH itself was in fact identified in the LC-MS/MS analysis (Author response table 1). Additionally, we conducted co-IP experiments to validate the interactions identified by LC-MS/MS analysis.

      Author response table 1.

      Results of the LC-MS/MS analysis.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The authors should know what experiments have been done for the studies.

      We apologize for our oversights. The method for RNA-binding protein immunoprecipitation (RIP) has been detailed in the revised manuscript.

      Typos still remain in the text, e.g., line 253, "Fiugre".

      We are sorry for the spelling errors. We have engaged professional editing services to refine our manuscript.

    1. eLife assessment

      This study presents an important finding on the function of PLP1+ enteric glia. The evidence supporting the claims of the authors is solid, although the inclusion of additional data showing the mechanisms by which PLP1+ enteric glia acts on Paneth cells would have strengthened the study. The work will be of interest to researchers working on intestinal biology.

    2. Reviewer #1 (Public Review):

      The role of enteric glial cells in regulating intestinal mucosal functions at a steady state has been a matter of debate in recent years. Enteric glial cell heterogeneity and related methodological differences likely underlie the contrasting findings obtained by different laboratories. Here, Prochera and colleagues used Plp1-CreERT2 driver mice to deplete the majority of enteric glia from the gut. They found that glial loss has very limited effects on the transcriptome of gut cells 11 days after tamoxifen treatment (used to induce DTA expression), and by extension - more specifically, has only minimal impact on cells of the intestinal mucosa. Interestingly, in the colon (where Paneth cells are not present) they did observe transcriptomic changes related to Paneth cell biology. Although no overt gene expression alterations were found in the small intestine - also not in Paneth cells - morphological, ultrastructural, and functional changes were detected in the Paneth cells of enteric glia-depleted mice. In addition, and possibly related to Paneth cell dysfunction, enteric glia-depleted mice also show alterations in intestinal microbiota composition.

      In their analyses of enteric glia from existing single-cell transcriptomic data sets, it is stated that these come from 'non-diseased' humans. However, the data on the small intestine is obtained from children with functional gastrointestinal disorders (Zheng 2023). Data on colonic enteric glia was obtained from colorectal cancer patients (Lee 2020). Although here the cells were isolated from non-malignant regions, saying that the large intestines of these patients are non-diseased is probably an overstatement. Another existing dataset including human mucosal enteric glia of healthy subjects is presented in Smillie et al (2019). It would be interesting to see how the current findings relate to the data from Smillie et al.

      The time between enteric glia depletion and analyses (mouse sacrifice) must be a crucial determinant of the type of effects, and the timing thereof. In the current study 11 days after tamoxifen treatment was chosen as the time point for analyses, which is consistent with earlier work by the lab using the same model (Rao et al 2017). What would happen when they wait longer than 11 days after tamoxifen treatment? Data, not necessarily for all parameters, on later time points would strengthen the manuscript significantly.

      The authors found transcriptional dysregulation related to Paneth cell biology in the colon, where Paneth cells are normally not present. Given the bulk RNA sequencing approach, the cellular identity in which this shift is taking place cannot be determined. However, it would be useful if the authors could speculate on which colonic cell type they reckon this is happening in. On the other hand, enteric glia depletion was found to affect Paneth cells structurally and functionally in the small intestine, where transcriptional changes were initially not identified. Only when performing GSEA with the in silico help of cell type-specific gene profiles, differences in Paneth cell transcriptional programs in the small intestine were uncovered. A comment on this discrepancy would be helpful, especially for the non-bioinformatician readers among us.

      From looking at Figure 3B it is clear that Paneth cells are not the only epithelial cell type affected (after less stringent in silico analyses) by enteric glial cell depletion. Although the authors show that this does not translate into ultrastructural or numerical changes of most of these cell types, this makes one wonder how specific the enteric glia - Paneth cell link is. Besides possible indirect crosstalk (via neurons), it is not clear if enteric glia more closely associate with Paneth cells as compared to these other cell types. Immunofluorescence stainings of some of these cells in the Plp1-GFP mice would be informative here. The authors mention IL-22 as a possible link, but do Paneth cells express receptors for transmitters commonly released by enteric glia? Maybe they can have a look at putative cell-cell interactions by mapping ligand-receptor pairs in the scRNAseq datasets they used.

      Previously the authors showed that enteric glia regulation of intestinal motility is sex-dependent (Rao et al 2017). While enteric glia depletion caused dysmotility in female mice, it did not affect motility in males. For this reason, most experiments in the current study were conducted in male mice only. However, for the experiments focusing on the effect of enteric glia depletion on host-microbiome interactions and intestinal microbiota composition both male and female mice were used. In Figure 8A male and female mice are distinctly depicted but this was not done for Figure 8C. Separate characterization of the microbiome of male and female mice would have helped to figure out how much intestinal dysmotility (in females) contributes to the effect on gut microbial composition. This is an important exercise to confirm that the effect on the microbiome is indeed a consequence of altered Paneth cell function, as suggested by the authors (in the results and discussion, and in the abstract). In this context, it would also be interesting to compare the bulk sequencing data after enteric glia depletion between female and male mice.

    3. Reviewer #2 (Public Review):

      This is an excellent and timely study from the Rao lab investigating the interactions of enteric glia with the intestinal epithelium. Two early studies in the late 1990s and early 2000s had previously suggested that enteric glia play a pivotal role in control of the intestinal epithelial barrier, as their ablation using mouse models resulted in severe and fatal intestinal inflammation. However, it was later identified that these inflammatory effects could have been an indirect product of the transgenic mouse models used, rather than due to the depletion of enteric glia. In previous studies from this lab, the authors had identified expression of PLP1 in enteric glia, and its use in CRE driver lines to label and ablate enteric glia.

      In the current paper, the authors carefully examine the role of enteric glia by first identifying that PLP1-creERT2 is the most useful driver to direct enteric glial ablation, in terms of the number of glial cells targeted, their proximity to the intestinal epithelium, and the relevance for human studies (GFAP expression is rather limited in human samples in comparison). They examined gene expression changes in different regions of the intestine using bulk RNA-seq following ablation of enteric glia by driving expression of diphtheria toxin A (PLP1-creERT2;Rosa26-DTA). Alterations in gene expression were observed in different regions of the gut, with specific effects in different regions. Interestingly, while there were gene expression changes in the epithelium, there were limited changes to the proportions of different epithelial cell types identified using immunohistochemistry in control vs glial-ablated mice. The authors then focused on the investigation of Paneth cells in the ileum, identifying changes in the ultrastructural morphology and lysozyme activity. In addition, they identified alterations in gut microbiome diversity. As Paneth cells secrete antimicrobial peptides, the authors conclude that the changes in gut microbiome are due to enteric glia-mediated impacts on Paneth cell activity.

      Overall, the study is excellent and delves into the different possible mechanisms of action, including the investigation of changes in enteric cholinergic neurons innervating the intestinal crypts. The use of different CRE drivers to target enteric glial cells has led to varying results in the past, and the authors should be commended on how they address this in the Discussion.

    4. Reviewer #3 (Public Review):

      In this study, Prochera, et al. identify PLP1+ cells as the glia that most closely interact with the gut epithelium and show that genetic depletion of these PLP1+ glia in mice does not have major effects on the intestinal transcriptome or the cellular composition of the epithelium. Enteric glial loss, however, causes dysregulation of Paneth cell gene expression that is associated with morphological disruption of Paneth cells, diminished lysozyme secretion, and altered gut microbial composition. Overall, the authors need to first prove whether the Plp1CreER Rosa26DTA/+ mice system is viable. Also, most experimental systems have been evaluated by immunohistochemistry, scRNAseq, and electron microscopy, but need quantitative statistical processing. In addition, the value of the paper would be enhanced if the significance of why the phenotype appeared in the large intestine rather than the small intestine when PLP1 is deficient for Paneth cells is clarified.

      Weaknesses:

      Major:

      (1) Supplementary Figure 2; Cannot be evaluated without quantification.

      (2) Figure 2A; Is Plp1CreER Rosa26DTA/+ mice system established correctly? S100B immunohistology picture is not clear. A similar study is needed for female Plp1CreER Rosa26DTA/+ mice. What is the justification for setting 5 dpt, 11 dpt? Any consideration of changes to organs other than the intestine? Wouldn't it be clearer to introduce Organoid technology?

      (3) Figure 2B; Need an explanation for the 5 genes that were altered in the colon. Five genes should be evaluated by RT-qPCR. Why was there a lack of change in the duodenum and ileum?

      (4) Supplementary Figure 3; Top 3 genes should be evaluated by RT-qPCR.

      (5) Supplementary Figure 4B, C, and D; Why not show analysis in the small intestine?

      (6) Supplementary Figure 4D; Cannot be evaluated without quantification.

      (7) Figure 3D; Cannot be evaluated without quantification.

      (8) Supplementary Figure 5B and C; Top 3 genes should be evaluated by RT-qPCR.

      (9) Supplementary Figure 6; Top 3 genes should be evaluated by RT-qPCR.

      (10) Figure 4A; Cannot be evaluated without quantification.

      (11) Figure 4D; Cannot be evaluated without quantification.

      (12) Additional experiments on in vivo infection systems comparing Plp1CreER Rosa26DTA/+ mice and controls would be great.

    5. Author response:

      We thank the reviewers for their thoughtful consideration of our study and are delighted they found the findings to be important. In this initial response to the overall positive reviews, we want to address common themes raised, clarify points relevant to a few specific reviewer concerns, and frame plans for the revised manuscript.

      (1) Analysis of data from human tissue: Reviewer 1 notes “In their analyses of enteric glia from existing single-cell transcriptomic data sets, it is stated that these come from 'non-diseased' humans. However, the data on the small intestine is obtained from children with functional gastrointestinal disorders (Zheng 2023). Data on colonic enteric glia was obtained from colorectal cancer patients (Lee 2020). Although here the cells were isolated from non-malignant regions, saying that the large intestines of these patients are non-diseased is probably an overstatement.”

      In the Zheng et al. dataset, “functional GI disorders” refers to biopsies from children that do not have any histopathologic evidence of digestive disease. The children do, however, have at least one GI symptom that prompted a diagnostic endoscopy with biopsies, leading to the designation of “functional” disorder. Given that diagnostic endoscopies are invasive procedures that necessitate anesthesia, obtaining biopsies from completely healthy, asymptomatic children without any clinical indication would not be allowable per most institutional review boards, leading the authors of that study to use these samples as a control group. We thus used the “non-diseased” label to encompass these samples as well as those from the unaffected regions of large intestine from colorectal cancer patients. We recognize, however, that this label might be misleading and will revise the manuscript to more accurately reflect the information on control tissue origin.

      “Another existing dataset including human mucosal enteric glia of healthy subjects is presented in Smillie et al (2019). It would be interesting to see how the current findings relate to the data from Smillie et al.”

      We thank the reviewer for directing us to the Smillie et al. 2019 dataset. This dataset derives from colonic mucosal biopsies from 12 healthy adults (8480 stromal cells) and 18 adults with ulcerative colitis (10,245 stromal cells from inflamed bowel segments and 13,146 from uninflamed), all between the ages of 20-77 years. Our preliminary analysis shows that the putative glial cluster in this dataset does not separate by inflammation or disease state based on the common glial genes: S100B, PLP1, and SOX10. PLP1 and S100B are broadly expressed across this cluster while GFAP is not detected in this dataset, consistent with our observations from the two other human datasets included in our manuscript. In the revised manuscript, we will include the Smillie et al. 2019 data in a supplemental figure as additional supportive evidence.

      (2) Validation and further details of the Plp1CreER-DTA model for genetic depletion of enteric glia: Reviewer 1 notes “The time between enteric glia depletion and analyses (mouse sacrifice) must be a crucial determinant of the type of effects, and the timing thereof. In the current study 11 days after tamoxifen treatment was chosen as the time point for analyses, which is consistent with earlier work by the lab using the same model (Rao et al 2017). What would happen when they wait longer than 11 days after tamoxifen treatment?” Reviewer 3 asks whether “the Plp1CreER Rosa26DTA/+ mice system [is] established correctly” and raises concerns about quantitative characterization.

      In previous work, we discovered that the gene Plp1 is broadly expressed by enteric glia and, within the mouse intestine, is quite specific to glial cells (PMID: 26119414). We characterized the Plp1CreER mouse line as a genetic tool in detail in this initial study. Then in a subsequent study, we used Plp1CreER-DTA mice to genetically deplete enteric glia and study the consequences on epithelial barrier integrity, crypt cell proliferation, enteric neuronal health and gastrointestinal motility (PMID: 28711628). In this second study, we performed extensive validation of the Plp1CreER-DTA mouse model including detailed quantification of glial depletion in the small and large intestines across the myenteric, intramuscular and mucosa compartments by immunohistochemical (IHC) staining of whole tissue segments to sample thousands of cells. We found that the majority of S100B+ enteric glia were depleted within 5 days in both sexes, including more than 88% loss of mucosal glia, and that this loss was stable at 3 subsequent timepoints (7, 9 and 14 days post-tamoxifen induction of Cre activity). Glial loss was further confirmed by IHC for GFAP in the myenteric plexus, and by ultrastructural analysis of the small intestine to ensure cell depletion rather than simply loss of marker expression. Our group was the first to use this model to study enteric glia, and since then similar models and our key observations have been replicated by other groups (PMID: 33282743, 34550727). Thus, we consider this model to be well established.

      Reviewer 1 raises an excellent question about examining epithelial health beyond 11 days post-tamoxifen (11dpt) in this model. Particularly given the longer-lived nature of Paneth cells relative to other epithelial cell types, this would be very interesting to explore. Through 11dpt, Cre+ mice are well-appearing and indistinguishable from their Cre-negative control littermates. Unfortunately, a limitation of the Plp1CreER-DTA model is that beyond 11dpt, Cre+ mice become anorexic, lose body weight, and have signs of neurologic debility such as hindlimb weakness and uncoordinated gait that are prominent by 14dpt. These phenotypes are likely the consequence of targeting Plp1+ glia outside the gut, such as Schwann cells and oligodendrocytes (as described in another study which used a similar model to study demyelination in the central nervous system, PMID: 20851998). Given these CNS effects and that starvation is well known to affect Paneth cell phenotypes (PMIDs: 1167179, 21986443), we elected not to examine timepoints beyond 11dpt. Technological advances that enable more selective cell depletion would allow study of more chronic effects of enteric glial loss.

      (3) Sex differences in the microbiome data: All 3 reviewers queried whether there were sex differences in the microbiome data with Reviewer 1 explaining “Previously the authors showed that enteric glia regulation of intestinal motility is sex-dependent (Rao et al 2017). While enteric glia depletion caused dysmotility in female mice, it did not affect motility in males. For this reason, most experiments in the current study were conducted in male mice only. However, for the experiments focusing on the effect of enteric glia depletion on host-microbiome interactions and intestinal microbiota composition both male and female mice were used. In Figure 8A male and female mice are distinctly depicted but this was not done for Figure 8C. Separate characterization of the microbiome of male and female mice would have helped to figure out how much intestinal dysmotility (in females) contributes to the effect on gut microbial composition. This is an important exercise to confirm that the effect on the microbiome is indeed a consequence of altered Paneth cell function…”

      In our microbiome analysis, we initially analyzed males and females separately but did not observe significant differences between the two sexes. Thus, we merged the data to increase the statistical power of the genotype comparisons. It was an oversight on our part to not label the female and male datapoints in Figure 8C as we did for the other data in the manuscript. We will update this graph and related supplemental figures in the revised version. Per Reviewer 2’s suggestion, we will also address this further in the Results and Discussion.

      (4) Reconciling RNA-Seq identification of transcriptional changes in the colon, but not the small intestine, while the GSEA and downstream tissue level morphological and functional analyses detected phenotypes in the small intestine. Reviewers 1 and 3 raised this question with Reviewer 1 noting “…enteric glia depletion was found to affect Paneth cells structurally and functionally in the small intestine, where transcriptional changes were initially not identified. Only when performing GSEA with the in silico help of cell type-specific gene profiles, differences in Paneth cell transcriptional programs in the small intestine were uncovered. A comment on this discrepancy would be helpful, especially for the non-bioinformatician readers among us.” 

      Standard differential gene expression (DEG) analysis of the effects of glial loss revealed significant differences only in the colon, and even there only a handful of genes were changed. These changes were not accompanied by corresponding changes at the protein level, at least as detectable by IHC. In the small intestine, there were no significant differences by standard DEG thresholds. Unlike DEG analysis, gene set enrichment analysis (GSEA) provides a significance value based on whether a higher-than-chance number of genes in a set change in a uniform direction, without regard to whether the magnitude of change of any individual gene is significant. Therefore, GSEA detected that a significant number of genes in the curated Paneth cell gene list exhibited a positive fold change in the bulk RNA sequencing data. This prompted us to examine Paneth cells and other epithelial cell types in more detail by IHC, functional and ultrastructural analyses, which all converged on the observation that Paneth cells were relatively selectively disrupted in the epithelium of glial-depleted mice.
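      For readers less familiar with the distinction, the following toy sketch may help. It is not the analysis pipeline used in the study: the gene names, fold-change values, the |log2FC| >= 1 cutoff, and the unweighted Kolmogorov-Smirnov style running-sum score are all assumptions chosen for illustration. It shows how a gene set whose members all shift modestly in the same direction can be strongly enriched even though no single gene passes a per-gene threshold.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (KS-style).

    Walk down the ranked list, stepping up at each gene in the set and
    down at each gene outside it; return the largest deviation from zero.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def permutation_p_value(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Empirical p-value: how often a random set of equal size scores as extreme."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, gene_set)
    extreme = 0
    for _ in range(n_perm):
        null_set = set(rng.sample(ranked_genes, len(gene_set)))
        if abs(enrichment_score(ranked_genes, null_set)) >= abs(observed):
            extreme += 1
    return (extreme + 1) / (n_perm + 1)


# Hypothetical data: 200 background genes spread evenly over [-0.5, 0.5),
# plus a 15-gene set with small, uniformly positive log2 fold changes.
log2fc = {f"bg{i}": -0.5 + i / 200 for i in range(200)}
toy_set = {f"set{j}" for j in range(15)}
for j in range(15):
    log2fc[f"set{j}"] = 0.3025 + 0.01 * j

# Per-gene view: count genes passing an assumed |log2FC| >= 1 cutoff.
n_deg = sum(abs(fc) >= 1.0 for fc in log2fc.values())

# Set-level view: rank all genes by fold change and score the set.
ranked = sorted(log2fc, key=log2fc.get, reverse=True)
es = enrichment_score(ranked, toy_set)
p_value = permutation_p_value(ranked, toy_set)

print(f"genes passing per-gene cutoff: {n_deg}")
print(f"enrichment score: {es:.3f}, permutation p-value: {p_value:.4f}")
```

      In this toy setting the per-gene count is zero by construction, while the set-level score is large because the set members cluster near the top of the ranking. Published GSEA implementations typically weight the running sum by the ranking metric and correct for multiple gene sets, which this sketch omits.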

      (5) Other: We will address all remaining comments in our detailed author response that will accompany our revised manuscript. We thank Reviewer 2 for the very positive feedback overall and highlighting opportunities to better label findings in some of the figures. We will make these suggested changes in our revised manuscript.

    1. eLife assessment

      This valuable study provides solid in vivo data that transfer of IL-15/IL-12-conditioned syngeneic NK cells after primary tumor resection promotes long-term survival of mice with low metastatic burden from breast cancer. Also, the authors conducted an investigator-initiated clinical trial that demonstrated that similar NK cell infusions in cancer patients after resections were safe and showed signs of efficacy. Therefore, this study is of interest and value to oncologists in the field of breast cancer research.

    2. Reviewer #1 (Public Review):

      Summary:

      This is a very nice paper in which the authors addressed the potential for NK cell cellular therapy to treat and potentially eliminate previously established metastases after surgical resections, which are a major cause of death in human cancer patients. To do so they developed a model using the EO771 breast cancer cell line, in which they establish and then resect tumors and the draining lymph node, after which the majority of mice eventually succumb to metastatic disease. They found that when the initiating tumors were resected when still relatively small, adoptive transfers of IL-15/12-conditioned NK cells substantially enhanced the survival of tumor-bearing animals. They then delved into the cellular mechanisms involved. Interestingly and somewhat unexpectedly, the therapeutic effect of the transferred NK cells was dependent on the host's CD8+ T cells. Accordingly, the NK cell therapy contributed to the formation of tumor-specific CD8+ T cells, which protected the recipient animals against tumor re-challenge and were effective in protecting mice from tumor formation when transferred to naive mice. Mechanistically, they used Ifng knockout NK cells to provide evidence that IFNgamma produced by the transferred NK cells was crucial for the accumulation and activation of DCs in the metastatic lung, including expression of CD86, CD40, and MHC genes. In turn, IFNgamma production by NK cells was essential for the induced accumulation of activated CD8 effector T cells and stem cell-like CD8 T cells in the metastatic lung. The authors then expanded their findings from the mouse model to a small clinical trial. They found that inoculations of IL-15/12-conditioned autologous NK cells in patients with various malignancies after resection were safe and showed signs of efficacy.

      Strengths:

      - Monitoring long-term metastatic disease and survival after resection, as done in this paper, is a physiological model that resembles clinical scenarios more closely than the animal models usually used, a great strength of the approach.

      - Previous literature focused on the notion that NK cells clear metastatic lesions directly, within a short period. The authors' use of a more relevant model and time frame revealed the previously unexplored T cell-dependent mechanism of action of infused NK cells for long-term control of metastatic diseases.

      - Also important, the paper provides solid evidence for the contribution of IFNgamma produced by NK cells for activation of dendritic cells and T cells. This is an interesting finding that provokes additional questions concerning the action of the interferon-gamma in this context.

      - The results from the clinical trial in cancer patients, based on the same type of IL-15/12-conditioned NK cell infusions, were encouraging with respect to safety and showed signals of efficacy, which support the translatability of the authors' findings.

      Weaknesses:

      - Having demonstrated that NK cell IFNgamma is important for recruiting and activating DCs and T cells in their model, one is left to wonder whether it is important for the therapeutic effect, which was not tested.

      - Relatedly, previous studies, cited by the authors, reported that NK cells promote T cell activation by producing the chemokines CCL5 and XCL1, and FLT3 ligand, which respectively recruit and activate dendritic cells that can subsequently mobilize a T cell response. The present study demonstrates an important role for NK cell-produced IFNgamma in these processes. One is left wondering whether the model used by the authors is also dependent on CCL5, XCL1, and FLT3 production by NK cells, and if so whether IFNgamma plays a role in that or acts in parallel. The issue could be discussed by the authors, even if they cannot easily resolve it.

      - The authors do not address whether the IL-12 in their cocktail is essential for the effects they see. Relatedly, it was of interest that despite the effectiveness of the transferred IL-15/IL-12 cultured NK cells, the cells failed to persist very long after transfer. Published studies have reported that so-called memory-like NK cells, which are pre-activated with a cocktail of IL-12, IL-18 and IL-15, persist much longer in lympho-depleted mice and patients than IL-2 cultured NK cells. It would be illuminating to compare these two types of NK cell products in the authors' model system, with or without lymphodepletion, to identify the critical parameters. If greater persistence occurred with the memory-like NK cell product, it is possible that the NK cells might provide greater benefit, including by directly targeting the tumor.

      - It was somewhat difficult to gauge the clinical trial results because the trial was early stage and therefore not controlled. Evaluation of the results therefore relies on historical comparisons. To evaluate how encouraging the results are, it would be valuable for the authors to provide some context on the prognoses and likely disease progression of these patients at the time of treatment.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors show convincing data that increasing NK cell function/frequency can reduce the development and progression of metastatic disease after primary tumor resection.

      Strengths:

      The inclusion of a first-in-human trial highlighting some partial responses of metastatic patients treated with in vitro expanded NK cells is tantalising. It is difficult to perform trials in preventing further metastasis since the timelines are very protracted. However, more data like these that highlight the role of NK cells in improving local cDC1/T cell anti-tumor immunity will encourage deeper thinking around therapeutic approaches to target endogenous NK cells to achieve the same.

      Weaknesses:

      As always, more patient data would help increase confidence in the human relevance of the approach.

    1. eLife assessment

      This valuable study adopted a multi-omics approach to elucidate the regulatory mechanism underlying parturition and myometrial quiescence. The data presented to support the main conclusion remains incomplete. This work will be of interest to both basic researchers who work on reproductive biology and clinicians who practice reproductive medicine.

    2. Reviewer #1 (Public Review):

      Summary:

      The use of a multi-omics approach to elucidate the regulatory mechanism underlying parturition and myometrial quiescence adds novelty to the study. The identification of myometrial cis-acting elements and their association with gene expression, particularly the regulation of the PLCL2 gene by PGR, opens the door to further investigation of the impact of PGR and other regulators.

      Strengths:

      (1) Multi-Omic Approach: The paper employs a comprehensive multi-omic approach, combining ChIP-Seq, RNA-Seq, and CRISPRa-based Perturb-Seq assays, which allow for a thorough investigation of the regulatory mechanisms underlying myometrial gene expression.

      (2) Clinical Relevance: Investigating human myometrial specimens provides direct clinical relevance, as understanding the molecular mechanisms governing parturition and myometrial quiescence can have significant implications for the management of pregnancy-related disorders.

      (3) Functional work: For functional screening, they used CRISPRa-based screening of PLCL2 gene regulation in the immortalized human cell lines hTERT-HM and T-hESC. This adds another dimension to the work and strengthens their finding of PGR-dependent regulation of the PLCL2 gene in human myometrial cells.

      Weaknesses:

      (1) Variability in epigenomic mapping: The significant variations in the number and location of H3K27ac-positive intervals across different samples and studies suggest potential challenges in accurately mapping the myometrial epigenome. This variability may introduce uncertainty and complicate the interpretation of results.

      (2) Sample specificity: The study focuses on term pregnant nonlabor myometrial specimens, limiting the generalizability of the findings to other stages of pregnancy or labor.

      (3) Limited Understanding of Regulatory Mechanisms: While the study identifies potential regulatory programs within super-enhancers, the exact mechanisms by which these enhancers regulate gene expression and cellular functions in the myometrium remain unclear. Further mechanistic studies are needed to elucidate these processes.

      (4) Discordant analysis: Why are regular enhancers being understood in terms of motif enrichment of transcription factors and super-enhancers in terms of pathways enriched for active genes? This needs a clear reason.

    3. Reviewer #2 (Public Review):

      Summary:

      In "Assessment of the Epigenomic Landscape in Human Myometrium at Term Pregnancy" the authors generate a number of genome-wide data sets to investigate epigenomic and transcriptomic regulation of the myometrium at term pregnancy. These data provide a useful resource for further evaluation of gene regulatory mechanisms in the myometrium and include the first Hi-C data published for this tissue. There is a comprehensive comparison to previously published histone modification data and integration with RNA-seq to highlight potential enhancer-gene regulatory relationships. The authors further investigate putative enhancers upstream of the PLCL2 gene and identify a candidate region that may be regulated by the PGR (progesterone receptor) signaling.

      Strengths:

      The strengths of this study are in the multi-omics nature of the design as several genome-wide data sets are generated from the same patient samples. Extending this type of approach in the future to a larger number of samples will allow for additional investigation into gene regulation as the correlation between epigenomic features and gene expression across a larger number of samples can reveal regulatory relationships.

      Weaknesses:

      One of the most interesting aspects of this study is the generation of the first Hi-C data for the human pregnant myometrium. However, there is minimal description of the Hi-C data analysis in the results section, and the only data shown are the number of loops identified and one such loop that includes the PLCL2 promoter, shown in Figure 3A. The manuscript would benefit from a more extensive analysis of the Hi-C data. For example, an analysis of TADs (topologically associating domains) would be interesting to add and could be used to evaluate to what extent H3K27ac domains and putative regulated genes fall within the same TAD.

      The authors present some convincing evidence on the transcriptional regulation of the PLCL2 gene using Perturb-Seq to identify putative upstream enhancer regions and PGR over-expression showing PGR can act as an activator. These two experiments on their own are interesting, however, they are not as mechanistically integrated as they could be to clarify the molecular mechanisms. Deletion of the putative enhancer upstream of PLCL2 followed by over-expression of PGR would clarify the mechanistic relationship between the proposed enhancer, PGR, and PLCL2 expression. Does PGR act through the proposed enhancer? In addition, reporter assays using this proposed enhancer region with and without increased expression of PGR and mutation of any PRE sequences would also provide mechanistic insight. Although CRISPRa and Perturb-Seq can be used to identify potential regulatory regions, the best approach to verify the requirement for a particular enhancer in regulating a specific gene is a deletion approach.

    4. Reviewer #3 (Public Review):

      In this manuscript, Wu et al. investigate active H3K27ac and H3K4me1 marks in term pregnant nonlabor myometrial biopsies, linking putative-enhancers and super-enhancers to gene expression levels. Through their findings, they reveal the PGR-dependent regulation of the PLCL2 gene in human myometrial cells via a cis-acting element located 35-kilobases upstream of the PLCL2 gene. By targeting this region using a CRISPR activation system, they were able to elevate the endogenous PLCL2 mRNA levels in immortalized human myometrial cells.

      This research offers novel insights into the molecular mechanisms governing gene expression in myometrial tissues, advancing our understanding of pregnancy-related processes.

      Major comments:

      (1) A more comprehensive analysis of the epigenetic and transcriptomic data would have strengthened the paper, moving beyond basic association studies. Currently, it is challenging to assess the quality and significance of the data as much of the information is lacking.

      (2) The rationale for and connections between experiments, as well as results, could be bolstered to underscore the significance of this research.

      Strengths:

      - The combination of ChIP-Seq, RNA-Seq, and CRISPRa Perturb-Seq approaches to investigate gene regulation and expression in myometrial cells.

      - The use of CRISPR activation system to specifically target cis-acting elements.

      Weaknesses:

      - The manuscript would strongly benefit from a deeper analysis of the Omic datasets. Furthermore, expanding figures/graphs to effectively contextualize these datasets would be greatly beneficial and would add more value to this research. Currently, it is difficult for us to assess and appreciate the quality of these data sets across the manuscript, which is mostly correlative.

      - Limited sample size, coupled with variability in results and overall lack of details, compromises the robustness of result interpretation.

      - For most parts of the results section, a better description is needed, including rationale, approach, and presentation of data. As it stands, it is challenging to assess the quality of the data and appreciate the results.

      - Additional efforts are needed to dissect the proposed regulatory mechanisms.

      - While the discussion provided helpful context for understanding some of the experiments performed, it lacked interpretation of the results in relation to the existing literature.

    1. eLife assessment

      In this valuable study, the authors sought to investigate the associations of age at breast cancer onset with the incidence of myocardial infarction and heart failure. Based on results from a series of solid statistical analyses, the authors conclude that a younger onset age of breast cancer is associated with myocardial infarction and heart failure, highlighting the need to carefully monitor the cardiovascular status of women who have been diagnosed with breast cancer.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors sought to investigate the associations of age at breast cancer onset with the incidence of myocardial infarction (MI) and heart failure (HF). They employed a secondary data analysis of the UK Biobank. They used descriptive and inferential analysis, including Cox proportional hazards models, to investigate the associations. Propensity score matching was also used. They found that among participants with breast cancer, younger onset age was significantly associated with elevated risks of MI (HR=1.36, 95% CI: 1.19 to 1.56, P<0.001) and HF (HR=1.31, 95% CI: 1.18 to 1.46, P<0.001). They reported similar findings after propensity matching.

      Strengths:

      The use of a large dataset is a strength of the study as the study is well-powered to detect differences. Reporting both the unmatched and the propensity-matched estimates was also important for statistical inference.

      Weaknesses:

      Despite the merits of the paper, readers may get confused as to whether the authors are referring to "age at breast cancer onset" or "age at breast cancer diagnosis". I suppose the title refers to the latter, in which case it would be best to use "age at breast cancer diagnosis" consistently throughout the manuscript. I would recommend revising the title to make it explicit that the authors are referring to "age at breast cancer diagnosis".

    3. Reviewer #2 (Public Review):

      This is a well-presented large analysis from the UK Biobank of nearly 250,000 female adults. The authors examined the associations of breast cancer diagnosis with incident myocardial infarction and heart failure by different onset age groups. Based on results from a series of statistical analyses, the authors concluded that younger onset age of breast cancer was associated with myocardial infarction and heart failure, highlighting the necessity of careful monitoring of cardiovascular status in women diagnosed with breast cancer, especially younger women.

      Comments to consider:

      (1) It's thoughtful for the authors to have included and adjusted for menopausal status, breast cancer surgery, and hormone replacement therapy in their sensitivity analysis. It would be informative if the authors presented the number and percentages of menopause and cancer treatments.

      (2) The analytical baseline used for follow-up should be pointed out in the methods section. It's confusing whether the analytic baseline was defined as the study baseline or the time at breast cancer diagnosis.

      (3) Did the older onset age group have a longer follow-up duration? Could the authors provide information on the length of follow-up by age of onset in Supplementary Table S4? It would give the readers more information regarding different age groups.

    1. eLife assessment

      This study combines genetic, cell biological, and interaction data to propose a model of meiotic double-strand break regulation in C. elegans. Comprehensive cataloging of their interactions (physical and genetic) would be valuable information for the field. However, the analyses used in the manuscript are not consistent or comprehensive, and therefore the evidence to support their model is currently incomplete.

    2. Reviewer #1 (Public Review):

      Summary:

      The manuscript by Raices et al. provides novel insights into the role and interactions between SPO-11 accessory proteins in C. elegans. The authors propose a model of meiotic DSB regulation, critical to our understanding of DSB formation and ultimately crossover regulation and accurate chromosome segregation. The work also emphasizes the commonalities and species-specific aspects of DSB regulation.

      Strengths:

      This study capitalizes on the strengths of the C. elegans system to uncover genetic interactions between a large number of SPO-11 accessory proteins. In combination with physical interactions, the authors synthesize their findings into a model, which will serve as the basis for future work, to determine mechanisms of DSB regulation.

      Weaknesses:

      The methodology, although standard, lacks quantification. This includes the mass spectrometry data, along with the cytology. The work would also benefit from clarifying the role of the DSB machinery on the X chromosome versus the autosomes.

    3. Reviewer #2 (Public Review):

      Summary:

      Meiotic recombination initiates with the formation of DNA double-strand breaks (DSBs), catalyzed by the conserved topoisomerase-like enzyme Spo11. Spo11 requires accessory factors that are poorly conserved across eukaryotes. Previous genetic studies have identified several proteins required for DSB formation in C. elegans to varying degrees; however, how these proteins interact with each other to recruit the DSB-forming machinery to chromosome axes remains unclear.

      In this study, Raices et al. characterized the biochemical and genetic interactions among proteins that are known to promote DSB formation during C. elegans meiosis. The authors examined pairwise interactions using yeast two-hybrid (Y2H) and co-immunoprecipitation and revealed an interaction between a chromatin-associated protein HIM-17 and a transcription factor XND-1. They further confirmed the previously known interaction between DSB-1 and SPO-11 and showed that DSB-1 also interacts with a nematode-specific HIM-5, which is essential for DSB formation on the X chromosome. They also assessed genetic interactions among these proteins, categorizing them into four epistasis groups by comparing phenotypes in double vs. single mutants. Combining these results, the authors proposed a model of how these proteins interact with chromatin loops and are recruited to chromosome axes, offering insights into the process in C. elegans compared to other organisms.

      Weaknesses:

      This work relies heavily on Y2H, which is notorious for having high rates of false positives and false negatives. Although the interactions between HIM-17 and XND-1 and between DSB-1 and HIM-5 were validated by co-IP, the significance of these interactions was not tested, and cataloging Y2H interactions does not yield much more insight. Moreover, most experiments lack rigor, which raises serious concerns about whether the data convincingly support the conclusions of this paper. For instance, the XND-1 antibody appears to detect a band in the control IP; however, there was no mention of the specificity of this antibody. Additionally, epistasis analysis of various genetic mutants is based on the quantification of DAPI bodies in diakinesis oocytes, but the comparisons were made without statistical analyses. For cytological data, a single representative nucleus was shown without quantification or rigorous analysis. The rationale for some experiments is also questionable (e.g. the rescue of dsb-2 mutants by him-5 transgenes in Figure 2), making the interpretation of the data unclear. Overall, while this paper claims to present "the first comprehensive model of DSB regulation in a metazoan", cataloging Y2H and genetic interactions did not yield any new insights into DSB formation without rigorous testing of their significance in vivo. The model proposed in Figure 4 is also highly speculative.

    4. Reviewer #3 (Public Review):

      During meiosis in sexually reproducing organisms, double-strand breaks are induced by a topoisomerase-related enzyme, Spo11, which is essential for homologous recombination, which in turn is required for accurate chromosome segregation. Additional factors control the number and genome-wide distribution of breaks, but the mechanisms that determine both the frequency and preferred location of meiotic DSBs remain only partially understood in any organism.

      The manuscript presents a variety of different analyses that include variable subsets of putative DSB factors. It would be much easier to follow if the analyses had been more systematically applied. It is perplexing that several factors known to be essential for DSB formation (e.g., cohesins, HORMA proteins) are excluded from this analysis, while it includes several others that probably do not directly contribute to DSB formation (XND-1, HIM-17, CEP-1, and PARG-1). The strongest claims seem to be that "HIM-5 is the determinant of X-chromosome-specific crossovers" and "HIM-5 coordinates the actions of the different accessory factors sub-groups." Prior work had already shown that mutations in him-5 preferentially reduce meiotic DSBs on the X chromosome. While it is possible that HIM-5 plays a direct role in DSB induction on the X chromosome, the evidence presented here does not strongly support this conclusion. It is also difficult to reconcile this idea with evidence from prior studies that him-5 mutations predominantly prevent DSB formation on the sex chromosomes, while the protein localizes to autosomes. The one experiment that seems to elicit the conclusion that HIM-5 expression is sufficient for breaks on the X chromosome is flawed (see below). The conclusion that HIM-5 "coordinates the activities of the different accessory sub-groups" is not supported by data presented here or elsewhere.

      Like most other studies that have examined DSB formation in C. elegans, this work relies on indirect assays, here limited to the cytological appearance of RAD-51 foci and bivalent chromosomes, as evidence of break formation or lack thereof. Unfortunately, neither of these assays has the power to reveal the genome-wide distribution or number of breaks. These assays have additional caveats, due to the fact that RAD-51 association with recombination intermediates and successful crossover formation both require multiple steps downstream of DSB induction, some of which are likely impaired in some of the mutants analyzed here. This severely limits the conclusions that can be drawn. Given that the goal of the work is to understand the effects of individual factors on DSB induction, direct physical assays for DSBs should be applied; many such assays have been developed and used successfully in other organisms.

      Throughout the manuscript, the writing conflates the roles played by different factors that affect DSB formation in very different ways. XND-1 and HIM-17 have previously been shown to be transcription factors that promote the expression of many germline genes, including genes encoding proteins that directly promote DSBs. Mutations in either xnd-1 or him-17 result in dysregulation of germline gene expression and pleiotropic defects in meiosis and fertility, including changes in chromatin structure, dysregulation of meiotic progression, and (for xnd-1) progressive loss of germline immortality. It is thus misleading to refer to HIM-17 and XND-1 as DSB "accessory factors" or to lump their activities with those of other proteins that are likely to play more direct roles in DSB induction. For example, statements such as the following sentence in the Introduction should be omitted or explained more clearly: "xnd-1 is also unique among the accessory factors in influencing the timing of DSBs; in the absence of xnd-1, there is precocious and rapid accumulation of DSBs as monitored by the accumulation of the HR strand-exchange protein RAD-51." The evidence that HIM-17 promotes the expression of him-5 presented here corroborates data from other publications, notably the recent work of Carelli et al. (2022), but this conclusion should not be presented as novel here. The other factors also fall into several different functional classes, some of which are relatively well understood, based largely on studies in other organisms. The roles of RAD-50 and MRE-11 in DSB induction have been investigated in yeast and other organisms as well as in several prior studies in C. elegans. DSB-1, DSB-2, and DSB-3 are homologs of relatively well-studied meiotic proteins in other organisms (Rec114 and Mei4) that directly promote the activity of Spo11, although the mechanism by which they do so is still unclear. Mutations in PARG-1 (a Poly-ADP ribose glycohydrolase) likely affect the regulation of poly-ADP-ribose addition and removal at sites of DSBs, which in turn are thought to regulate chromatin structure and recruitment of repair factors; however, there is no convincing evidence that PARG-1 directly affects break formation. CEP-1 is a homolog of p53 and is involved in the DNA damage response in the germline, but again is unlikely to directly contribute to DSB induction. HIM-5 and REC-1 do not have apparent homologs in other organisms and play poorly understood roles in promoting DSB induction. A mechanistic understanding of their functions would be of value to the field, but the current work does not shed light on this. A previous paper (Chung et al. G&D 2015) concluded that HIM-5 and REC-1 are paralogs arising from a recent gene duplication, based on genetic evidence for a partially overlapping role in DSB induction, as well as an argument based on the genomic location of these genes in different species; however, these proteins lack any detectable sequence homology and their predicted structures are also dissimilar (both are largely unstructured but REC-1 contains a predicted helical bundle lacking in HIM-5). Moreover, the data presented here do not reveal overlapping sets of genetic or physical interactions for the two genes/proteins. Thus, this earlier conclusion was likely incorrect, and this idea should not be restated uncritically here or used as a basis to interpret phenotypes.

      DSB-1 was previously reported to be strictly required for all DSB and CO formation in C. elegans. Here the authors test whether the expression of HIM-5 from the pie-1 promoter can rescue DSB formation in dsb-1 mutants, and claim to see some rescue, based on an increase in the number of nuclei with one apparent bivalent (Figure 2C). This result seems to be the basis for the claim that HIM-5 coordinates the activities of other DSB proteins. However, this assay is not informative, and the conclusion is almost certainly incorrect. Notably, a substantial number of nuclei in the dsb-1 mutant (without Ppie-1::him-5) are reported as displaying a single bivalent (11 DAPI staining bodies) despite prior evidence that DSBs are absent in dsb-1 mutants; this suggests that the way the assay was performed resulted in false positives (bivalents that are not actually bivalents), likely due to inclusion of nuclei in which univalents could not be unambiguously resolved in the microscope. A slightly higher level of nuclei with a single unresolved pair of chromosomes in the dsb-1; Ppie-1::him-5 strain is thus not convincing evidence for rescue of DSBs/CO formation, and no evidence is presented that these putative COs are X-specific. The authors should provide additional experimental evidence - e.g., detection of RAD-51 and/or COSA-1 foci or genetic evidence of recombination - or remove this claim. The evidence that expression of Ppie-1::him-5 may partially rescue DSB abundance in dsb-2 mutants is hard to interpret since it is currently unknown why C. elegans expresses 2 paralogs of Rec114 (DSB-1 and DSB-2), and the age-dependent reduction of DSBs in dsb-2 mutants is not understood.

      Several of the factors analyzed here, including XND-1, HIM-17, HIM-5, DSB-1, DSB-2, and DSB-3, have been shown to localize broadly to chromatin in meiotic cells. Co-immunoprecipitation of pairs of these factors, even following benzonase digestion, is not strong evidence to support a direct physical interaction between proteins. Similarly, the super-resolution analysis of XND-1 and HIM-17 (Figure 1EF) does not reveal whether these proteins physically interact with each other, and does not add to our understanding of these proteins' functions, since they are already known to bind to many of the same promoters. Promoters are also likely to be located in chromatin loops away from the chromosome axis, so in this respect, the localization data are also confirmatory rather than novel.

      The phenotypic analysis of double mutant combinations does not seem informative. A major problem is that these different strains were only assayed for bivalent formation, which (as mentioned above) requires several steps downstream of DSB induction. Additionally, the basis for many of the single mutant phenotypes is not well understood, making it particularly challenging to interpret the effects of double mutants. Further, some of the interactions described as "synergistic" appear to be additive, not synergistic. While additive effects can be used as evidence that two genes work in different pathways, this can also be very misleading, especially when the function of individual proteins is unknown. I find that the classification of genes into "epistasis groups" based on this analysis does not shed light on their functions and indeed seems in some cases to contradict what is known about their functions.

      The yeast two-hybrid (Y2H) data are only presented as a single colony. While it is understandable to use a 'representative' colony, it is ideal to include a dilution series for the various interactions, which is how Y2H data are typically shown.

      Additional (relatively minor) concerns about these data:

      (1) Several interactions reported here seem to be detected in only one direction - e.g., MRE-11-AD/HIM-5-BD, REC-1-AD/XND-1-BD, and XND-1-AD/HIM-17-BD - while no interactions are seen with the reciprocal pairs of fusion proteins. I'm not sure if some of this is due to pasting "positive" colony images into the wrong position in the grid, but this should be addressed.

      (2) DSB-3 was only assayed in pairwise combinations with a subset of other proteins; this should be explained; it is also unclear why the interaction grids are not symmetrical about the diagonal.

      (3) I don't understand why the graphic summaries of Y2H data are split among 3 different figures (1, 2, and 3).

    1. eLife assessment

      Using experiments in the whitefly, this manuscript provides evidence that the bacterial symbiont Wolbachia can be transmitted from parasitoid wasps to their insect hosts. Characterizing the transfer of Wolbachia between insect species is a valuable attempt to explain the widespread distribution of this intracellular bacterium. This paper is incomplete, as it does not furnish sufficient data to support several of its claims, for which additional methods and data are necessary.

    2. Reviewer #1 (Public Review):

      Summary and Strengths:

      The ability of Wolbachia to be transmitted horizontally during parasitoid wasp infections is supported by phylogenetic data here and elsewhere. Experimental analyses have shown evidence of wasp-to-wasp transmission during coinfection (eg Huigens et al), host-to-wasp transmission (eg Heath et al), and mechanical ('dirty needle') transmission from host to host (Ahmed et al). To my knowledge, this manuscript provides the first experimental evidence of wasp-to-host transmission. Given the strong phylogenetic pattern of host-parasitoid Wolbachia sharing, this may be of general importance in explaining the distribution of Wolbachia across arthropods. This is of interest as Wolbachia is extremely common in the natural world and influences many aspects of host biology.

      Weaknesses:

      The first observation of the manuscript is that the Wolbachia strains in hosts are more closely related to those in their parasitoids. This has been reported on multiple occasions before, dating back to the late 1990s. The introduction cites five such papers (the observation is made in other studies too that could be cited) but then dismisses them by stating "However, without quantitative tests, this observation could simply reflect a bias in research focus." As these studies include carefully collected datasets that were analysed appropriately, I felt this claim of novelty was rather strong. It is unclear why downloading every sequence in GenBank avoids any perceived biases, when presumably the authors are reanalysing the data in these papers.

      I do not doubt the observation that host-parasitoid pairs tend to share related Wolbachia, as it is corroborated by other studies, the effect size is large, and the case study of whitefly is clearcut. It is also novel to do this analysis on such a large dataset. However, the statistical analysis used is incorrect as the observations are pseudo-replicated due to phylogenetic non-independence. When analysing comparative data like this it is essential to correct for the confounding effects of related species tending to be similar due to common ancestry. In this case, it is well-known that this is an issue as it is a repeated observation that related hosts are infected by related Wolbachia. However, the authors treat every pairwise combination of species (nearly a million pairs) as an independent observation. Addressing this issue is made more complex because there are both the host and symbiont trees to consider. The additional analysis in lines 123-124 (including shuffling species pairs) does not explicitly address this issue.
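      To illustrate the concern, here is a minimal Python sketch (not the authors' analysis). It first shows how quickly pairwise comparisons multiply, and then runs a Mantel-style permutation test that shuffles species labels rather than individual pairs. The function names, the toy distance matrix, and the species count of 1,414 are illustrative assumptions. Note that permuting species labels only respects the dependence among pairs that share a species; it does not by itself correct for phylogenetic non-independence, which would need a method that uses the host and symbiont trees.

```python
import math
import random
from itertools import combinations


def n_pairs(n_species):
    """Number of unordered species pairs among n_species species."""
    return n_species * (n_species - 1) // 2


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(dist_a, dist_b, n_perm=999, seed=0):
    """Mantel-style test between two square distance matrices.

    Species labels of dist_b are permuted as a whole (rows and columns
    together), so each species, not each pair, is the unit that is shuffled.
    Returns the observed correlation and a one-sided permutation p-value.
    """
    n = len(dist_a)
    pairs = list(combinations(range(n), 2))
    a = [dist_a[i][j] for i, j in pairs]
    observed = pearson(a, [dist_b[i][j] for i, j in pairs])
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        perm = list(range(n))
        rng.shuffle(perm)
        b = [dist_b[perm[i]][perm[j]] for i, j in pairs]
        if pearson(a, b) >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: five "species" placed on a line.
toy_positions = [0, 1, 3, 6, 10]
toy_dist = [[abs(p - q) for q in toy_positions] for p in toy_positions]

# About 1,400 species already yield close to a million pairs.
print(n_pairs(1414))
print(mantel_test(toy_dist, toy_dist))
```

      The point of the sketch is only that the number of pairs grows quadratically with the number of species, so the nominal sample size of a pairwise analysis greatly overstates the number of independent observations.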

      The sharing of Wolbachia between whitefly and their parasitoids is very striking, although this has been reported before (eg the authors recently published a paper entitled "Diversity and Phylogenetic Analyses Reveal Horizontal Transmission of Endosymbionts Between Whiteflies and Their Parasitoids"). In Lines 154-164 it is suggested that from the tree the direction of transfer between host and parasitoid can be inferred from the data. This is not obvious to me given the poor resolution of the tree due to low sequence divergence. There are established statistical approaches to test the direction of trait changes on a tree that could have been used (a common approach is to use the software BEAST).

    3. Reviewer #2 (Public Review):

      The paper by Yan et al. aims to provide evidence for horizontal transmission of the intracellular bacterial symbiont Wolbachia from parasitoid wasps to their whitefly hosts. In my opinion, the paper in its current form has major flaws.

      Weaknesses:

      The dogma in the field is that although horizontal transmission events of Wolbachia occur, in most systems they are so rare that the chances of observing them in the lab are very slim.

      For the idea of bacteria moving from a parasitoid to its host, the authors have rightfully cited the paper by Hughes, et al. (2001), which presents the main arguments against the possibility of documenting such transmissions. Thus, if the authors want to provide data that contradict the large volume of evidence showing the opposite, they should present a very strong case.

      In my opinion, the paper fails to provide such concrete evidence. Moreover, it seems the work presented does not meet the basic scientific standards.

      My main reservations are:

      - I think the distribution pattern of bacteria stained by the probes in the FISH pictures presented in Figure 4 looks very much like Portiera, the primary symbiont found in the bacteriome of all whitefly species. In order to make a strong case, the authors need to include Portiera probes along with the Wolbachia ones.

      - If I understand the methods correctly, the phylogeny presented in Figure 2a is supposed to be based on a wide search for Wolbachia wsp gene done on the NCBI dataset (p. 348). However, when I checked the origin of some of the sequences used in the tree to show the similarity of Wolbachia between Bemisia tabaci and its parasitoids, I found that most of them were deposited by the authors themselves in the course of the current study (I could not find this mentioned in the text), or originated in a couple of papers that in my opinion should not have been published to begin with.

      - The authors fail to discuss or even acknowledge a number of published studies that specifically show no horizontal transmission, such as the one claimed to be detected in the study presented.

    4. Reviewer #3 (Public Review):

      This is a very ordinary research paper. The horizontal transmission of endosymbionts, including Wolbachia, Rickettsia etc., has been reported in detail in the last 10 years, and parasitoid-vectored as well as plant-vectored horizontal transmission is the mainstream of research. For example, Ahmed et al. 2013 PLoS One, 2015 PLoS Pathogens, Chiel et al. 2014 Environmental Entomology, Ahmed et al. 2016 BMC Evolutionary Biology, Qi et al. 2019 JEE, and Liu et al. 2023 Frontiers in Cellular and Infection Microbiology all reported parasitoid-vectored horizontal transmission of endosymbionts, while Caspi-Fluger et al. 2012 Proc Roy Soc B, Chrostek et al. 2017 Frontiers in Microbiology, Li et al. 2017 ISME Journal, Li et al. 2017 FEMS, and Shi et al. 2024 mBio all reported plant-vectored horizontal transmission of endosymbionts. For the effects of endosymbionts on the biology of the host, Ahmed et al. 2015 PLoS Pathogens explained the effects in detail.

      Weaknesses:

      In the current study, the authors downloaded MLST or wsp gene sequences from a public database and analyzed the data using other methods. I think the authors may not be familiar with research progress in the field of insect symbiont transmission, and the manuscript at its current stage lacks sufficient novelty.

    1. eLife assessment

      This manuscript presents experiments that address the question of whether the lateral parafacial area (pFL) is active in controlling active expiration, which is particularly significant in patient populations that rely on active exhalation to maintain breathing (eg, COPD, ALS, muscular dystrophy). This study presents solid evidence for a valuable finding of pharmacological mapping of the core medullary region that contributes to active expiration and addresses the question of where these regions lie anatomically. Results from these experiments will be of value to those interested in the neural control of breathing and other neuroscientists as a framework for how to perform pharmacological mapping experiments in the future.

    2. Reviewer #1 (Public Review):

      The main focus of the current study is to identify the anatomical core of an expiratory oscillator in the medulla using pharmacological disinhibition. Although expiration is passive in normal eupneic conditions, activation of the parafacial (pFL) region is believed to evoke active expiration in conditions of elevated ventilatory demands. The authors and others in the field have previously attempted to map this region using pharmacological, optogenetic and chemogenetic approaches, which present with their own challenges.

      In the present study, the authors take a systematic approach to determine the precise anatomical location within the ventral medulla's rostro-caudal axis where the expiratory oscillator is located. The authors used a bicuculline (a GABA-A receptor antagonist) and fluorobeads solution at 5 distinct anatomical locations to study the effects on neuronal excitability and functional circuitry in the pFL. The effects of bicuculline on different phases of the respiratory cycle were characterized using a multidimensional cycle-by-cycle analysis. This analysis involved measuring the differences in airflow, diaphragm electromyography (EMG), and abdominal EMG signals, as well as using a phase-plane analysis to analyze the combined differences of these respiratory signals. Anatomical immunostaining techniques were also used to complement the functional mapping of the pFL.

      Major strengths of this work include a robust study design, complementary neurophysiological and immunohistochemical methods and the use of a novel phase-plane analysis. The authors construct a comprehensive functional map revealing functional nuances in respiratory responses to bicuculline along the rostrocaudal axis of the parafacial region. They convincingly show that although bicuculline injections at all coordinates of the pFL generated an expiratory response, the most rostral locations in the lateral parafacial region play the strongest role in generating active expiration. These were characterized by a strong impact on the duration and strength of ABD activation, and a robust change in tidal volume and minute ventilation. The authors also confirmed histologically that none of the injection sites overlapped grossly with PHOX2B+ neurons, thus confirming the specificity of the injections in the pFL and not the neighboring RTN.

      Although a caveat of the approach is that bicuculline injections have indiscriminate effects on other neuronal populations in the region (GABAergic, glycinergic, and glutamatergic), the results can largely be interpreted as showing that modulation of neuronal populations in different regions of the pFL has differential effects on expiratory output. This limitation of the pharmacological approach has also been aptly discussed by the authors.

      Collectively, these findings advance our understanding of the presumed expiratory oscillator, the pFL, and highlight the functional heterogeneity in the functional response of this anatomical structure.

    3. Reviewer #3 (Public Review):

      Summary:

      The study conducted by Pisanski et al investigates the role of the lateral parafacial area (pFL) in controlling active expiration. Stereotactic injections of bicuculline were utilized to map various pFL sites and their impact on respiration. The results indicate that injections at more rostral pFL locations induce the most robust changes in tidal volume, minute ventilation, and combined respiratory responses. The study indicates that the rostro-caudal organization of the pFL and its influence on breathing is not simple and uniform.

      Strengths:

      The data provide novel insights into the importance of rostral locations in controlling active expiration. The authors use innovative analytic methods to characterize the respiratory effects of bicuculline injections into various areas of the pFL.

      Weaknesses:

      Bicuculline injections increase the excitability of neurons. Aside from blocking GABA receptors, bicuculline also inhibits calcium-activated potassium currents and potentiates NMDA currents, thus insights into the role of GABAergic inhibition are limited.

      Increasing the excitability of neurons provides little insight into the activity pattern and function of the activated neurons. Without recording from the activated neurons, it is impossible to know whether an effect on active expiration or any other respiratory phase is caused by bicuculline acting on rhythmogenic neurons or tonic neurons that modulate respiration. While this approach is inappropriate to study the functional extent of the conditional "oscillator" for active expiration, it still provides valuable insights into this region's complex role in controlling breathing.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This manuscript presents a solid and generally convincing set of experiments to address the question of whether the lateral parafacial area (pFL) is active in controlling active expiration, which is particularly important in patient populations that rely on active exhalation to maintain breathing (eg, COPD, ALS, muscular dystrophy). This study presents a valuable finding by pharmacologically mapping the core medullary region that contributes to active expiration and addresses the question of where these regions lie anatomically. Results from these experiments will be of value to those interested in the neural control of breathing and other neuroscientists as a framework for how to perform pharmacological mapping experiments in the future.

      Thanks for the positive feedback on our study, as well as the assessment of the novelty of our investigation and the advancements to the field that these results will bring in the future.

      We have addressed the specific comments and made changes to the manuscript as indicated below.

      Public Reviews:

      Reviewer #1 (Public Review):

      The main focus of the current study is to identify the anatomical core of an expiratory oscillator in the medulla using pharmacological disinhibition. Although expiration is passive in normal eupneic conditions, activation of the parafacial (pFL) region is believed to evoke active expiration in conditions of elevated ventilatory demands. The authors and others in the field have previously attempted to map this region using pharmacological, optogenetic, and chemogenetic approaches, which present their own challenges.

      In the present study, the authors take a systematic approach to determine the precise anatomical location within the ventral medulla's rostrocaudal axis where the expiratory oscillator is located. The authors used a bicuculline (a GABA-A receptor antagonist) and fluorobeads solution at 5 distinct anatomical locations to study the effects on neuronal excitability and functional circuitry in the pFL. The effects of bicuculline on different phases of the respiratory cycle were characterized using a multidimensional cycle-by-cycle analysis. This analysis involved measuring the differences in airflow, diaphragm electromyography (EMG), and abdominal EMG signals, as well as using a phase-plane analysis to analyze the combined differences of these respiratory signals. Anatomical immunostaining techniques were also used to complement the functional mapping of the pFL.

      Major strengths of this work include a robust study design, complementary neurophysiological and immunohistochemical methods, and the use of a novel phase-plane analysis. The authors construct a comprehensive functional map revealing functional nuances in respiratory responses to bicuculline along the rostrocaudal axis of the parafacial region. They convincingly show that although bicuculline injections at all coordinates of the pFL generated an expiratory response, the most rostral locations in the lateral parafacial region play the strongest role in generating active expiration. These were characterized by a strong impact on the duration and strength of ABD activation and a robust change in tidal volume and minute ventilation. The authors also confirmed histologically that none of the injection sites overlapped grossly with PHOX2B+ neurons, thus confirming the specificity of the injections in the pFL and not the neighboring RTN.

      Collectively, these findings advance our understanding of the presumed expiratory oscillator, the pFL, and highlight the functional heterogeneity in the functional response of this anatomical structure.

      Thanks for the positive feedback on the results presented in the current manuscript.

      Reviewer #2 (Public Review):

      Summary:

      Pisanski and colleagues map regions of the brainstem that produce the rhythm for active expiratory breathing movements and influence their motor patterns. While the neural origins of inspiration are very well understood, the neural bases for expiration lag considerably. The problem is important and new knowledge pertaining to the neural origins of expiration is welcome.

      The authors perturb the parafacial lateral (pFL) respiratory group of the brainstem with microinjection of bicuculline, to elucidate how disinhibition in specific locations of the pFL influences active expiration (and breathing in general) in anesthetized rats. They provide valuable, if not definitive, evidence that the borders of the pFL appear to extend more rostrally than previously appreciated. Prior research suggests that the expiratory pFL exists at the caudal pole of the facial cranial nucleus (VIIc). Here, the authors show that its borders probably extend as much as 1 mm rostral to VIIc. The evidence is convincing albeit with caveats.

      Strengths:

      The authors achieve their aim in terms of showing that the borders of the expiratory pFL are not well understood at present and that it (the pFL) extends more rostrally. The results support that point. The data are strong enough to cause many respiratory neurobiologists to look at the sites rostral to the VIIc for expiratory rhythmogenic neurons and characterize their properties and mechanisms. At present my view is that most respiratory neurobiologists overlook the regions rostral to VIIc in their studies of expiratory rhythm and pattern.

      Weaknesses:

      The injection of bicuculline has indiscriminate effects on excitatory and inhibitory neurons, and the parafacial region is populated by excitatory neurons that are expiratory rhythmogenic and GABA and glycinergic neurons whose roles in producing active expiration are contradictory (Flor et al. J Physiol, 2020, DOI: 10.1113/JP280243). It remains unclear how the microinjections of bicuculline differentially affect all three populations. A more selective approach would be able to disinhibit the populations separately. Nevertheless, for the main point at hand, the data do suggest that we should reconsider the borders of the expiratory pFL nucleus and begin to examine its physiology up to 1 mm rostral to VIIc.

      The control experiment showed that bicuculline microinjections induced cFos expression in the pFL, which is good, but again we don't know which neurons were disinhibited: glutamatergic, GABAergic, or glycinergic.

      Thanks for sharing your excitement about the results of our study, and for appreciating the thorough investigation performed with the use of bicuculline, an approach that was originally used in Pagliardini et al. (2011; PMID: 21414911) and then used by many other groups to generate and study active expiration in vivo.

      In the current study we used the well-known effect of bicuculline to systematically test which area is most sensitive to this pharmacological effect, and hence may be the core for generating active expiration. While GABA receptor antagonists may have an indiscriminate effect on GABA receptor-expressing neurons of various phenotypes, anatomical assessment of inhibitory cells has shown very few GABAergic and glycinergic cells in the parafacial area (Tanaka et al., 2003; PMID: 14512139), and it has been inferred in multiple publications (Huckstepp et al., 2015, PMID: 25609622; Huckstepp et al. 2016 PMID: 27300271; Huckstepp et al., 2018, PMID: 30096151; Flor et al., 2020, PMID: 32621515; Britto & Moraes, 2017; PMID: 28004411; Silva et al. 2016; PMID: 26900003) and demonstrated recently (Magalhaes et al., 2021; PMID: 34510468) that late-E neurons in the parafacial region are excitatory and have a glutamatergic phenotype. We can’t exclude that a small fraction of neurons in the pFL area are inhibitory, and that they could influence recruitment of adjacent late-E expiratory neurons. A more selective activation of neuronal populations with different phenotypes would indeed be interesting. Nonetheless, if local inhibitory neurons have a role in the generation of active expiration, then their disinhibition could either have an inhibitory effect on late-E activity or stimulate expiration in a more indirect fashion.

      While the effect of bicuculline on active expiration has been reported and replicated in multiple manuscripts, the source of inhibition across different phases of the respiratory cycle is still under investigation. Some studies suggest that GABAergic and glycinergic inhibition does not originate in the pFL but rather in the BötC and preBötC areas (Flor et al., 2020, PMID: 32621515; Magalhaes et al., 2021; PMID: 34510468), and the effects of this inhibition across the respiratory cycle are debated. Future studies will be key to identify the source of pFL inhibition.

      The manuscript characterizes how bicuculline microinjections affect breathing parameters such as tidal volume, frequency, ventilation, inspiratory and expiratory time, as well as oxygen consumption. Those aspects of the manuscript are a bit tedious and sometimes overanalyzed. Plus, there was no predictive framework established at the outset for how one should expect disinhibition to affect breathing parameters. In other words, if the authors are seeking to map the pFL borders, then why analyze the breathing patterns so much? Does doing so provide more insight into the borders of pFL? I did not think it was compellingly argued.

      We have edited the introduction to address this comment and emphasize the rationale for the study. We also edited the results section to summarize our findings.

      We continue to report our in-depth analysis of the perturbations induced by bicuculline injection over the various respiratory characteristics as this will be fundamental to determine the effects of our experiment not only on the activation of pFL and active expiration, but also on the respiratory network in general. In order to be fair and open about our findings we have reported the results of our analysis in detail. Of note, all sites generated active expiration, but since the objective of the study was to determine the sites with the most significant changes, a finer and multilevel analysis has been used.

      Further, lines 382-386 make a point about decreasing inspiratory time even though the data do not meet the statistical threshold. In lines 386-395, the reporting appears to reach significance (line 388) but not reach significance (line 389). I had trouble making sense of that disparity.

      The statistics were confirmed, and the lines edited as follows: “Interestingly, the duration of inspiration during the response was found to decrease in all groups relative to baseline respiration (Ti response = 0.279 ± 0.034s, Ti baseline = 0.318 ± 0.043s, Wilcoxon rank sum: Z = 3.24, p = 0.001). Contrary to this decrease in inspiratory duration, the total expiratory time was observed to increase in all groups and remained elevated compared to baseline (TE response = 1.313 ± 0.188s, TE baseline = 1.029 ± 0.161s, Wilcoxon rank sum: Z = 4.49, p = 0.001).”

      The other statistical hiccups include "tended towards significance" (line 454), "were found to only reach significance for a short portion of the response" (line 486-7), "did not reach the level of significance" (line 506), which gives one the sense of cherry picking or over-analysis. Frankly, this reviewer finds the paper much more compelling when just asking whether the microinjections evoke active expiration. If yes, then the site is probably part of the pFL.

      Statistical “tendencies” have been eliminated throughout the manuscript.

      We have analyzed our results in detail in order to determine changes and differential effects on respiration when comparing the 5 injection sites. Although the presentation of the results may seem tedious, it has allowed us to highlight some interesting effects. The first concerns respiratory frequency. It has been shown in the past that optogenetic stimulation of this area causes an increase in respiratory frequency (Pagliardini et al., 2011, PMID: 21414911), whereas disinhibition with this same approach or stimulation of AMPA receptors in the pFL has shown a reduction in frequency or no significant change in the response (Pagliardini et al., 2011, PMID: 21414911; Huckstepp et al., 2015, PMID: 25609622; Huckstepp et al. 2016 PMID: 27300271; Huckstepp et al., 2018, PMID: 30096151). Here, we suggest that the reduction in respiratory frequency is observed only at the caudal sites and could be attributed to BötC effects rather than to stimulation of the core of the pFL, since no such change was observed where the effect was most potent (the rostral sites). Another interesting point was the effect on O2 consumption: although it is difficult to interpret at this point, we found it very interesting that hyperventilation occurred only at the most rostral injection sites.

      I encourage the authors to consider the fickleness of p-values in general and urge them to consider not just p but also effect size.

      Thank you for the feedback on our description of the statistical results and the suggestion of incorporating effect size. We have now included measurements of effect size in the results section. Specifically, we calculated the effect size within each ANOVA using the value of eta squared for all data shown in Figures 3 and 4. Please note that in our phase-plane analysis (Fig. 5-6) the Mahalanobis distance is itself an effect size measure for multidimensional data. We also note that statistical evaluation using non-parametric analyses does not involve effect sizes.
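      For readers unfamiliar with the two measures named in this response, the following is a minimal Python sketch using their standard textbook definitions: eta squared as the ratio of between-group to total sum of squares in a one-way ANOVA, and the Mahalanobis distance of a point from a reference mean given a covariance matrix. The function names and numbers are hypothetical, the distance is restricted to two dimensions for brevity, and nothing here is taken from the authors' own ImageJ or Matlab routines.

```python
import math


def eta_squared(groups):
    """Eta squared for a one-way design: SS_between / SS_total."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total


def mahalanobis_2d(point, mean, cov):
    """Mahalanobis distance of a 2-D point from `mean` given a 2x2 covariance.

    Computes sqrt((x - mu)^T S^-1 (x - mu)) using the closed-form inverse
    of a 2x2 matrix.
    """
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(quad)


# Hypothetical data: three groups of three measurements each.
example_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(eta_squared(example_groups))

# With an identity covariance the distance reduces to the Euclidean distance.
print(mahalanobis_2d((3, 4), (0, 0), ((1, 0), (0, 1))))
```

      Because the Mahalanobis distance scales each deviation by the variability of the reference data, it can be read as a multivariate analogue of a standardized effect size, which is the sense in which the response uses it.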

      Reviewer #3 (Public Review):

      Summary:

      The study conducted by Pisanski et al investigates the role of the lateral parafacial area (pFL) in controlling active expiration. Stereotactic injections of bicuculline were utilized to map various pFL sites and their impact on respiration. The results indicate that injections at more rostral pFL locations induce the most robust changes in tidal volume, minute ventilation, and combined respiratory responses. The study indicates that the rostrocaudal organization of the pFL and its influence on breathing is not simple and uniform.

      Strengths:

      The data provide novel insights into the importance of rostral locations in controlling active expiration. The authors use innovative analytic methods to characterize the respiratory effects of bicuculline injections into various areas of the pFL.

      Weaknesses:

      Bicuculline injections increase the excitability of neurons. Aside from blocking GABA receptors, bicuculline also inhibits calcium-activated potassium currents and potentiates NMDA current, thus insights into the role of GABAergic inhibition are limited.

      Increasing the excitability of neurons provides little insights into the activity pattern and function of the activated neurons. Without recording from the activated neurons, it is impossible to know whether an effect on active expiration or any other respiratory phase is caused by bicuculline acting on rhythmogenic neurons or tonic neurons that modulate respiration. While this approach is inappropriate to study the functional extent of the conditional "oscillator" for active expiration, it provides valuable insights into this region's complex role in controlling breathing.

      We have included a reflection of the weaknesses of our studies in the technical consideration section to address the possibility that bicuculline may induce active expiration through other mechanisms. Please note that the use of bicuculline was not to gain further insight on GABAergic inhibition of pFL but to adopt a tool to generate active expiration that has been extensively validated by our group and others.

      Multiple studies have shown recruitment of excitatory late expiratory neurons with bicuculline injections. Although we did not record from late-E neurons in this study, we infer from the body of literature that disinhibition of neurons in this area will activate late-E neurons (as previously demonstrated) and generate active expiration. Although we see value in recording activity of single neurons (especially to study mechanisms of rhythmogenesis), we opted to measure the physiological response from respiratory muscles as an indication of active expiration recruitment in vivo. Recording from single neurons after bicuculline injections in each site would confirm the presence of expiratory neurons along the parafacial area, which is probably not surprising, since every site tested promoted active expiration. The focus of the study though was to determine the site with the strongest physiological response to disinhibition. Future studies will be key to determine whether all neurons along this column have similar electrophysiological rhythmic properties to the ones recently reported (Magalhaes et al., 2021; PMID: 34510468), or some of them simply provide tonic drive to late-E neurons located elsewhere.

      We have discussed the issue as follows:

      “Our experiments focused on determining the area in the pFL that is most effective in generating active expiration as measured by ABD EMG activity and expiratory flow. We did not attempt to record single cell neuronal activity at various locations as previously shown in other studies (Pagliardini et al 2011; Magalhaes et al., 2021), as this approach would most likely find some late-E neurons across the pFL and thus not effectively discriminate between areas of the pFL. Future studies involving multi-unit recordings or imaging of cell population activities will help to determine the firing pattern and population density of bicuculline-activated cells and further determine differences in distribution and function of late-E neurons across the region of the pFL.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Overall, the manuscript addresses an important question in the field, the anatomical location of the expiratory oscillator. I commend the authors for a well-thought-out and clearly presented study. However, a few small concerns deserve attention to improve the clarity of the report.

      (1) The figures would benefit from a rostral-to-caudal representation of results instead of a caudal-to-rostral orientation. Example, Figure 2.

      We opted for a caudal to rostral representation to progressively move away from the inspiratory oscillator (preBötC) and the anatomical reference point (the caudal tip of the facial nucleus) with our series of injections. 

      (2) A discussion about how expiratory responses generated by these pharmacological approaches would compare to endogenous baseline conditions. The authors mention that bicuculline injections elicited a late-E downward inflection that was absent in baseline conditions. Thus, this raises the point of how these findings compare to awake freely moving animals or during different conditions of increased ventilatory demand.

      This is an interesting question that has not yet been addressed in the field. As far as we know, there are no recordings of pFL neurons in freely behaving animals, although recordings of pFL late-E neurons under elevated PaCO2 have shown late-E activity in in situ preparations (Britto & Moraes, 2017; PMID: 28004411; Magalhaes et al., 2021; PMID: 34510468).

      We have clarified this in the discussion as follows:

      “At rest, respiratory activity does not present with active expiration (i.e., expiratory flow below its functional residual capacity in conjunction with expiratory-related ABD muscle recruitment) and expiratory flow occurs due to passive recoil of the chest wall with no contribution of abdominal activity. Active expiration and abdominal recruitment can be spontaneously observed during sleep (in particular REM sleep, Andrews and Pagliardini, 2015; Pisanski et al., 2019) and can be triggered during increased respiratory drive (e.g., hypercapnia, RTN stimulation, Abbott et al., 2011). Although never assessed in freely moving, unanesthetized rodents, bicuculline has been extensively used to generate active expiration and late-E neuron activity in both juvenile and adult anesthetized rats (Pagliardini et al., 2011; Huckstepp et al., 2015; Huckstepp et al., 2016; Huckstepp et al., 2018; De Britto and Moraes, 2017; Magalhaes et al., 2021).”

      (3) In Figure 2A, there appears to be an injection site in the top right quadrant of the image, very distant from the intended site. Could the authors confirm if this is an artifact?

      Yes, it is an artifact of image acquisition; we should have marked that in the figure. To avoid confusion and follow other reviewers’ suggestions, we have edited the figure.

      (4) A stylistic suggestion would be to include the subpanel of Figure 2C saline control injection as a graph of its own and also include the control anatomical location in 2B.

      Thanks for the suggestion. Because of the complex organization of the figure we opted to leave it as a subpanel in order to not distract the reader from the 5 injection sites, but still provide information about vehicle injection and their lack of changes in respiratory response.

      (5) The authors note that DIAm Area (norm.) during the inspiratory phase is increased in the +6 and +8mm groups. However, Figure 5E shows that the +8mm group is significantly reduced as compared to the +6mm group. Please clarify.

      During the inspiratory phase we did not observe any significant change in the DIA Area (norm.). We realize that the description of this part of the results was confusing and therefore we have eliminated that section.

      Reviewer #2 (Recommendations For The Authors):

      I encourage the authors to consider the fickleness of p-values in general and urge them to consider not just p but also effect size. There is a valuable editorial in this week's J Physiology (https://doi.org/10.1113/JP285575) that may provide helpful guidance.

      Thanks for these comments and the general assessment. We realized that the results section was dense and carried a lot of information. We significantly slimmed the description of the results in order to facilitate the appreciation of the results and avoid confounding statements about significant vs non-significant results.

      We have now included measurements of effect size in the results section. Specifically, we calculated the effect size within each ANOVA using the value of eta squared for all data shown in Figures 3 and 4. Please note that in our phase-plane analysis (Fig. 5-6) the Mahalanobis distance is itself an effect size measure for multidimensional data. We also note that statistical evaluation using non-parametric analyses does not involve effect sizes.
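
      For readers unfamiliar with the measure, eta squared for a one-way design is the between-group sum of squares divided by the total sum of squares. The following is a minimal sketch of that textbook definition, not the authors' analysis code; the function name `eta_squared` and the example values are hypothetical.

```python
from statistics import mean


def eta_squared(groups):
    """Eta squared for a one-way design: SS_between / SS_total.

    `groups` is a list of lists, one list of observations per group.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group size times squared deviation
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Total sum of squares: squared deviation of every observation
    # from the grand mean.
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    if ss_total == 0:
        raise ValueError("eta squared is undefined when the data have no variance")
    return ss_between / ss_total


# Hypothetical data for three groups (illustration only, not study data).
example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
effect = eta_squared(example_groups)
```

      The value ranges from 0 (group means identical) to 1 (all variance lies between groups), which is why it can be reported alongside the ANOVA p-value.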

      The equipment and resources should be clearly identified and use RRIDs whenever possible. Resources like antibodies and other reagents (e.g., cryoprotectants) should be identified, not just by manufacturer, but also by specific part or product numbers or identifiers.

      Manuscript has been edited to add these details.

      The manuscript makes reference to ImageJ and Matlab routines, which must be public through GitHub or another stable repository.

      Thanks for pointing this out. ImageJ analysis has been performed following scripts already available to users (no custom scripts). The Matlab scripts used for the multivariate analysis are now available at: https://github.com/mprosteb/Pisanski2024

      The way that ABD-DIA coupling was assessed was unclear from the Methods.

      The following text has been added to the methods: “The coupling between ABD and DIA signals was measured as a ratio and analyzed by quantifying the number of bursts of activity observed for the ABD and DIA EMG signals during the first 10 minutes of the response, excluding time bins at the end of the response (due to fading and waning of the ABD response in those instances).”
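
      One way to read this description is as a count of ABD bursts divided by a count of DIA bursts within the first 10 minutes. The sketch below illustrates that reading only. It assumes that burst onset times (in seconds from response onset) have already been detected, and it does not implement the exclusion of fading time bins mentioned above. The function name and example values are hypothetical.

```python
def abd_dia_coupling(abd_onsets_s, dia_onsets_s, window_s=600.0):
    """Ratio of ABD bursts to DIA bursts within the first `window_s` seconds.

    Onset times are in seconds relative to the start of the response.
    """
    abd_count = sum(1 for t in abd_onsets_s if 0.0 <= t < window_s)
    dia_count = sum(1 for t in dia_onsets_s if 0.0 <= t < window_s)
    if dia_count == 0:
        raise ValueError("no DIA bursts in the analysis window")
    return abd_count / dia_count


# Hypothetical onsets: one ABD burst for every second DIA burst.
abd = [1.0, 3.0, 5.0, 700.0]          # the last burst falls outside the window
dia = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 650.0]
ratio = abd_dia_coupling(abd, dia)
```

      Under this reading, a ratio of 1 means every inspiratory burst is accompanied by an abdominal burst, and lower values indicate skipped ABD recruitment.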

      Fig. 1A was never cited in the text.

      It has been cited now.

      Fig. 1A-C appears to be exactly the same as Fig. 5A-C.

      The reviewer is correct. We have used figure 1 to describe and explain our analytical methods with sample data and Figure 5 describes our results. We have clarified that in: “Figure 5: Rostral injections elicit more prominent changes to respiration in each signal and sub-period. A-C: Is the same as Method Figure 1, has been included here for further clarity when analyzing the results.”

      Late Expiratory airflow is given in units of volts (V) in lines 358-363 (Fig. 4C) but then in units of volts-seconds (V•s) in lines 363-367. Both units are problematic because the voltage is neither an air volume nor an air volume per unit time. Is there some conversion factor left out?

      In this section of the results we describe the changes in expiratory peak amplitude (V) and expiratory peak flow (V•s). Since calibration of airflow was performed on the positive flow and for larger volumes, we prefer to use the original units to guarantee precise assessment of the change and avoid introducing potential errors. Since the analysis considers changes from baseline readings, converting to ml or ml*s would not affect our analysis.

      Reviewer #3 (Recommendations For The Authors):

      The study conducted by Pisanski et al investigates the role of the lateral parafacial area (pFL) in respiratory control, specifically in modulating active expiration. The precise location of this expiratory oscillator within the ventral medulla remains uncertain, with some studies indicating that the caudal tip of the facial nucleus (VIIc) forms the core while others propose more rostral areas. Bicuculline injections were utilized at various pFL sites to explore the impact of these injections on respiration. The authors use innovative and impressive analytic methods to characterize the effect on respiratory activity. The results indicate that injections at more rostral pFL locations induce the most robust changes in tidal volume, minute ventilation, and combined respiratory responses. The study will contribute to an enhanced understanding of the neural mechanisms controlling active expiration. The main message of the study is that the rostro-caudal organization of the pFL is not simple and uniform. The data provides novel insights into the importance of rostral locations in controlling active expiration (see e.g. lines 738-740).

      The data and results of the paper are intriguing, and it appears that the experiments are well-managed and executed. However, there are several major and minor comments and suggestions that should be addressed by the authors:

      (1) The study relies heavily on local injections into specific areas that are confirmed histologically. One potential concern is the injection volume of 200 nL in such a tiny area. The authors suggest that the drug did not spread to rostral/caudal areas outside the specified coordinate partly based on their cFOS staining. For example, the lack of cFOS activation in TH+ cells and Phox2B cells is interpreted as proof that bicuculline did not spread to these somas (Figure 2). The authors seem to use a similar argument as evidence that the pFL does not include Phox2B neurons in the RTN as discussed in the Discussion section (lines 830-847). However, it is very surprising that bicuculline injections into an area that is known to contain Phox2B and Th+ neurons do not activate these neurons as assessed by the cFOS staining. It seems puzzling to me that none of their injections shown in Figure 2 activated Phox2B or Th neurons. I assume that in targeting the pFL the authors must have sometimes hit areas that included neurons that define the RTN, which would have activated Phox2B or Th+ neurons. Did the authors find that these activations did not activate active expiration? Such negative "controls" would strengthen their argument that pFL is a separate and distinct region that selectively controls active expiration.

      Thanks for the positive feedback on the manuscript. As it has been demonstrated and discussed in several previous publications, PHOX2B expressing neurons in this area of the brain are part of the RTN Neuromedin B positive neurons (more densely located in the ventral paraFacial rather than the lateral parafacial, our site of injection), the TH+ C1 neurons (located in a somewhat more caudal and medial position compared to our sites of injection, around the BötC/ preBötC area) and the large Facial MN (easily identifiable by their large size and compact location). Given this differential spatial distribution, and the controls described below, we believe we have reduced the possibility of the direct activation of these neurons, although we can’t exclude it in full.

      There is now strong evidence for the lack of PHOX2B expression in late-E neurons in juvenile and adult rats (Magalhaes et al., 2021; PMID: 34510468). We realize that the microinjected solution could potentially diffuse in the brain and hit other areas, but we combined two strategies to verify our intention for a focal injection activating only a restricted area of the brain (i.e., the pFL): i) localization of fluorobeads that were diluted in the Bicuculline solution; ii) expression of cFos combined with anatomical markers, to identify activated cells. Fluorobeads have a very limited spread in the brain and therefore informed us of the site of the injection to differentiate between the five injection locations. Although we can’t assume that Bicuculline will have a similar spread (and it will also be quickly degraded in the tissue), the combination of this analysis with the localized expression of cFos cells has helped us to differentiate between injection sites. Because of the proximity of PHOX2B cells in RTN and C1 neurons, we also combined cFos expression with immunohistochemistry to determine whether bicuculline activation was also visible in these two neuronal populations. Our results indicate that there is baseline cFos activity in RTN neurons (see vehicle injection) but the fraction of PHOX2B activated cells did not increase with bicuculline injections, suggesting that these neurons were not the target of our injections. Please note that cFos expression has been extensively used to determine RTN neuron activation, especially following chemoreflex responses.

      (2) The authors refer to "the expiratory oscillator" throughout the manuscript (e.g. lines 58, 62, 65) as if there is only one expiratory oscillator i.e. "the expiratory oscillator". For some reason, the authors avoided citing and mentioning PiCo (Anderson et al. 2016), which is considered the oscillator for postinspiration. Since the present study focuses on the role of expiration, and since the authors describe convincing effects on postinspiration, considering this oscillator which is located dorsomedial to the VRC seems relevant for the present study.

      Due to the limited and controversial literature that currently describes PiCo as a third oscillator, and the fact that our studies do not directly assess post-inspiratory activity (as measured by the V nerve or laryngeal muscles) or PiCo activity and location (which would be even more distant than the RTN, for example), we prefer to avoid commenting on the effects of this injection on PiCo or the connectivity between PiCo and pFL.

      We have added this to the discussion:

      “Therefore, although it has previously been described, it is currently unknown the exact mechanism by which this post-I activity in the ABD muscles is generated. For example the interplay between the rostral pFL and brainstem structures generating post-inspiratory activity, such as the proposed post-inspiratory oscillator (PiCo; Anderson et al., 2016) or pontine respiratory networks, could be reasonably involved in this process.”

      (3) The authors do not specify what type of bicuculline they injected. Bicuculline is known to have significant effects on potassium channels. Thus, the effects reported here could be due to a non-specific change in excitability, rather than caused by a specific GABAergic blockade.

      The authors also do not know what effects these injections cause in the neurons in vivo, since the injections are not accompanied by recordings from the respiratory neurons that they activate. This together with the non-specific bicuculline effects will affect the interpretation of the results. Thus, the authors need to be more careful when interpreting their effects as "GABAergic". The use of more specific blockers like gabazine could partly address this concern. The authors have to discuss this in a "limitation section".

      Thanks for pointing that out; we have now clarified in the methods section that we used bicuculline methochloride. We can’t exclude that some side effects could be present due to the use of this drug. For the purpose of this study, though, we focused on using bicuculline as a tool to consistently generate active expiration, since it has been extensively used by multiple laboratories to induce abdominal muscle recruitment and active expiration, as well as to directly record late-E neurons in this same area.

      We have included in the discussion the following statement:

      “Technical considerations

      Bicuculline methiodide has previously been observed to exhibit inhibitory effects on Ca2+-activated K+ currents inducing non-specific potentiation of NMDA currents (Johnson and Seutin, 1997). Consequently, caution is warranted in attributing our findings solely to the GABAa antagonist properties of bicuculline. Previous work has demonstrated a temporal correlation between the onset of late-E neuron activity in the caudal parafacial region and ABD activity in response to bicuculline (Pagliardini et al., 2011; de Britto and Moraes, 2017; Magalhaes et al., 2021) as well as GABAergic sIPSCs in late-E neurons (Magalhaes et al., 2012). However, it is essential to note that the current study lacks single unit recording, preventing us from definitively confirming whether the observed activity stems from late-E neuronal GABAergic disinhibition or excitation through non-GABAergic mechanisms.”

      (4) I also caution the authors when stating that the bicuculline injections will reveal the precise location and functional boundaries of "the" expiratory oscillation within the pFL. Increasing the excitability with bicuculline is inappropriate to study the functional boundaries of an oscillator. It is particularly inappropriate to identify the boundaries of the pFL, a network that is normally inactive and activated only under certain behavioral and metabolic conditions. Because the injections are increasing the neuronal excitability unspecifically, and because the authors are not recording the activity of the neurons in the pFL region it is unclear what kind of neurons are activated. The cFOS staining may help to define whether these neurons are Phox2B or Th positive or negative, but they will not provide insights into the activity patterns of the activated neurons. Thus, it is fair to assume that these injections will likely include also tonic neurons that might indirectly control the activity of pFL neurons under certain metabolic or behavioral conditions without actually being involved in the rhythmogenesis of active expiration. Many of the effects peak after several minutes, and different regions cause differential effects with different time courses, which is difficult to interpret functionally. Thus, the "core" identified in the present study could consist of tonic neurons as opposed to rhythmic neurons generating active expiration.

      We agree with the reviewer that our local injections may have activated a heterogeneous population of neurons. We do not claim that we only activated late-E rhythmogenic neurons but that our multiple sites of injections revealed the area that is generating the strongest excitation of ABD muscles and active expiration.

      While the use of GABA receptor antagonists may have an indiscriminate effect on GABA receptor expressing neurons with various phenotypes, anatomical assessment of inhibitory cells has shown very little distribution of GABAergic and glycinergic cells in the parafacial area (Tanaka et al., 2003; PMID: 14512139) and it has been inferred in multiple publications (Huckstepp et al., 2015, PMID: 25609622; Huckstepp et al. 2016 PMID: 27300271; Huckstepp et al., 2018, PMID: 30096151; Flor et al., 2020, PMID: 32621515; Britto & Moraes, 2017; PMID: 28004411; Silva et al. 2016; PMID: 26900003) and demonstrated recently (Magalhaes et al., 2021; PMID: 34510468) that late-E neurons in the parafacial region are excitatory and have a glutamatergic phenotype.

      As suggested by the reviewer, it is possible that the bicuculline injection may have activated some tonic, non-rhythmogenic neurons which could activate the expiratory oscillator located elsewhere.

      We have edited the introduction as follows:

      “By strategically administering localized volumes of bicuculline at multiple rostrocaudal levels of the ventral brainstem, we aimed to selectively enhance the excitability of neurons driving active expiration, thereby revealing the extension of the pharmacological response and the most efficient site in generating active expiration.”

      We have edited the results as follows:

      “Importantly, the group with injection sites at +0.6 mm from VIIc exhibited the swiftest response onset, suggesting that this area is the most critical for the generation of active expiration, either through direct activation of the expiratory oscillator or, alternatively, for providing a strong tonic drive to late-E neurons located elsewhere.”

      In the introduction, it should also be emphasized that the pharmacological approach used in the present study complements the existing elegant chemogenetic studies, rather than emphasizing primarily the limitations of the chemogenetic inhibitions. The conclusion should be that these studies together provide different, yet complementary insights: The chemogenetic approach by inhibiting neurons, the present study by exciting neurons, and all studies come with their own limitations.

      Thanks for the suggestion, we have updated the manuscript as follows:

      “Although both of these elegant chemogenetic studies have contributed extensively to our understanding of the pFL, the existing evidence suggests that the expiratory oscillator may expand beyond the limits of the viral expression achieved in said studies, as proposed by Huckstepp et al. (2015).”

      Throughout the manuscript, the authors have to be cautious when implying that an excitatory effect relates to the activity of rhythmogenic pFL neurons. For example, on line 710 the authors state that "it is conceivable to infer that the rostral pFL is in the closest proximity to the cells responsible for the generation of active expiration". While it may indeed be "conceivable", the bicuculline injections themselves provide no insights into the location of neurons responsible for rhythmogenesis. It is equally "conceivable" that the excited neurons provide a tonic drive to the neurons without being involved in the generation of active expiration. These tonic neurons could be located at a distance from the presumed rhythmogenic core.

      We have included the possibility of tonic excitation in the technical considerations section:

      “However, our study did not include recording from late-E neurons following bicuculline injections, preventing us from definitively confirming whether the observed activity stems from late-E neuronal excitation or the potentiation of a tonic drive, particularly in the rostral areas.”

      (5) It is intriguing that some of their injections (Fig.2D) evoked postinspiratory activity. This interesting finding should be discussed as it could provide important insights into the coordination of the different phases of expiration.

      Thanks for the suggestion. We have included the following to the discussion:

      “Therefore, although it has previously been described, the exact mechanism by which this post-I ABD activity is generated is unclear. This late-E/post-I pattern of activity is similar to what has been observed in in vitro preparations and in vivo recordings in juvenile rats (Janczewski et al., 2002; Janczewski et al., 2006).”

      “Therefore, although it has previously been described, it is currently unknown the exact mechanism by which this post-I activity in the ABD muscles is generated. For example the interplay between the rostral pFL and brainstem structures generating post-inspiratory activity, such as the proposed post-inspiratory oscillator (PiCo; Anderson et al., 2016) or pontine respiratory networks, could be reasonably involved in this process.”

      (6) The authors conducted bilateral disinhibition of the pFL, but only a unilateral photomicrograph was shown. Figure 2 should include a representative bilateral photomicrograph along with a scatter plot for clarity and completeness.

      We have edited figure 2 to include representative images of bilateral injections.

      (7) Regarding the Bicuculline injections in the Methods section: Aside from specifying exactly what type of bicuculline was used, the authors should provide more information about the pFL location and landmarks used, including the missing medial-lateral coordinate. The fluorobead spread of approximately 300 µm, as observed in Figure 2C, is crucial for the interpretation of the results and should be detailed. An alternative approach could involve e.g. calculating the area covered by fluorobeads in each group.

      We have included the following in the text:

      “Each rat was injected at 2.8 mm lateral from the midline and at a specific RC coordinate based on the following groups: -0.2 mm from the caudal tip of the facial nucleus (VIIc) (n=5), +0.1 mm from VIIc (n=7), +0.4 mm from VIIc (n=5), +0.6 mm from VIIc (n=6), +0.8 mm from VIIc (n=5)”

      “These findings strongly suggest that bicuculline specifically activated cells within the vicinity of the injection sites which spread ~300 µm (Figure 2C, horizontal lines) and did not activate PHOX2B+ cells in the RTN area, beyond their baseline level of activity.”

      (8) In the Experimental Protocol, the authors should provide more details on how the parameters were determined. For example, specify the number of cycles included for Dia frequency/amplitude, Abd frequency/amplitude, and with regards to the averaging process, the authors should specify over how many cycles they obtained an average for Dia/Abd activity time and AUC. The authors should also provide information on the number of bicuculline injections that they repeated to average these values and they should report the coefficient of variation for repeated injections. Please clarify the method used to calculate AUC, considering the non-linear nature of the activity.

      Only one bicuculline injection per rat was performed and the number of rats used for each injection site is indicated in the methods as follows:

      “Each rat was injected at 2.8 mm lateral from the midline and at a specific RC coordinate based on the following groups: -0.2 mm from the caudal tip of the facial nucleus (VIIc) (n=5), +0.1 mm from VIIc (n=7), +0.4 mm from VIIc (n=5), +0.6 mm from VIIc (n=6), +0.8 mm from VIIc (n=5), and CTRL (n=7). We recorded the physiological responses to the injection for 20-25 min.”

      We have clarified in the methods section the following:

      “Respiratory data was tracked in time bins of 2-minute duration from the baseline period prior to injections and spanned 20 min of recording post-injection. Mean-cycle measurements for each signal were computed by averaging values across all cycles within a given time bin.”

      Additional clarifications have been added:

      “We then used the average calculations of respiratory rate (RR), tidal volume (VT), Minute Ventilation (Ve), expiratory ABD amplitude, expiratory ABD area, VO2, VE/VO2 to obtain values relative to the baseline period. Peak responses were identified as the time bin that produced the strongest changes relative to baseline.”

      “Mean-cycle measurements for each signal were computed by averaging across all cycles within a given time bin (~300 cycles in baseline, ~100 cycles per response time bin). We then used the average calculations of respiratory rate (RR), tidal volume (VT), Minute Ventilation (Ve), expiratory ABD amplitude, expiratory ABD area, VO2, VE/VO2 to obtain values relative to the baseline period. Peak responses were identified as the time bin that produced the strongest changes relative to baseline.”
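
      The binning and peak-identification steps described above can be sketched as follows. This is an illustrative reading rather than the authors' Matlab code. It assumes each respiratory cycle has a timestamp in seconds relative to injection, that "relative to baseline" means a ratio to the baseline mean, and that the "strongest change" is the largest absolute deviation from that ratio of 1. All names and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def bin_means(cycle_times_s, values, bin_s=120.0):
    """Average per-cycle values within consecutive time bins (default 2 min).

    Returns a dict mapping bin index to the mean value in that bin.
    """
    bins = defaultdict(list)
    for t, v in zip(cycle_times_s, values):
        bins[int(t // bin_s)].append(v)
    return {b: mean(vs) for b, vs in sorted(bins.items())}


def peak_response(binned, baseline_mean):
    """Return (bin index, value relative to baseline) for the strongest change.

    Assumption: "relative" is a ratio, and the strongest change is the
    largest absolute deviation of that ratio from 1.
    """
    relative = {b: m / baseline_mean for b, m in binned.items()}
    peak_bin = max(relative, key=lambda b: abs(relative[b] - 1.0))
    return peak_bin, relative[peak_bin]


# Hypothetical per-cycle tidal volumes after injection (arbitrary units).
times = [10.0, 50.0, 130.0, 200.0, 250.0]
volumes = [1.0, 3.0, 4.0, 6.0, 8.0]
binned = bin_means(times, volumes)
peak = peak_response(binned, baseline_mean=2.0)
```

      Averaging within bins before comparing to baseline keeps single aberrant cycles from being picked as the peak response.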

      “The Area under the curve (AUC) was measured during baseline and was subtracted from the corresponding AUC of the response for each time bin (Figure 1C). This AUC measure was computed as the sum of the signal in a given respiratory phase as all signals were sampled at the same rate. Note that areas calculated below the zero- (0) line, as would be expected from a negative airflow during expiration, yield negative AUC values.”
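
      Because all signals share one sampling rate, the AUC described here reduces to a plain sum of samples within a phase, followed by baseline subtraction. A minimal sketch under those stated conditions is given below. The phase boundaries are passed as sample indices, and the function names and example values are hypothetical.

```python
def phase_auc(signal, start, stop):
    """Sum of samples in signal[start:stop], one respiratory phase.

    With a fixed sampling rate this is proportional to the area under
    the curve; samples below zero contribute negative area.
    """
    return sum(signal[start:stop])


def baseline_subtracted_auc(response_auc, baseline_auc):
    """Change in AUC relative to the baseline period."""
    return response_auc - baseline_auc


# Hypothetical airflow trace: three inspiratory samples (positive flow)
# followed by three expiratory samples (negative flow).
airflow = [0.5, 1.0, 0.5, -0.2, -0.4, -0.2]
insp_auc = phase_auc(airflow, 0, 3)
exp_auc = phase_auc(airflow, 3, 6)
exp_change = baseline_subtracted_auc(exp_auc, baseline_auc=-0.5)
```

      In this example the expiratory AUC is negative, matching the note above about areas below the zero line.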

      (9) The authors should explain how oxygen consumption was calculated-did it involve the Depocas & Hart (1957) formula? Please provide information on expiratory CO2, whether ventilation was adjusted to achieve consistent CO2 levels across animals, and ideally specify the end-tidal CO2 range for the experiments. Discuss the rationale behind the chosen CO2 levels and whether CO2-dependent pFL activity could have influenced results.

      We have clarified the measurement in the methods as follows:

      “The gas analyzer measured fractional concentration of O2. Based on this and the flow rate at the level of the trachea (minute ventilation), we calculated O2 consumption according to Depocas and Hart (1957).”

      We have also added to the methods section:

      “During the entire experimental procedure, rats breathed spontaneously and end tidal CO2 was not adjusted through the experimental protocol.”

      In terms of the CO2-dependent pFL activity possibly influencing the results: by inducing active expiration in conditions in which there is no physiological demand for it (i.e. no hypoxia or hypercapnia), it is likely that pCO2 is reduced, overall decreasing the drive for ABD activity which would suggest that our results are likely an underestimation of the response that would have been produced if we maintained the CO2 levels constant.

      (10) The authors should address the discrepancy in fos-activated neurons between the control (44 neurons) and experimental animals (90-120 neurons per hemisection). Please explain the activation in the control group. Please also provide insights into how the authors interpret this difference in cfos-activated neurons between control and experimental groups.

      The following paragraph has been added to the discussion:

      “The assessment of cellular activity, quantified through cFos staining, unveiled the existence of basal activity in control rats. This observed baseline activity is likely emanating from subthreshold physiological processes within the parafacial area which do not culminate in ABD activity. Analysis of the cFos staining confirmed focal activation of neurons in the pFL of rats injected with bicuculline and minimal cFos expression in the PHOX2B+ cells in all groups as compared to the control group. These results confirm the very limited mediolateral spread of the drug from the core site of injection and back previous findings supporting the hypothesis that the majority of PHOX2B+ cells are more ventrally located in the parafacial area (pFV, Huckstepp et al., 2015) and PHOX2B+ cell recruitment is not necessary for active expiration (de Britto & Moraes, 2017; Magalhães et al., 2021).”

      (11) In Figure 8, the authors plotted the relationship of each cycle correlated to the normalized area. Have you also calculated the same late-E, inspiratory, and post-I to fR or VT separately?

      No, we only did the separated breathing phase (late-E, I, Post-I) analysis in the calculations of the DIA, airflow and ABD area, as well as on the Euclidean and Mahalanobis distances.

      Minor comments:

      Is there any specific reason for conducting these experiments exclusively in males?

      No, we usually use male rats for this type of experiment. We use both male and female rats for other studies that concern the effects of sex hormones, but in this case we performed experiments only in male rats.

      Page 13, Line 320: What is the duration of the bicuculline-induced effects?

      This information is included in the results section as follows:

      “Similarly, the ABD response duration was longer at the two most rostral locations (+0.6 mm = 17.6 ± 2.7 min; +0.8 = 17.1 ± 3.3 min) compared to the most caudal group (-0.2 mm = 2.4 ± 1.1 min; One-Way ANOVA p = 0.043; Tukey -0.2 mm vs +0.6 mm: p = 0.048; -0.2 mm vs +0.8 mm: p = 0.041; Figure 3E).”

      Page 16, Line 400: Is there a rationale for the high tidal volume (VT) observed in these animals? A baseline VT of 7 ml/kg appears notably elevated.

      Please note that rats were vagotomised and spontaneously breathing, hence the tidal volume is increased compared to non-vagotomised rats as seen in previous studies (Ouahchi et al., 2011).

      Figure 2D: Could you provide longer recordings? Additionally, incorporating diaphragm (Dia) recordings would enhance the interpretation of abdominal (Abd) recordings.

      Figure 3A has a representative example of the 20-minute recordings for each location.

      Page 18, Line 458: Please rectify "Dunn: p , 0.001" to the appropriate format, perhaps "Dunn: p < 0.001."

      Thank you, edited.

    1. Reviewer #1 (Public Review):

      Summary:

      In "Changes in wing morphology..." Le Roy et al. investigate the potential allometric scaling in wing morphology and wing kinematics in 8 different hoverfly species. Their study nicely combines different new and classic techniques, investigating flight in an important, yet understudied alternative pollinator. I want to emphasize that I have been asked to review this from a hoverfly biology perspective, as I do not work on flight kinematics. I will thus not review that part of the work.

      Strengths:

      The paper is well-written and the figures are well laid out. The methods, and the rationale and logic for each experiment, are easy to follow. The introduction sets the scene well, and the discussion is appropriate. The summary sentences throughout the text help the reader.

      Weaknesses:

      The ability to hover is described as useful for either feeding or mating. However, several of the North European species studied here would not use hovering for feeding, as they tend to land on the flowers that they feed from. I would therefore argue that the main selection pressure for hovering ability could be courtship and mating. If the authors disagree with this, they could back up their claims with the literature. On that note, a weakness of this paper is that the data for both sexes are merged. If we agree that hovering may be a sexually dimorphic behaviour, then merging flight dynamics from males and females could be an issue in the interpretation. I understand that separating males from females in the movies is difficult, but this could be addressed in the Discussion, to explain why you do not (or do) think that this could cause an issue in the interpretation.

      The flight arena is not very big. In my experience, it is very difficult to get hoverflies to fly properly in smaller spaces, and definitely almost impossible to get proper hovering. Do you have evidence that they were flying "normally" and not just bouncing between the walls? How long was each 'flight sequence'? You selected the parts with the slowest flight speed, presumably to get as close to hovering as possible, but how sure are you that this represented proper hovering and not a brief slowdown of thrust?

      Your 8 species are evolutionarily well-spaced, but as they were all selected from a similar habitat (your campus), their ecology is presumably very similar. Can this affect your interpretation of your data? I don't think all 6000 species of hoverflies could be said to have similar ecology - they live across too many different habitats. For example, on line 541 you say that wingbeat kinematics were stable across hoverfly species. Could this be caused by their similar habitat?

    2. Reviewer #2 (Public Review):

      Summary

      Le Roy et al quantify wing morphology and wing kinematics across eight hoverfly species that differ in body mass; the aim is to identify how weight support during hovering is ensured. Wing shape and relative wing size vary significantly with body mass, but wing kinematics are reported to be size-invariant. On the basis of these results, it is concluded that weight support is achieved solely through size-specific variations in wing morphology and that these changes enabled hoverflies to decrease in size throughout their phylogenetic history. Adjusting wing morphology may be preferable compared to the alternative strategy of altering wing kinematics, because kinematics may be under strong evolutionary and ecological constraints, dictated by the highly specialised flight and ecology of the hoverflies.

      Strengths

      The study deploys a vast array of challenging techniques, including flight experiments, morphometrics, phylogenetic analysis, and numerical simulations; it thus illustrates both the power and beauty of an integrative approach to animal biomechanics. The question is well motivated, the methods appropriately designed, and the discussion elegantly and convincingly places the results in broad biomechanical, ecological, evolutionary, and comparative contexts.

      Weaknesses

      (1) In assessing evolutionary allometry, it is key to identify the variation expected from changes in size alone. The null hypothesis for wing morphology is well-defined (isometry), but the equivalent predictions for kinematic parameters remain unclear. Explicit and well-justified null hypotheses for the expected size-specific variation in angular velocity, angle-of-attack, stroke amplitude, and wingbeat frequency would substantially strengthen the paper, and clarify its evolutionary implications.

      (2) By relating the aerodynamic output force to wing morphology and kinematics, it is concluded that smaller hoverflies will find it more challenging to support their body mass - a scaling argument that provides the framework for this work. This hypothesis appears to stand in direct contrast to classic scaling theory, where the gravitational force is thought to present a bigger challenge for larger animals, due to their disadvantageous surface-to-volume ratios. The same problem ought to occur in hoverflies, for wing kinematics must ultimately be the result of the energy injected by the flight engine: muscle. Much like in terrestrial animals, equivalent weight support in flying animals thus requires a positive allometry of muscle force output. In other words, if a large hoverfly is able to generate the wing kinematics that suffice to support body weight, an isometrically smaller hoverfly should be, too (but not vice versa). Clarifying the relation between the scaling of muscle force input, wing kinematics, and weight support would resolve the conflict between these two contrasting hypotheses, and considerably strengthen the biomechanical motivation and interpretation.
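
      The two contrasting scaling arguments in point (2) can be written out as a short sketch. It assumes a standard quasi-steady force model and strict isometry; the symbols (characteristic length \ell, wing angular velocity \omega, second moment of wing area S_2, force coefficient C_F) are notation chosen here for illustration and are not taken from the manuscript.

```latex
\begin{align}
F_{\text{aero}} &\propto \rho\, C_F\, \omega^{2} S_{2},
  \qquad S_{2} = \int_{0}^{R} c(r)\, r^{2}\, \mathrm{d}r \;\propto\; \ell^{4} \\
W &= m g \;\propto\; \ell^{3} \\
\frac{F_{\text{aero}}}{W} &\propto \omega^{2}\, \ell \;\propto\; \omega^{2}\, m^{1/3} \\
\frac{F_{\text{muscle}}}{W} &\propto \frac{\ell^{2}}{\ell^{3}} = \ell^{-1} \;\propto\; m^{-1/3}
\end{align}
```

      With size-invariant kinematics (constant \omega), the third line says that the aerodynamic force-to-weight ratio falls as size decreases, which is the manuscript's framing. The fourth line, which assumes muscle force scales with cross-sectional area, gives the classic result that weight support becomes harder for larger animals. Reconciling the two requires stating how \omega is expected to scale with muscle output.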

      (3) The main conclusion - that evolutionary miniaturization is enabled by changes in wing morphology - is only weakly supported by the evidence. First, although wing morphology deviates from the null hypothesis of isometry, the difference is small, and hoverflies about an order of magnitude lighter than the smallest species included in the study exist. Including morphological data on these species, likely accessible through museum collections, would substantially enhance the confidence that size-specific variation in wing morphology occurs not only within medium-sized but also in the smallest hoverflies, and has thus indeed played a key role in evolutionary miniaturization. Second, although wing kinematics do not vary significantly with size, clear trends are visible; indeed, the numerical simulations revealed that weight support is only achieved if variations in wing beat frequency across species are included. A more critical discussion of both observations may render the main conclusions less clear-cut, but would provide a more balanced representation of the experimental and computational results.

      In many ways, this work provides a blueprint for work in evolutionary biomechanics; the breadth of both the methods and the discussion reflects outstanding scholarship. It also illustrates a key difficulty for the field: comparative data is challenging and time-consuming to procure, and behavioural parameters are characteristically noisy. Major methodological advances are needed to obtain data across large numbers of species that vary drastically in size with reasonable effort, so that statistically robust conclusions are possible.

    3. Reviewer #3 (Public Review):

      The paper by Le Roy and colleagues seeks to ask whether wing morphology or wing kinematics enable miniaturization in an interesting clade of agile flying insects. Isometry argues that insects cannot maintain both the same kinematics and the same wing morphology as body size changes. This raises a long-standing question of which varies allometrically. The authors do a deep dive into the morphology and kinematics of eight specific species across the hoverfly phylogeny. They show broadly that wing kinematics do not scale strongly with body size, but several parameters of wing morphology do in a manner different from isometry leading to the conclusion that these species have changed wing shape and size more than kinematics. The authors find no phylogenetic signal in the specific traits they analyze and conclude that they can therefore ignore phylogeny in the later analyses. They use both a quasi-steady simplification of flight aerodynamics and a series of CFD analyses to attribute specific components of wing shape and size to the variation in body size observed. However, the link to specific correlated evolution, and especially the suggestion of enabling or promoting miniaturization, is fraught and not as strongly supported by the available evidence.

      The aerodynamic and morphological data collection, modeling, and interpretation are very strong. The authors do an excellent job combining a highly interpretable quasi-steady model with CFD and geometric morphometrics. This allows them to directly parse out the effects of size, shape, and kinematics.

      Despite the lack of a relationship between wing kinematics and size, there is a large amount of kinematic variation across the species and individual wing strokes. The absolute differences in Figure 3F - I could have a very large impact on force production but they do indeed not seem to change with body size. This is quite interesting and is supported by aerodynamic analyses.

      The authors switch between analyzing their data based on individuals and based on species. This creates some pseudoreplication concerns in Figures 4 and S2 and it is confusing why the analysis approach is not consistent between Figures 4 and 5. In general, the trends appear to be robust to this, although the presence of one much larger species weighs the regressions heavily. Care should be taken in interpreting the statistical results that mix intra- and inter-specific variation in the same trend.
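
      One common way to avoid the pseudoreplication described above is to collapse individuals to a single value per species before fitting the interspecific regression. The following is a minimal sketch of that idea, not the authors' analysis; the function names and the toy data are hypothetical.

```python
import numpy as np

def species_means(species, log_mass, trait):
    """Collapse individual measurements to one mean point per species."""
    species = np.asarray(species)
    log_mass = np.asarray(log_mass, dtype=float)
    trait = np.asarray(trait, dtype=float)
    labels = np.unique(species)  # sorted unique species labels
    mean_mass = np.array([log_mass[species == s].mean() for s in labels])
    mean_trait = np.array([trait[species == s].mean() for s in labels])
    return labels, mean_mass, mean_trait

def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    slope, _intercept = np.polyfit(x, y, 1)
    return slope

# Hypothetical data: two individuals for each of three species.
species = ["a", "a", "b", "b", "c", "c"]
log_mass = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0]
trait = [0.0, 2.0, 2.0, 4.0, 4.0, 6.0]

labels, mean_mass, mean_trait = species_means(species, log_mass, trait)
slope_between_species = ols_slope(mean_mass, mean_trait)
```

      The slope estimate may be similar to the pooled fit, but the sample size for inference is the number of species (here 3), not the number of individuals (here 6).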

      The authors based much of their analyses on the lack of a statistically significant phylogenetic signal. The statistical power for detecting such a signal is likely very weak with 8 species. Even if there is no phylogenetic signal in specific traits, that does not necessarily mean that there is no phylogenetic impact on the covariation between traits. Many comparative methods can test the association of two traits across a phylogeny (e.g. a phylogenetic GLM) and a phylogenetic PCA would test if the patterns of variation in shape are robust to phylogeny.
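
      As an illustration of testing trait covariation while accounting for phylogeny, the sketch below fits a generalised least-squares regression in which the error covariance is set by shared branch lengths, which is the form a phylogenetic regression takes under a Brownian motion assumption. The four-species covariance matrix is hypothetical and is not derived from the hoverfly tree; established comparative-methods packages would normally be used in practice.

```python
import numpy as np

def gls_fit(x, y, cov):
    """Generalised least squares for y = a + b*x with error covariance `cov`.

    Returns the coefficient vector [intercept, slope].
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = np.asarray(cov, dtype=float)
    X = np.column_stack([np.ones_like(x), x])  # design matrix: intercept, slope
    cov_inv_X = np.linalg.solve(cov, X)        # C^-1 X without forming C^-1
    cov_inv_y = np.linalg.solve(cov, y)        # C^-1 y
    return np.linalg.solve(X.T @ cov_inv_X, X.T @ cov_inv_y)

# Hypothetical covariance for a tree ((A,B),(C,D)): off-diagonal entries are
# the shared branch lengths between species pairs, diagonals the total depth.
C_phylo = np.array([
    [1.0, 0.6, 0.0, 0.0],
    [0.6, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
])

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x  # noise-free toy trait, so any valid fit recovers [1, 2]

beta_phylo = gls_fit(x, y, C_phylo)
beta_ols = gls_fit(x, y, np.eye(4))  # identity covariance reduces to OLS
```

      With an identity covariance the estimate reduces to ordinary least squares, so comparing the two fits on real data shows how much the inferred relationship depends on the assumed tree.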

      The analysis of miniaturization on the broader phylogeny is incomplete. The conclusion that hoverflies tend towards smaller sizes is based on an ancestral state reconstruction. This is difficult to assess because of some important missing information. Specifically, such reconstructions depend on branch lengths and the model of evolution used, which were not specified. It was unclear how the tree was time-calibrated. Most often, ancestral state reconstructions utilize a maximum likelihood estimate based on a Brownian motion model of evolution, but this would be at odds with the hypothesis that the clade is miniaturizing over time. Indeed, such an analysis will be biased to look like it produces a lot of changes towards smaller body size if there is one very large taxon, because this will heavily weight the internal nodes. Even within this analysis, there is little quantitative support for the conclusion of miniaturization, and the discussion is restricted to a general statement about more recently diverged species. Such analyses are better supported by phylogenetic tests of directedness in the trait over time, such as fitting a model with an adaptive peak or others.

      Setting aside whether the clade as a whole tends towards smaller size, there is a further concern about the correlation of variation in wing morphology and changes in size (and the corresponding conclusion about lack of co-evolution in wing kinematics). Showing that there is a trend towards smaller size and a change in wing morphology does not test explicitly that these two are correlated with the phylogeny. Moreover, the subsample of species considered does not appear to recapitulate the miniaturization result of the larger ancestral state reconstruction.

      Given the limitations of the phylogenetic comparative methods presented, the authors did not fully support the general conclusion that changes in wing morphology, rather than kinematics, correlate with or enable miniaturization. The aerodynamic analysis across the 8 species does however hold significant value and the data support the conclusion as far as it extends to these 8 species. This is suggestive but not conclusive that the analysis of consistent kinematics and allometric morphology will extend across the group and extend to miniaturization. Nonetheless, hoverflies face many shared ecological pressures on performance and the authors summarize these well. The conclusions of morphological allometry and conserved kinematics are supported in this subset and point to a clade-wide pattern without having to support an explicit hypothesis about miniaturization.

      The data and analyses on these 8 species provide an important piece of work on a group of insects that are receiving growing attention for their interesting behaviors, accessibility, and ecologies. The conclusions about morphology vs. kinematics provide an important piece to a growing discussion of the different ways in which insects fly. Sometimes morphology varies, and sometimes kinematics depending on the clade, but it is clear that morphology plays a large role in this group. The discussion also relates to similar themes being investigated in other flying organisms. Given the limitations of the miniaturization analyses, the impact of this study will be limited to the general question of what promotes or at least correlates with evolutionary trends towards smaller body size and at what phylogenetic scale body size is systematically decreasing.

      In general, there is an important place for work that combines broad phylogenetic comparison of traits with more detailed mechanistic studies on a subset of species, but a lot of care has to be taken about how the conclusions generalize. In this case, since the miniaturization trend does not extend to the 8 species subsample of the phylogeny and is only minimally supported in the broader phylogeny, the paper warrants a narrower conclusion about the connection between conserved kinematics and shared life history/ecology.

    4. Author response:

      We thank the reviewers for their highly valuable comments and recommendations on our manuscript. We particularly appreciate receiving reviews from three distinct points of view, all highly relevant to our study (i.e. from an ecological, biomechanics, and evolutionary biology perspective).

      We will now carefully address all reviewer comments and questions, and resubmit a revised version in due time. Again, we thank the reviewers for their rigorous assessment of our study, which will greatly help us improve our manuscript.

    1. eLife assessment

      This article reports an important bioluminescence-based reporter system to evaluate kinase conformations. This assay is applied to four different kinases that have unique, very special regulatory features, thereby indicating that the assay can be used to provide convincing evidence on the conformational state of a large number of kinases. This paper will be of interest to researchers working on kinases and their conformational states.

    2. Reviewer #1 (Public Review):

      Summary:

      This technical report by Kugler et al. expands the application of a fluorescence-based reporter to study the conformational state of various kinases. This reporter, named KinCon (Kinase Conformation), interrogates the conformational state of a kinase (i.e., active vs. inactive) based on engineering complementary fusion proteins that fluoresce upon interaction. This assay has several advantages as it allows studying full-length kinases, that is, the kinase domain and regulatory domains, inside the cell and under various experimental conditions such as the presence of inhibitors or activator proteins, and in wildtype and mutants involved in disease states.

      Strengths:

      One major strength of this study is that it is quite comprehensive. The authors use KinCon for four different kinases: BRAF, LKB1, RIP and CDK4/6. These kinases have very different regulatory elements and associated proteins, which the authors explore to study their conformational state. Moreover, they use small molecule inhibitors or mutations to further dissect the conformational state of the kinase in disease states. The collective set of results strongly suggests that KinCon is a versatile tool that can be used to study many kinases of biomedical and fundamental importance. Given that kinases are extensively studied by researchers in academia and industry, KinCon could have a broad impact as well.

      Weaknesses:

      This manuscript, however, also has several weaknesses that I outline below. These weaknesses decrease the overall level of impact of the manuscript, as is.

      • The manuscript is exceedingly long. For instance, the introduction provides background information for each kinase that is further expanded in the results section. I think the background information for each kinase in the Introduction and Results sections can be significantly reduced to highlight the major points. Otherwise, not only does the manuscript become too long, but also the main points get diluted.

      • Similarly, the figure legends are very long, providing information that is already in the main text or in Methods. The authors should provide the essential information to understand the figure.

      • A major concern throughout the manuscript is the use of the word "dynamics," which is used in the text in various contexts. The authors should clarify what they understand by dynamics of conformation. Are they measuring the time-dependent process by which the kinase interconverts between active and inactive states? It seems to me that the assays in this report evaluate a population of kinases that are in an open or closed conformation (i.e., a particular state in each experimental condition), but there is no direct information on how the kinase goes from one state to the other. In that sense, the use of dynamics is unclear. Also, the use of dynamics in different sentences is ambiguous. Here are a few examples, but this should be revised throughout the manuscript:

      - Line 27: dynamics of full-length protein kinases. Does this refer to the dynamics of conformational interconversion between inactive and active states?

      - Line 138: dynamic functioning of kinases. It is not clear what this means.

      - Line 276: ... alters KinCon dynamics. Not clear if they are measuring a time-dependent process or a single point.

      - Figure legend 4F: dynamics of CDK4/6 reporters. Again, not clear how the assay is measuring dynamics.

      Nonetheless, in my opinion the authors use proper terminology that describes their assay in which the term dynamics is not used: Title (... impact of protein and small molecule interactions on kinase conformations) and Line 89 (... reporter can be used to track conformational changes of kinases...)

      • The authors state that KinCon has predictive capabilities (abstract and line 142). What do the authors mean by this?

      • The authors indicate that KinCon is a highly sensitive assay. Can the authors elaborate on what high sensitivity means? For example, can they discuss how other fluorescence-based approaches that are less sensitive would not be able to accomplish the same type of results or derive similar conclusions? Can they provide a resolution metric both in space and time? Given that the authors state that this is a technical report, this information is of relevance.

      • The authors nicely describe how KinCon works in Figure 1B and part of 1C. I do think that the bottom of panel 1C needs to be revised, as well as the text describing the potential scenarios of potency, efficacy and synergism.

      - One issue with this part of Figure 1C is that it is not clear what the x-axis in the 3 plots refers to. Is this time? Is this concentration of a small molecule, inhibitor or binding partner? This was confusing also in the context of the term dynamics used throughout the text. The terms potency, efficacy and synergism should be subtitles, or the panels and the x-axis should be better defined, especially for a non-specialized reader.

      - Related to this part of Figure 1C is the text. The authors mention potency, effectiveness and synergy (Line 195). Can the authors use more fundamental terminology related to these three scenarios, for example, changes in activation constant, or percent of protein activated? Also, why is synergy only related to effectiveness? Can synergy also be associated with potency?

      - Lastly, the use of these three cartoons gives the impression that the experimental results to come will follow a similar representation. Instead, the results are presented in bar plots for many different conditions. I think this will lead to confusion for a broad audience.

      • For a non-expert reader, can the authors clarify the use of tracking basal conformations vs. transient over-expression of the various KinCon constructs? Moreover, the authors use the term transient over-expression for 10, 16, 24 and 48 h (Line 203). This, to a non-expert reader, seems not transient.

      • Regarding Figure 1E and similar graphical representations: Why is the signal (RLU) non-linear with time? If the fluorescence of the KinCon construct is linearly related with its expression or concentration inside the cell, one would expect a linear increase. Have the authors plotted RLU/Expression band intensity to account for changes in protein concentration? For instance, some of the results within Figure 3 are normalized to concentration on the reporter expression level.

      • For the results with LKB1, the authors claim that the intermediate fold change in fluorescence (Figure 2E) is due to a partially closed intermediate state (Line 262). Can the authors rule out the possibility that there is a change in the populations of active and inactive states that, on average, gives intermediate values?

      • The authors claim in Line 274 that mutations located at the interface of the LKB1/STRADalpha complex affect interactions and hypothesize that allosteric communication between LKB1 and STRADalpha is essential for function. Given that these mutations are at the interaction interface, why would the authors postulate an allosteric mechanism that evokes an effect distant from the interaction/active site? Could it be that function requires surface contacts alone that are disrupted by the mutations?

      • I was unable to find text to explain the following: Figure 2I shows the mutation R74A as n.s., but in the text only W308C is mentioned as not changing fluorescence. Could the authors clarify why R74A is not discussed in the text? Maybe this reviewer missed the text in which it was discussed. Similarly, the authors state in line 326 that the study included an analysis of RIPK2. However, I was unable to find results, graphs or additional text discussing RIPK2.

      • Some figures of RLU use absolute values, percentages and fold change. Is there a reason why the authors use different Y-axis values? These should be explained and justified in Methods. Similarly, bars for wt in Figures 3D, G, or 4D, E,F show no errors. How are the authors normalizing the data and repeats so that there is no error, and are they treating the rest of the data (i.e., mutants and/or treated with small molecules) in the same way?

      • Lastly, the section starting in Line 472 reads more like a discussion of results from the different types of inhibitors used in this study than like results in their own right. The authors should consider a new subtitle as results or make this section a discussion.

    3. Reviewer #2 (Public Review):

      Summary:

      Protein kinases have been very successfully targeted with small molecules for several decades, with many compounds (including clinical drugs) bringing about conformational changes that are also relevant to broader interactions with the cellular signaling networks that they control. The authors set out to develop a targeted biosensor approach to evaluate distinct kinase conformations in cells for multiple kinases in the context of incoming signals, other proteins and small molecule binding, with a broad goal of using the KinCon assay to confirm (and perhaps predict) how drug binding or signal perception changes conformations and outputs in the presence of cellular complexes; this work will likely impact on the field with cellular reporters of kinase conformations a useful addition to the toolbox.

      Strengths:

      The KinCon reporter platform has previously been validated for well-known kinases; in this study, the team evaluate how to employ a full-length kinase (often containing a known pathological mutation). The sensitive detection method is based on a Renilla luciferase (RLuc) protein fragment complementation assay, where individual RLuc fragments are present at the N and the C terminus of the kinase. This report, which is both technical and practical in nature, co-expresses the kinase with known interactors (at low levels) in a high throughput format and then performs pharmacological evaluation with known small molecule kinase modulators. This is explained nicely in Figure 1, as are the signaling pathways that are being evaluated. Data demonstrate that V600E BRAF exposed to vemurafenib is converted to the inactive conformation, as expected. In contrast, the more closed STRADα and LKB1 KinCon conformations appear to represent the more active state of the complexed kinase, and a W308C mutation (evaluated alongside others) reverses this effect. The authors then evaluated necroptotic signaling in the context of RIPK1/3 under conditions where RIPK1 and RIPK3 are active, confirming that the reporters highlight the active states of both kinases. Exposure to compounds that are known to engage with the RIPK1 arm of the pathway induces bioluminescence changes consistent with the opening (inactivation) of the kinase. Finally, the authors move to an important drug target for which clinical drugs have arrived relatively recently; the CDK4/6 complexes. These are of additional importance because kinase-independent functions also exist for CDK6, and the effects of drugs in cells usually rely on a downstream marker, rather than demonstration of direct protein complex engagement. The data presented are interpreted as the formation of complexes with the CDK inhibitor p16INK4a; reducing the affinity of the interaction through mutations drives an inactive conformation, whilst the application of CDK4/6 inhibitors does not, implying binding to the active conformation.

      Weaknesses:

      (1) The work is very solid, and uses examples from the literature and also extends into new experimental space. An obvious weakness is mentioned by the authors for the CDK data, in that measurements with Cyclin D (the activating subunit) are not characterised, although Cyclin D might be assumed to be present?

      (2) The work with the trimeric LKB1 complex involves the pseudokinase STRADalpha, whose conformation is also examined as a function of LKB1 status; since STRAD is an activator of LKB1, a future goal should be the evaluation of the complex in the presence of STRAD inhibitory/activating small molecules.

    4. Author response:

      The following is the authors’ response to the original reviews.

      We would like to thank you and the two Reviewers for the thoughtful evaluation of the manuscript and the support for publication. We have addressed all points raised by the two Reviewers.

      - We have extensively streamlined the manuscript. Repetitive passages regarding the respective kinase cascades have been removed.

      - We improved the presentation of the main Figures (mainly labeling and font size):

      - Figure 1: C, D, E, F

      - Figure 2: C, E, F, G, I

      - Figure 3: D

      - Figure 4: F

      - Figure 5: A, B, C, D, E

      - We integrated new SI-data related to kinase functions, expression and the ‘cell-type comparisons’ of the KinCon reporter system (Figure Supplement 4, 5).

      Below you will find a detailed point-by-point response.

      Reviewer #1 (Recommendations For The Authors):

      Regarding the issue of the use of the word "dynamics," as described in the public review, here are a few examples of ambiguous use in different sentences:

      - Line 27: dynamics of full-length protein kinases. Is this referring to the dynamics of conformational interconversion between inactive and active states?

      - Line 138: dynamic functioning of kinases. It is not clear what this means.

      - Line 276: ... alters KinCon dynamics. Not clear if they are measuring time-dependent process or a single point.

      - Figure legend 4F: dynamics of CDK4/6 reporters. Again, not clear how the assay is measuring dynamics.

      In my opinion, the authors use proper terminology that describes their assay in which the term dynamics is not used: Title: "... impact of protein and small molecule interactions on kinase conformations" and Line 89 "... reporter can be used to track conformational changes of kinases...".

      We have replaced the “dynamics” sections. 

      - Line 27: The understanding of the structural dynamics of…

      - Line 91: This reporter can be used to track dynamic changes of kinases conformations…

      - Line 139: Conventional methods often fall short in capturing the dynamics of kinases within their native cellular environments…

      - Line 146: Such insights into the molecular structure dynamics of kinases in intact cells…

      - Line 199: In order to enhance our understanding of kinase structure dynamics…

      - Line 276: These findings underline that indeed the trimeric complex formation alters….

      - Figure Legend 4F: Quantification of alterations of CDK4/6 KinCon reporter bioluminescence signals…

      The authors state that KinCon has predictive capabilities (abstract and line 142). What do the authors mean by this?

      Previously we have benchmarked the suitability of the KinCon reporter for target engagement assays of wt and mutated kinase activities. With this we determined specificities of melanoma drugs for mutated BRAF variants (Mayrhofer 2020, PNAS). 

      The authors indicate that KinCon is a highly sensitive assay. Can the authors elaborate on what high sensitivity means?  

      By sensitivity, we mean that we can detect conformational dynamics of the reporter at low expression levels of the hybrid protein expressed in the cell line of choice.

      - Line 209: Immunoblotting of cell lysates following luminescence measurements showed expression levels of the reporters in the range and below the endogenous expressed kinases (Figure 1E).  …

      - Line 219:   Using this readout, we showed that at expression levels of the BRAF KinCon reporter below the immunoblotting detection limit, one hour of drug exposure exclusively converted BRAF-V600E to the more closed conformation (Figure 1F, G, Figure Supplement 1B). 

      - Line 221: These data underline that at expression levels far below the endogenous kinase, protein activity conformations can be tracked in intact cells. …

      For example, can they discuss how other fluorescence-based approaches that are less sensitive would not be able to accomplish the same type of results or derive similar conclusions? Can they provide a resolution metric both in space and time? Given that the authors state that this is a technical report, this information is of relevance.

      We highlight the key pros & cons of the KinCon reporter technology in following sections:

      - Line 529: The KinCon technology, introduced here, seeks to address the previously mentioned challenges. It has the potential to become a valuable asset for tracking kinase functions in living cells which are hard to measure solely via phosphotransferase activities. Overall, it offers an innovative solution for understanding kinase activity conformations, which could pave the way for more novel intervention strategies for kinase entities with limited pharmaceutical targeting potential. So far, this relates to the tracking of kinase-scaffold and pseudo-kinase functions.

      - Line 535: Key advantages of the KinCon reporter technology is the robustness of the system to track kinase conformations at varying expression levels. However, in contrast to fluorescence-based reporter read-outs subcellular analysis and cell sorting are still challenging due to comparable low levels of light emission

      The authors nicely describe how KinCon works in Figure 1B and part of 1C. I do think that the bottom of panel 1C needs to be revised, as well as the text describing the potential scenarios of potency, efficacy, and synergism.

      One issue with this part of Figure 1C is that it is not clear what the x-axis in the 3 plots refers to. Is this time? Is this concentration of a small molecule, inhibitor, or binding partner? This was confusing also in the context of the term dynamics used throughout the text. The terms potency, efficacy, and synergism should be subtitles, or the panels and the x-axis should be better defined, especially for a non-specialized reader.

      Related to this part of Figure 1C is the text. The authors mention potency, effectiveness, and synergy (Line 195). Can the authors use more fundamental terminology related to these three scenarios, for example, changes in activation constant, and percent of protein activates? Also, why synergy is only related to effectiveness? Can synergy also be associated with potency?

      Thank you for bringing this up. We have revised Figure 1C to better reflect the mentioned effects of potency. To avoid confusion, we removed the illustration for drug synergism. Accordingly, we have integrated the axis descriptions for the presented dose-response curves.

      Thus, we have further streamlined the text in the introduction – examples are shown below:

      - Line 195: Light recordings and subsequent calculations of time-dependent dosage variations of bioluminescence signatures of parallel implemented KinCon configurations aid in establishing dose-response curves. These curves are used for discerning pharmacological characteristics such as drug potency, effectiveness of drug candidates, and potential drug synergies (Figure 1C)

      - Figure 1C:  Shown is the workflow for the KinCon reporter construct engineering and analyses using KinCon technology. The kinase gene of interest is inserted into the multiple cloning site of a mammalian expression vector which is flanked by respective PCA fragments (-F[1], -F[2]) and separated with interjacent flexible linkers. Expression of the genetically encoded reporter in indicated multi-well formats allows to vary expression levels and define a coherent drug treatment plan. Moreover, it is possible to alter the kinase sequence (mutations) or to co-express or knock-down the respective endogenous kinase, interlinked kinases or proteinogenic regulators of the respective pathway. After systematic administration of pathway modulating drugs or drug candidates, analyses of KinCon structure dynamics may reveal alterations in potency, efficacy, and potential synergistic effects of the tested bioactive small molecules (schematic dose response curves are depicted)
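
      The distinction between potency and efficacy in schematic dose-response curves can be illustrated with a Hill-type function. This is a generic sketch, not the authors' fitting procedure; the function name, parameter names, and numerical values are hypothetical. A potency change shifts the half-maximal concentration (EC50) along the dose axis, whereas an efficacy change alters the maximal response.

```python
import numpy as np

def hill(dose, ec50, top, bottom=0.0, n=1.0):
    """Hill-type dose-response curve.

    ec50   : dose giving the half-maximal response (potency)
    top    : maximal response at saturating dose (efficacy)
    bottom : response without drug
    n      : Hill coefficient (steepness)
    """
    dose = np.asarray(dose, dtype=float)
    return bottom + (top - bottom) * dose**n / (ec50**n + dose**n)

doses = np.array([0.0, 1.0, 10.0, 100.0, 1000.0])  # arbitrary units

reference = hill(doses, ec50=10.0, top=1.0)
more_potent = hill(doses, ec50=1.0, top=1.0)    # same plateau, curve shifted left
less_effective = hill(doses, ec50=10.0, top=0.5)  # same EC50, lower plateau
```

      Plotting the three arrays against `doses` on a logarithmic axis reproduces the kind of schematic described for Figure 1C, with the x-axis explicitly being drug concentration rather than time.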

      Lastly, the use of these three cartoons gives the impression that the experimental results to come will follow a similar representation. Instead, the results are presented in bar plots for many different conditions. I think this will lead to confusion for a broad audience.

      The bottom panel of Figure 1C is not the depiction of real experiments but rather an illustration of fitted dose-response curves. We would like to present previous demonstrations of dose-response curves using BRAF KinCon data and ERK phosphorylation (Röck 2019, Sci. Advances).

      We further agree with the reviewer and have therefore added a new part in the methods section addressing the evaluation of data extensively. 

      - Line 668: In Figure 1 E and F, a representative experiment of n=4 independent experiments is shown. In these cases, absolute bioluminescence values without any normalization are shown. Otherwise, data was indicated as RLU (relative light unit) fold change. This means the data was normalized on the indicated control condition (either with normalization of the western blot or without; as indicated).

      For a non-expert reader, can the authors clarify the use of tracking basal conformations vs. transient over-expression of the various KinCon constructs? Moreover, the authors use the term transient over-expression for 10, 16, 24, and 48 h (Line 203). This, to a non-expert reader, does not seem transient.

      We have revised the manuscript to clarify it:

      - Line 207: We showed that transient over-expression of these KinCon reporters for a time frame of 10h, 16h, 24h or 48h in HEK293T cells delivers consistently increasing signals for all KinCon reporters (Figure 1E, Figure Supplement 1A). 

      - Figure 1E) Representative KinCon experiments of time-dependent expressions of indicated KinCon reporter constructs in HEK293T cells are shown (mean ±SEM). Indicated KinCon reporters were transiently over-expressed in 24-well format in HEK293T cells for 10h, 16h, 24h and 48h each.

      Regarding Figure 1E and similar graphical representations: Why is the signal (RLU) nonlinear with time? If the fluorescence of the KinCon construct is linearly related to its expression or concentration inside the cell, one would expect a linear increase. Have the authors plotted RLU/Expression band intensity to account for changes in protein concentration? For instance, some of the results within Figure 3 are normalized to concentration on reporter expression level.

      Our intention was to show that varying expression levels can be used for the illustrated target engagement assays. Indeed, the represented elevations of RLU might be due to factors such as:

      - Doubling times of cells

      - Cell density

      - Media composition (which changes over time)

      - Reporter protein stabilities

      - Abundance of interactors of kinases

      For the results with LKB1, the authors claim that the intermediate fold change in fluorescence (Figure 2E) is due to a partially closed intermediate state (Line 262). Can the authors rule out the possibility that a change in the populations of active and inactive states gives intermediate values on average?

      Based on our experience with KinCon reporter conformation states of kinases we tested so far, we assume that the presented data reflects an intermediate state. We agree that it needs further validation. We have changed the text accordingly:

      - Line 264: Upon interaction with LKB1 this conformation shifts to a partially closed intermediate state.
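
      The alternative raised by the reviewer can be written as a two-state mixture. The symbols below are illustrative assumptions and do not come from the manuscript: S_open and S_closed are the signals of the fully open and fully closed populations, p is the closed fraction, and S_obs is the ensemble-averaged readout.

```latex
S_{\mathrm{obs}} = p\,S_{\mathrm{closed}} + (1 - p)\,S_{\mathrm{open}}, \qquad 0 \le p \le 1
\quad\Longrightarrow\quad
p = \frac{S_{\mathrm{obs}} - S_{\mathrm{open}}}{S_{\mathrm{closed}} - S_{\mathrm{open}}}
```

      Any observed value lying between the two limits is reproduced by some p, so an ensemble-averaged signal alone cannot separate a shifted population mixture from a single intermediate conformation.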

      The authors claim in Line 274 that mutations located at the interface of the LKB1/STRADalpha complex affect interactions and hypothesize that allosteric communication between LKB1 and STRADalpha is essential for function. Given that these mutations are at the interaction interface, why would the authors postulate an allosteric mechanism that evokes an effect distant from the interaction/active site? Could it be that function requires surface contacts alone that are disrupted by the mutations?

      We agree with the reviewer and changed our argumentation for this point:

      - Line 276: These findings underline that indeed the trimeric complex formation alters the opening and closing of the tested full-length kinase structures using the applied KinCon reporter read out

      I was unable to find text to explain the following: Figure 2I shows the mutation R74A as n.s., but in the text, only W308C is mentioned to not change fluorescence. Could the authors clarify why R74A is not discussed in the text?  Maybe this reviewer missed the text in which it was discussed.

      We adapted the manuscript and included the R74A mutation as follows:

      - Line 296: Among these mutations, only the W308C and R74A mutations prevented significant closing of the LKB1 conformation when co-expressed with STRAD𝛼 and MO25 (Figure 2I).

      In Figure 2I, the individual measurements of the LKB1-R74A KinCon are highlighted in red to better emphasize the deviations. In the case of the R74A mutation, the effect seen might be due to the high deviation between the experiments (highlighted in red). These deviations are much higher when compared to either the wt or the W308C mutant, and can also be seen in the LKB1-R74A-KinCon only condition (white). Even though no significant closing of the LKB1 conformation could be observed in the case of R74A, we believe that the effect is still there, since the trend of the conformation closing upon complex formation is still visible. Further replicates would be necessary to validate this theory.

      Similarly, the authors state in line 326 that the study included an analysis of RIPK2. However, I was unable to find results, graphs, or additional text discussing RIPK2.

      The RIPK2 conformation was analyzed in Figure 3C (page 12).

      Some figures of RLU use absolute values, percentages, and fold change. Is there a reason why the authors use different Y-axis values? These should be explained and justified in Methods. Similarly, bars for wt in Figures 3D, G, or 4D, E, F show no errors. How are the authors normalizing the data and repeats so that there is no error, and are they treating the rest of the data (i.e., mutants and/or treated with small molecules) in the same way?

      We have changed the Y-axis values. Now, throughout the manuscript, we show the RLU fold change. Exceptions are selected experiments in which only absolute RLU values are shown (such as Figure 1E, F). We have also decided to integrate a paragraph into the methods section (Line 655). Figure 3D was changed as well.

      - Line 668: In Figure 1 E and F, a representative experiment of n=4 independent experiments is shown. In these cases, absolute bioluminescence values without any normalization are shown. Otherwise, data was indicated as RLU fold change. This means the data was normalized on the indicated control condition (either with normalization of the western blot or without; as indicated).

      The data is generally normalized on wt or untreated conditions, when the cells were treated with small molecules for target engagement assays. 
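
      One way such a normalization can yield control bars without error is sketched below. This is an assumption about the procedure, not a description confirmed by the response: each independent experiment is divided by its own control value, so the control equals exactly 1 in every replicate and its SEM is zero. The function names and RLU values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def fold_change_per_experiment(experiments, control):
    """Divide every condition by the control value of the same experiment."""
    return [
        {condition: rlu / experiment[control] for condition, rlu in experiment.items()}
        for experiment in experiments
    ]


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    n = len(values)
    sem = stdev(values) / sqrt(n) if n > 1 else 0.0
    return mean(values), sem


# Hypothetical raw RLU values from three independent experiments.
raw = [
    {"wt": 1000.0, "mutant": 1500.0},
    {"wt": 2000.0, "mutant": 3400.0},
    {"wt": 1500.0, "mutant": 2400.0},
]

normalized = fold_change_per_experiment(raw, control="wt")
wt_mean, wt_sem = mean_sem([exp["wt"] for exp in normalized])
mut_mean, mut_sem = mean_sem([exp["mutant"] for exp in normalized])
print("wt:", wt_mean, wt_sem)
print("mutant:", mut_mean, mut_sem)
```

      With this scheme the variability between experiments is carried entirely by the non-control conditions, which is worth keeping in mind when comparing error bars across panels.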

      Lastly, the section starting in Line 472 reads more like a discussion of results from different types of inhibitors used in this study than results on its own. The authors should consider a new subtitle such as results or make this section a discussion.

      We agree with the reviewer and this part of the results was split into a new section of the results:

      - Line 455: “Effect of different kinase inhibitor types on the KinCon reporter system”.

      Reviewer #2 (Recommendations For The Authors):

      I have a few suggestions, since the paper is a distillation of a vast amount of work and tells a useful story.

      (1) The work is very solid, uses examples from the literature, and also extends into new experimental space. An obvious weakness is mentioned by the authors for the CDK data, in that measurements with Cyclin D (the activating subunit) are not characterized, although Cyclin D might be assumed to be present.

      We performed experiments with the CDK4/6 KinCon reporters and co-expressed CyclinD with a ratio of 1:3 (HEK293T cells, expression for 48h). However, in the context of inhibitor treatments we could not track conformation changes in these initial experiments. The cells were treated with the indicated CDK4/6i [1µM] for 3h. This seems to not impact the conformation of CDK4/6 wt or mutated KinCon reporters. There is a tendency that CyclinD co-expression promotes CDK4/6 conformation opening (data not shown).

      Author response image 1.

      Bioluminescence signal of CDK4/6 KinCon reporters with co-expressed CyclinD3 (HEK293T, expression for 48h) upon exposure to indicated CDK4/6i [1µM] or DMSO for 3h (mean ±SEM, n=3 ind. experiments). No significant changes using the current setting.

      (2) The work with the trimeric LKB1 complex involves the pseudokinase STRADalpha, whose conformation is also examined as a function of LKB1 status, since STRAD is an activator of LKB1. A future goal should be the evaluation of the complex in the presence of STRAD inhibitory/activating small molecules.

      Thank you for this great idea. We are currently compiling an FWF grant application to get support for such an R&D project.

      Minor points

      • Have any of the data been repeated in a different cell background? This came to mind because HeLa cells lack LKB1, which might be a useful place to test the LKB1 data in a different context.

      This experiment was performed and we show it in Figure Supplement 5. Further, we followed the advice of the reviewer and performed the suggested experiments. We integrated the colon cancer cell line SW480 into the experimental setup. Overall, three cell settings showed the same pattern of KinCon reporter analyses for LKB1-STRADα-MO25 complex formation utilizing the LKB1- and STRADα-KinCon reporters.

      • The study picks up the PKA Cushing's syndrome field, which makes sense, and data are presented for L206R. PMID 35830806 explains how different patient mutations drive different signaling outcomes through distinct complex formations, and it would be interesting to discuss how mutations in KinCon complexes, especially those with mutations, could affect sub-cellular localization. Could the authors explain if this was done for any of the proteins, whose low experimental expression is a clear advantage, but is presumably hard to maintain across experiments?

      The feedback of the reviewer motivated us to perform subcellular fractionation experiments. They were performed with PKAc wt and L206R KinCon reporters as well as BRAF wt and V600E reporters. We were not able to see major differences between the wt and mutated reporter constructs with respect to their nuclear:cytoplasmic localization (Figure Supplement 4). For your information, in an R&D project with the mitochondrial kinase PINK1 we see localization of the reporter, as expected, almost exclusively in the mitochondrial fraction.

      - Line 495: In this context of activating kinase mutations we showed that using PKAc (wt and L206R) and BRAF (wt and V600E) reporters as examples we could not track alterations of cytoplasmic and nuclear localization (Figure Supplement 4). Furthermore, subcellular localization of PKAc KinCon reporters did not change when the L206R mutant was introduced (Figure Supplement 4). As a control, BRAF wt and V600E KinCon reporters were used and also no changes in localization were observed.

      • I suggest changing PMs (Figure 2 and others) simply to mutation, I read this as plasma membrane constantly.

      We agree and we have changed it to “patient mutation” in Figure 2C, Figure 3E, Figure 4B.

    1. eLife assessment

      This study presents a predictive scoring system in DLBCL based on the expression of three tumour microenvironment-related genes. Such a scoring system seems useful for predicting tumour purity levels in DLBCL. The provided evidence showing an association between worse DLBCL prognosis and high-risk score is solid, but it is incomplete to draw a clear conclusion about the links between risk score and drug sensitivity.

    2. Reviewer #2 (Public Review):

      In this study, Zhenbang Ye and colleagues investigate the links between microenvironment signatures, gene expression profiles, and prognosis in diffuse large B-cell lymphoma (DLBCL). They show that increased tumor purity (ie, a higher proportion of tumor cells relative to surrounding stromal components) is associated with worse prognosis. They then show that three genes associated with tumor purity (VCAN, CD3G, and C1QB) correlate with patterns of immune cell infiltration and can be used to create a risk scoring system that predicts prognosis, which can be replicated by immunohistochemistry (IHC), and response to some therapies.

      (1) The two strengths of the study are its relatively large sample size (n = 190) and the strong prognostic significance of the risk scoring system. It is worth noting that the validation of this scoring with IHC, a simple technique already routinely used for the diagnosis and classification of DLBCL, increases the potential for clinical translation. However, the correlative nature of the study limits the conclusions that can be drawn in regards to links between the risk scoring system, the tumor microenvironment, and the biology of DLBCL.

      (2) The tumor microenvironment has been extensively studied in DLBCL and a prognostic implication has already been established (for instance, Steen et al., Cancer Cell, 2021). In addition, associations have already been established in non-Hodgkin lymphoma between prognosis and expression of C1QB (Rapier-Sharman et al., Journal of Bioinformatics and Systems Biology, 2022), VCAN (S. Hu et al., Blood, 2013), and CD3G (Chen et al., Medical Oncology, 2022). Nevertheless, one of the strengths and novel aspects of the study is the combination of these 3 genes into a risk score that is also valid by immunohistochemistry (IHC), which substantially facilitates a potential clinical translation.

      (3) Figures 1A-B: tumor purity is calculated using the ESTIMATE (Estimation of Stromal and Immune cells in Malignant Tumor tissues using Expression data) algorithm (Yoshihara et al., Nature Communications, 2013). The ESTIMATE algorithm is based on two gene signatures ("stromal" and "immune"). It is therefore expected that tumor purity measured by the ESTIMATE algorithm will correlate with the expression of multiple genes. Importantly, C1QB is included in the stromal signature of the ESTIMATE algorithm meaning that, by definition, it will be correlated with tumor purity in that setting.

      (4) Figure 2A: as established in figure 1C, high tumor purity is associated with worse prognosis. Later in the manuscript, it is also shown that C1QB expression is associated with worse prognosis. However, figure 2A shows that C1QB is associated with decreased tumor purity. It therefore makes it less likely that the prognostic role of C1QB expression is related to its impact on tumor purity. The prognostic impact could be related to different patterns of immune cell infiltration, as shown later. However, the evidence presented in the study is correlative in nature and not sufficient to draw this conclusion.

      (5) Figure 3G: although there is a strong prognostic implication of the risk score on prognosis, the correlation between the risk score and tumor purity is significant but not very strong (R = 0.376). It is therefore likely that other important biological factors explain the correlation between the risk score and prognosis, as suggested in the gene set enrichment analysis that is later performed.

      (6) Figure 6: the drug sensitivity analysis includes a wide range of established and investigational drugs with varied mechanisms of action. Although the difference in sensitivity between tumors with low and high risk scores shows statistical significance for certain drugs, the absolute difference appears small in most cases and is of unclear biological significance. In addition, even though the risk score is statistically related to drug sensitivity, there is no direct evidence that the differences in drug sensitivity are directly related to tumor purity.
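
      The strength of the correlation quoted in point (5) can be put in terms of shared variance. This assumes that R denotes a Pearson correlation coefficient, which the review does not state explicitly:

```latex
R = 0.376 \quad\Longrightarrow\quad R^{2} = 0.376^{2} \approx 0.141
```

      Under that assumption, only about 14% of the variance in one quantity is linearly accounted for by the other, consistent with the remark that other biological factors likely contribute.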

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The findings in this study are useful and may have practical implications for predicting DLBCL risk subject to further validating the bioinformatics outcomes. We found the approach and data analysis solid. However, some concerns regarding the drug sensitivity prediction and the links between the selected genes for the risk scores have been raised that need to be addressed by further functional works.

      Thank you for your positive assessment of our study. We searched the treatment information of the DLBCL patients in our own cohort; unfortunately, all patients were treated strictly according to the guidelines issued by the authorities of China, which suit Chinese patients well but do not include the drugs explored in the present study. Therefore, further investigations should be designed and conducted to validate our conclusion. Here, we provide a possible direction for future studies based on large cohorts, which could not only provide more reliable conclusions, but also draw more attention to the role of the tumor microenvironment in influencing outcome and drug sensitivity.

      Public Reviews:

      Sincere thanks for all reviewers’ positive comments on our study and their helpful recommendations for improving our manuscript. For this part, we have sorted out the comments and recommendations from all reviewers, and made corresponding revisions. And here are our responses.

      (1) How did we determine the three genes (VCAN, C1QB and CD3G) in the prognostic model?

      As mentioned under “Prognostic model” in the Materials and Methods section, the genes were selected using the “survival” package in R. After we obtained the nine genes, we input their expression values and analyzed them with the “survival” package in R. The function “step” in that package can optimize the model, that is, construct a model with as few factors as possible, so that the finally enrolled factors are representative and present the least collinearity. In this way, the prognostic model we obtained could be more practical in clinical practice.

      (2) Different centers have different protocols of IHC, so how could we put this model into clinical practice under this circumstance?

      Not only do different centers have different protocols, but materials such as antibodies also vary. Therefore, there is actually a long way to go in putting our study into clinical practice. As far as we’re concerned, there are at least three problems to solve. First, diagnostic antibodies, which usually show better specificity and sensitivity, should be used in clinical practice. This may be the reason why the staining of VCAN and C1QB was strong and difficult to differentiate. Second, a standardized protocol should be established. Last but not least, more precise analyses and studies should be conducted to make it clear which types of cells specifically express these genes (as mentioned by Reviewer #2). We are now endeavoring to solve these problems by utilizing as many techniques as possible, such as multi-omics and mIHC. From revealing the true expression pattern to developing high-quality antibodies and even a standardized test kit, we are looking forward to a clinical translation.

      (3) The analyses about immune infiltration and the key genes in DLBCL were superficial, limited within the correlation analyses.

      Because the model was constructed based on the tumor purity of DLBCL, the risk score could be associated with the enrichment of cell functions. We conducted GSEA analysis based on the differentially expressed genes between the high-risk group and the low-risk group in the two datasets (Figure 5H-I). It showed that extracellular organization and cellular adhesion differed between the two groups, through which immune infiltration and activity might be regulated owing to the motility of immune cells. In addition, we have validated the infiltration of M1 macrophages and M2 macrophages in our own cohort (Supplementary Figure 3P).

      (4) The drug sensitivity was analyzed based on the model only and should be validated in real-world research or laboratory studies. In addition, the sensitivity scores did not seem to differ much in most cases, even though the differences were statistically significant.

      We tried to search the treatment information of the DLBCL patients in our own cohort; unfortunately, all patients were treated strictly according to the guidelines issued by the authorities of China, which suit Chinese patients well but do not include the drugs explored in the present study. Therefore, further investigations should be designed and conducted to validate our conclusion. Here, we provide a possible direction for future studies based on large cohorts, which could not only provide more reliable conclusions, but also draw more attention to the role of the tumor microenvironment in influencing outcome and drug sensitivity. As for the differences between the high- and low-risk groups, a small dose of a drug can sometimes have a large effect, because the dose-effect curve is usually nonlinear. Therefore, by reducing the dose, even by just 1%, adverse effects could be avoided. To sum up, the drug sensitivity analyses in our study could provide more possibilities for clinical trials and practice, and we are considering how to design reasonable clinical research.

      (5) C1QB was associated with decreased tumor purity and worse prognosis, but decreased tumor purity was related to better prognosis. How can this contradiction be explained?

      As discussed in the Discussion section, previous studies have revealed the role of C1QB in promoting an immunosuppressive microenvironment in cancer (see references 22-26). C1QB might promote the infiltration of pro-tumor immune cells, which on its own would reduce tumor purity. However, the immune microenvironment is regulated by multiple factors that form a network and counteract or synergize with each other. Statistical analysis often reveals a possible phenomenon but cannot provide a mechanistic explanation. Therefore, more mechanistic studies are needed to reveal the connections and key nodes. This is exactly what we will explore next.

      (6) Others:

      (1) Line 51 has been rewritten.

      (2) References for the ESTIMATE algorithm (reference 16) and CD3G+ T cells (reference 17) have been added.

      (3) The illegible figure labels might be caused by the incompatibility between the PDF file we submitted and the submission system. We have provided the TIFF images in this revision, and the EPS file could be submitted to editors upon their requests.

      (4) A supplement description has been added to the Figure legend of Figure 6 to make it clear.

      (5) In order to explore the expression of key genes among different locations of DLBCL, we performed the analyses in Figure 5 and Supplementary Figure 3. These results might be thought-provoking in that the tumor microenvironment differs among DLBCLs even though they share similar histological characteristics.

    1. eLife assessment

      This paper describes an important software framework for the curation, retrieval, and analysis of ancient human genomic data and their associated metadata, overcoming long-standing coordination and harmonization issues in ancient human genomics. The resource is built on compelling and sometimes exceptional principles of software engineering and reproducibility, and the authors make an excellent case that their resource will be of practical use to many researchers studying human history using DNA. The main issues include natural uncertainties regarding future funding and maintenance of this resource, as well as deviation from established standards in other areas of genomics.

    2. Reviewer #1 (Public Review):

      The authors describe a framework for working with genotype data and associated metadata, specifically geared towards ancient DNA. The Poseidon framework aims to address long-standing data coordination issues in ancient population genomics research. These issues can usefully be thought of as two primary, separate problems:

      (1) The genotype merging problem. Often, genotype calls made by a new study are not made publicly available, or they are only made available in an ad-hoc fashion without consistency in formatting between studies. Other users will typically want to combine genotypes from many previously published studies with their own newly produced genotypes, but a lack of coordination and standards means that this is challenging and time-consuming.

      (2) The metadata problem. All genomes need informative metadata to be usable in analyses, and this is even more true for ancient genomes which have temporal and often cultural dimensions to them. In the ancient DNA literature, metadata is often only made available in inconsistently formatted supplementary tables, such that reuse requires painstakingly digging through these to compile, curate and harmonise metadata across many studies.

      Poseidon aims to solve both of these problems at the same time, and additionally provide a bit of population genetics analysis functionality. The framework is a quite impressive effort, that clearly has taken a lot of work and thought. It displays a great deal of attention to important aspects of software engineering and reproducibility. How much usage it will receive beyond the authors themselves remains to be seen, as there is always a barrier to entry for any new sophisticated framework. But in any case, it clearly represents a useful contribution to the human ancient genomics community.

      The paper is quite straightforward in that it mainly describes the various features of the framework, both the way in which data and metadata are organised, and the various little software tools provided to interact with the data. This is all well-described and should serve as a useful introduction for any users of the framework, and I have no concerns with the presentation of the paper. Perhaps it gets a bit too detailed for my taste at times, but it's up to the authors how they want to write the paper.

      I thus have no serious concerns with the paper. I do have some thoughts and comments on the various choices made in the design of the framework, and how these fit into the broader ecosystem of genomics data. I wouldn't necessarily describe much of what follows as criticism of what the authors have done - the authors are of course free to design the framework and software that they want and think will be useful. And the authors clearly have done more than basically anyone else in the field to tackle these issues. But I still put forth the points below to provide some kind of wider discussion within the context of ancient genomics data management and its future.

      * * *

      The authors state that there is no existing archive for genotype data. This is not quite true. There is the European Variation Archive (EVA, https://www.ebi.ac.uk/eva/), which allows archiving of VCFs and is interlinked to raw data in the ENA/SRA/DDBJ. If appropriately used, the EVA and associated mainstream infrastructure could in principle be put to good use by the ancient genomics community. In practice, it's basically not used at all by the ancient genomics community, and partly this is because EVA doesn't quite provide exactly what's needed (in particular with regards to metadata fields). Poseidon aims to provide a much more custom-tailored solution for the most common use cases within the human ancient DNA field, but it could be argued that such a solution is only needed because the ancient genomics community has largely neglected the mainstream infrastructure. In some sense, by providing such a custom-tailored solution that is largely independent of the mainstream infrastructure, I feel like efforts such as Poseidon (and AADR) - while certainly very useful - might risk contributing to further misaligning the ancient genomics community from the rest of the genomics community, rather than bringing it closer. But the authors cannot really be blamed for that - they are simply providing a resource that will be useful to people given the current state of things.

      The BioSamples database (https://www.ebi.ac.uk/biosamples/) is an attempt to provide universal sample IDs across the life sciences and is used by the archives for sequence reads (ENA/SRA/DDBJ). Essentially every published ancient sample already has a BioSample accession, because this is required for the submission of sequence reads to ENA/SRA/DDBJ. It would thus have seemed natural to make BioSamples IDs a central component of Poseidon metadata, so as to anchor Poseidon to the mainstream infrastructure, but this is not really done. There are some links being made to ENA in the .ssf "sequence source" files used by the Poseidon package, including sample accessions, but this seems more ad-hoc.

      The package uses PLINK and EIGENSTRAT file formats to represent genotypes, which in my view are not particularly good formats for long-term and rigorous data management in genomics. These file formats cannot appropriately represent multiallelic loci, haplotype phase, or store information on genotype qualities, coverage, etc. The standard in the rest of genomics is VCF, a much more robust and flexible format with better software built around it. Insisting on continuing to use these arguably outdated formats is one way in which the ancient genomics community risks misaligning itself from the mainstream.

      I could not find any discussion of reference genomes: knowing the reference genome coordinate system is essential to using any genotype file. For comparison, in the EVA archive, every VCF dataset has a "Genome Assembly" metadata field specifying the accession number of the reference genome used. It would seem to me like a reference genome field should be part of a Poseidon package too. In practice, the authors likely use some variant of the hg19 / GRCh37 human reference, which is still widely used in ancient genomics despite being over a decade out of date. Insisting on using an outdated reference genome is one way in which the ancient genomics community is misaligning itself from the mainstream, and it complicates comparisons to data from other sub-fields of genomics.

      A fundamental issue contributing to the genotype merging problem, not unique to ancient DNA, is that genotype files are typically filtered to remove sites that are not polymorphic within the given study - this means that files from two different studies will often contain different and not fully overlapping sets of sites, greatly complicating systematic merging. I don't see any discussion of how Poseidon deals with this. In practice, it seems the authors are primarily concerned with data on the commonly used 1240k array set, such that the set of SNPs is always well-defined. But does Poseidon deal with the more general problem of non-overlapping sites between studies, or is this issue simply left to the user to worry about? This would be of relevance to whole-genome sequencing data, and there are certainly plenty of whole-genome datasets of great interest to the research community (including archaic human genomes, etc).
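
      The non-overlapping sites issue can be illustrated with a small sketch. It is purely hypothetical and does not describe how Poseidon or its tools merge data: the data layout (site ID mapped to per-sample alternate-allele counts), the function name, and the example values are assumptions made for illustration, and the two studies are assumed to have disjoint sample names.

```python
MISSING = None  # placeholder for "no call available at this site"


def merge_genotypes(study_a, study_b, mode="intersect"):
    """Merge two {site_id: {sample: genotype}} tables.

    mode="intersect" keeps only sites present in both studies;
    mode="union" keeps all sites and fills absent calls with MISSING.
    """
    samples_a = sorted({s for calls in study_a.values() for s in calls})
    samples_b = sorted({s for calls in study_b.values() for s in calls})
    if mode == "intersect":
        sites = study_a.keys() & study_b.keys()
    elif mode == "union":
        sites = study_a.keys() | study_b.keys()
    else:
        raise ValueError(f"unknown mode: {mode}")

    merged = {}
    for site in sorted(sites):
        row = {}
        for sample in samples_a:
            row[sample] = study_a.get(site, {}).get(sample, MISSING)
        for sample in samples_b:
            row[sample] = study_b.get(site, {}).get(sample, MISSING)
        merged[site] = row
    return merged


# Hypothetical studies, each filtered to its own polymorphic sites.
study_a = {"rs1": {"A1": 0}, "rs2": {"A1": 2}}
study_b = {"rs2": {"B1": 1}, "rs3": {"B1": 0}}

print(merge_genotypes(study_a, study_b, mode="intersect"))
print(merge_genotypes(study_a, study_b, mode="union"))
```

      Neither option is satisfying in this toy case: the intersection discards sites, while the union records a missing call where a sample may in fact have been homozygous for the reference allele and was simply filtered out as non-polymorphic.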

      In principle, it seems the framework could be species-agnostic and thus be useful more generally beyond humans (perhaps it would be enough to add just one more "species" metadata field?). It is of course up to the authors to decide how broadly they want to cater.

    3. Reviewer #2 (Public Review):

      Summary:

      Schmid et al. provide details of their new data management tool Poseidon which is intended to standardise archaeogenetic genotype data and combine it with the associated standardised metadata, including bibliographic references, in a way that conforms to FAIR principles. Poseidon also includes tools to perform standard analyses of genotype files, and the authors pitch it as the potential first port of call for researchers who are planning on using archaeogenetic data in their research. In fact, Poseidon is already up and running and being used by researchers working in ancient human population genetics. To some extent, it is already on its way to becoming a fundamental resource.

      Strengths:

      A similar ancient genomics resource (the Allen Ancient DNA Resource, AADR) exists, but Poseidon is several steps ahead in terms of integration and standardisation of metadata, its intrinsic analytical tools, its flexibility, and its ambitions towards being independent and entirely community-driven. It is clear that a lot of thought has gone into each aspect of what is a large and dynamic package of tools and overall it is systematic and well thought through.

      Weaknesses:

      The main weakness of the plans for Poseidon, which admirably the authors openly acknowledge, is in how to guarantee it is maintained and updated over the long term while also shifting to a fully independent model. The software is currently hosted by the MPI, although the authors do set out plans to move it to a more independent venue. However, the core team comprising the authors is funded by the MPI, and so the MPI is also the main funder of Poseidon. The authors do state their ambition to move towards a community-driven independent model, but the details of how this would happen are a bit vague. The authors imagine that authors of archaeogenetic papers would upload data themselves, thereby making all authors of archaeogenetics papers the voluntary community who would take on the responsibility of maintaining Poseidon. Archaeogeneticists generally are committed enough to their field that there is a good chance such a model would work but it feels haphazard to rely on goodwill alone. Given there needs to be a core team involved in maintaining Poseidon beyond just updating the database, from the paper as it stands it is difficult to see how Poseidon might be weaned off MPI funding/primary involvement and what the alternative is. However, the same anxieties always surround these sorts of resources when they are first introduced. The main aim of the paper is to introduce and explain the resource rather than make explicit plans for its future and so this is a minor weakness of the paper overall.

    4. Author response:

      We thank the editors and reviewers for their thorough engagement with the manuscript and their well-informed comments on the Poseidon framework. We are pleased to note that they consider Poseidon a promising and timely attempt to resolve important issues in the archaeogenetics community. We also agree with the main challenges they raise, specifically the lack of long-term, independent infrastructure funding at the time of writing, and various aspects of Poseidon that bear the potential to further consolidate a de-facto alienation of the aDNA community from the wider field of genomics.

      Poseidon is indeed dependent on the Department of Archaeogenetics at MPI-EVA. For the short to middle-term future (3-5 years) we consider this dependency beneficial, providing a reliable anchor point and direct integration with one of the most proficient data-producing institutions in archaeogenetics. For the long term, as stated in the discussion section of the manuscript, we hope for a snowball effect in the dissemination and adoption of Poseidon to establish it as a valuable community resource that automatically attracts working time and infrastructure donations. To kickstart this process we have already intensified our active community outreach and teach Poseidon explicitly to (early career) practitioners in the field. We are aware of options to apply for independent infrastructure funding, for example through the German National Research Data Infrastructure (NFDI) initiative, and we plan to explore them further.

      As the reviewers have noted, key decisions in Poseidon’s data storage mechanism have been influenced by the special path archaeogenetics has taken compared to other areas of genomics. The founding goal of the framework was to integrate immediately with established workflows in the field. Nevertheless, we appreciate the concrete suggestions on how to connect Poseidon better with the good practices that emerged elsewhere. We will explicitly address the European Variation Archive in a revised version of the manuscript, consider embedding the BioSamples ID of the INSDC databases more prominently in the .janno file, prioritise support for VCF alongside EIGENSTRAT and PLINK, and add an option to clearly document the relevant human reference genome on a per-sample level. In the revised version of the text we will also explain the treatment of non-overlapping SNPs between studies by trident’s forge algorithm and how we imagine the interplay of different call sets in the Poseidon framework in general.

      Beyond these bigger concerns we will also consider and answer the various more detailed recommendations kindly shared by the reviewers, not least the question of how we imagine Poseidon to be used by archaeologists and for archaeological data.

    1. eLife assessment

      The study presents valuable findings on the role of RIPK1 in maintaining liver homeostasis under metabolic stress. Strengths include the intriguing findings that RIPK1 deficiency sensitizes the liver to acute liver injury and apoptosis, but because the conclusions require additional experimental support, the evidence is incomplete.

    2. Reviewer #1 (Public Review):

      This study presents an investigation into the physiological functions of RIPK1 within the context of liver physiology, particularly during short-term fasting. Through the use of hepatocyte-specific Ripk1-deficient mice (Ripk1Δhep), the authors embarked on an examination of the consequences of Ripk1 deficiency in hepatocytes under fasting conditions. They discovered that the absence of RIPK1 sensitized the liver to acute injury and hepatocyte apoptosis during fasting, a finding of significant interest given the crucial role of the liver in metabolic adaptation. Employing a combination of transcriptomic profiling and single-cell RNA sequencing techniques, the authors uncovered intricate molecular mechanisms underlying the exacerbated proinflammatory response observed in Ripk1Δhep mice during fasting. While the investigation offers valuable insights into the consequences of Ripk1 deficiency in hepatocytes during fasting conditions, there appears to be a primarily descriptive nature to the study with a lack of clear connection between the experiments. Thus, a stronger focus is warranted, particularly on understanding the dialogue between hepatocytes and macrophages. Moreover, the data would benefit from reinforcement through additional experiments such as Western blotting, flow cytometry, and rescue experiments, which would offer a more quantitative aspect to the findings. By incorporating these enhancements, the study could achieve a more comprehensive understanding of the underlying mechanisms and ultimately strengthen the overall impact of the research.

      Detailed major concerns:

      Related to Figure 1.

      It is imperative to ensure consistency in the number of animals analyzed across the different graphs. The current resolution of the images appears to be low, resulting in unsharp visuals that hinder the interpretation of data beyond the presence of "white dots". To address this issue, it is recommended to enhance the resolution of the images and consider incorporating zoom-in features to facilitate a clearer visualization of the observed differences. Moreover, it would be beneficial to include a complete WB analysis for the cell death pathways analyzed. These adjustments will significantly improve the clarity and interpretability of Figure 1.

      Related to Figure 2.

      It is essential to ensure consistency in the number of animals analyzed across the different graphs, as indicated by n=6 in the figure legend (similar to Figure 1). Additionally, it is crucial to distinguish between male and female subjects in the dot plots to assess any potential gender-based differences, which should be consistent throughout the paper. To achieve this, the dot plots should be harmonized to clearly differentiate between males and females and investigate if there are any disparities between the genders. Moreover, it is imperative to correlate hepatic inflammation with the activation of Kupffer cells, infiltrating monocytes, and/or hepatic stellate cells (HSCs). Therefore, conducting flow cytometry would be instrumental in achieving this correlation. Additionally, the staining for Ki67 appears to be non-specific, showing a granular pattern reminiscent of bile crystals rather than the expected nuclear staining of hepatocytes or immune cells. It is crucial to ensure specific staining for Ki67, and conducting in vitro experiments on primary hepatocytes could further elucidate the proliferation process. These experiments are relatively straightforward to implement and would provide valuable insights into the mechanisms underlying hepatic inflammation and proliferation.

      Related to Figure 3 & related to Figure 4.

      The immunofluorescence data presented are not entirely convincing and are insufficient to conclusively demonstrate the recruitment of monocytes. Previous suggestions for flow cytometry studies remain pertinent and are indeed necessary to bolster the robustness of the data and conclusions. Conducting flow cytometry analyses would provide more accurate and quantitative assessments of monocyte recruitment, ensuring the reliability of the findings and strengthening the overall conclusions of the study. Regarding the single-cell RNA sequencing analysis presented in the manuscript, it's worth questioning its relevance and depth of information provided. While it successfully identifies a quantitative difference in the cellular composition of the liver between control and knockout mice, it may fall short in elucidating the intricate interactions between different cell populations, which are crucial for understanding the underlying mechanisms of hepatic inflammation. Therefore, I propose considering alternative bioinformatic analyses, such as CellPhoneDB or CellChat, which could potentially provide a more comprehensive understanding of the cellular dynamics and interactions within the liver microenvironment. By examining the dialogue between different cell clusters, these analyses could offer deeper insights into the functional consequences of Ripk1 deficiency in hepatocytes and its impact on hepatic inflammation during fasting.

      Related to Figure 5.

      What additional insights do the data from Figure 5 provide compared to the study published in Nat Comms, which demonstrated that RIPK1 regulates starvation resistance by modulating aspartate catabolism (PMID: 34686667)?

      Related to Figure 6.

      The data presented in Figure 6 are complementary and do not introduce new mechanistic insights.

      Related to Figure 7.

      The data from Figure 7 suggest that RIPK1 in hepatocytes is responsible for the observed damage. However, it has been previously demonstrated that inhibition of RIPK1 activity in macrophages protects against the development of MASLD (PMID: 33208891). One possible explanation for these findings could be that the overreaction of macrophages to fasting, coupled with the absence of RIPK1 in hepatocytes (an indirect effect), contributes to the observed damage. Considering this, complementing hepatocytes with a kinase-dead version of RIPK1 could be a valuable approach to further refine the molecular aspect of the study. This would allow for a more precise investigation into the specific role of RIPK1's scaffolding or kinase function in response to starvation in hepatocytes. Such experiments could provide additional insights into the mechanisms underlying the observed effects and help delineate the contributions of RIPK1 in different cell types to metabolic stress responses.

    3. Reviewer #2 (Public Review):

      Summary:

      Zhang et al. analyzed the functional role of hepatocyte RIPK1 during metabolic stress, particularly its scaffold function rather than kinase function. They show that Ripk1 knockout sensitizes the liver to cell death and inflammation in response to short-term fasting, a condition that would not induce obvious abnormality in wild-type mice.

      Strengths:

      The findings are based on a knockout mouse model and supported by bulk RNA-seq and scRNA-seq. The work consolidates the complex role of RIPK1 in metabolic stress.

      Weaknesses:

      However, the findings are not novel enough because the pro-survival role of RIPK1 scaffold is well-established and several similar pieces of research already exist. Moreover, the mechanism is not very clear and needs additional experiments.

    4. Author response:

      We wish to express our sincere acknowledgement to the reviewers and the editors for the time and the effort spent in reviewing our manuscript. We highly appreciate the positive feedback and the thorough and constructive comments.

      We plan to conduct additional experiments to address the reviewers’ concerns.

      (1) We plan to utilize the RIPK1 kinase dead mice to investigate the role of RIPK1 kinase activity in these metabolic stress responses.

      (2) We plan to conduct flow cytometry analysis to detect the percentage or number of different cell types in fasted liver tissue, to provide more accurate and quantitative assessments of monocyte recruitment.

      (3) We plan to conduct more western blotting to detect the expression of related molecules in the signal transduction pathway, to further clarify the underlying mechanisms.

      (4) Regarding the single-cell RNA sequencing analysis, we plan to conduct CellChat analysis to provide information about the interactions between different cell populations.

      (5) We will fix the issues regarding the data graphs and image resolutions.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      This study is very well framed and the writing is very clear. The manuscript is well organized and easy to follow, and overall the previous state of the art of the field is taken into account. I only have a couple of minor comments.

      (1) There is a preprint that uses single nuclei RNA-Seq and ST on human MS subcortical white matter lesions doi: https://doi.org/10.1101/2022.11.03.514906. This work needs to be included in the discussion of the results. 

      (1.1) We appreciate the reviewer bringing up this important preprint, and we have referenced it in the Discussion section of our updated manuscript. 

      (2) The discussion should include the overall limitations of the study and how much it can be translated to human MS. Specifically, the current work uses EAE and therefore different disease stages are not captured in this study. This point is also raised by other reviewers. 

      (1.2) We thank the reviewer for raising this important point, and we have included additional discussion about the limitations of EAE and its disease relevance to MS.

      Reviewer #2 (Recommendations For The Authors):

      The authors state that this EAE model is better for studying cortical gradients because previous models "such as directly injecting inflammatory cytokines into the meninges/cortex" cause a traumatic injury. It needs to be discussed that these models have now been superseded by more refined models involving long-term overexpression of pro-inflammatory cytokines in the sub-arachnoid space, thereby avoiding traumatic injury. The current results should be discussed in light of these newer models (James et al, 2020; 2022), which are more similar to MS cortical pathology and do exhibit lymphoid-like structures. 

      (2.1) We thank the reviewer for pointing out these relevant studies, and we agree they describe non-traumatic and more MS-relevant models of leptomeningeal inflammation. We have included discussion of these works in the updated manuscript.  

      • The study will be substantially improved if some of the ST data is validated at least partially with some RNAscope or other in situ hybridization using a subset of probes that capture the take-home message of the paper. 

      (2.2) We agree with the reviewer that validation of transcriptomics results is important to support our conclusions. In the updated manuscript Figure 5 and Supplemental Figure 6 we have added RNAscope results for relevant genes. In agreement with the trends noted in the manuscript, expression of genes related to antigen processing and presentation such as B2m decreases gradually with distance from LMI. We also have included a reference to a newly published manuscript from our group (Gupta et al., 2023, J. Neuroinflammation) that characterizes meningeal inflammation and sub-pial changes in the SJL EAE model. In that manuscript, IHC is used to show accumulation of B cells and T cells in the leptomeningeal space, increased microglial and astrocyte reactivity adjacent to leptomeningeal inflammation, and reduction of neuronal markers adjacent to leptomeningeal inflammation.  

      • The lack of change in signaling pathways involved in B-cell/T-cell interaction and cytokine/chemokine signaling, which would be expected in areas of immune cell aggregation in the meninges, needs discussion. 

      (2.3) While we detected significant upregulation in antigen presentation, complement activation, and humoral immune signaling, areas of meningeal inflammation identified as cluster 11 showed upregulation of numerous other GO gene sets associated with immune cell interaction and cytokine signaling, as described in Supplementary Table 3. These include T-cell receptor binding, CCR chemokine receptor binding, interleukin 8 production, response to interleukin 1, positive regulation of interleukin-6 production, tumor necrosis factor production, and leukocyte cell-cell adhesion. Overall, we believe that the collection of enriched gene sets is consistent with peripheral myeloid and lymphoid infiltration and cytokine production, with the most prominent cytokine / pathways being interferon γ/antigen processing and presentation, complement, and humoral inflammation.

      • Fig 4 subclusters includes T-cell activation, pos regulation of neuronal death, cellular response to IFNg, neg regulation of neuronal projections, Ig mediated immune response, cell killing, pos regulation of programmed cell death, pos regulation of apoptotic process, but none of these are discussed despite their obvious importance. 

      (2.4) We agree with the reviewer that these upregulated genesets warrant additional discussion and have added additional reference to these genesets in the results section. Also, the genesets ‘positive regulation of programmed cell death’, ‘positive regulation of apoptotic process’, and ‘positive regulation of cell death’ were erroneously included in Figure 4F in the initial manuscript, as they are actually downregulated in cluster 1_4. This has been clarified in the text.

      • Subcluster 11 appears spatially to represent the meninges, but what pathways are expressed there? 330 genes/pathways altered independent of other clusters - immune cell regulation? 

      (2.5) We refer the reviewer to Supplementary Table 3, which contains a complete list of GO genesets enriched within cluster 11 spots.

      • The surprising lack of immunoglobulin genes upregulated in the meninges of the mice, considering these are the genes most upregulated in the MS meninges. Should be pointed out and discussed. 

      (2.6) We appreciate the reviewer bringing up immunoglobulin genes, which previous publications have shown are elevated in MS meninges and cortical grey matter lesions. Consistent with this, several immunoglobulin genes are elevated in cluster 11, including genes encoding IgG2b, IgA, and IgM. While these results were available within the original submission in Supplementary Table 2, we have included the graph in the updated Supplementary Figure 3.

      • Meningeal signature may be poorly represented given the individual slices shown in suppl 3A, which suggests that only 3 of the EAE slices had significant meningeal infiltrates, indicated by cluster 11 genes.  

      (2.7) There was heterogeneity in the location and extent of meningeal infiltrate / cluster 11 in the EAE slices, as the reviewer points out. 2 slices had severe inflammation, 2 had moderate inflammation, and 2 had relatively mild inflammation, but all EAE slices were enriched in inflammation relative to naïve as demonstrated not only through clustering, but also through enriched marker analysis between EAE and Naive and Progeny analysis.  

      • The ST is not resolving the meningeal tissue and the immediate underlying grey matter, as demonstrated by a high signal for both CXCL13 and GFAP in cluster 11. 

      (2.8) We agree that the spatial transcriptomics strategy applied here is inadequate to precisely delineate between meningeal inflammation and the underlying brain parenchyma, and that the elevation of markers such as GFAP in cluster 11 indicates some ‘contamination’ of parenchymal cells into cluster 11. We have clarified this in the text and discussed the limitation of the spatial transcriptomics method used.  

      • More information is required concerning how many animals were used in this study, to meet the requirements for complying with the 3Rs. 

      (2.9) A total of 4 mice were used per group. In the naïve group one mouse contributed two slices, for a total of 5 naïve slices. In the EAE group two mice contributed two slices, for a total of 6 EAE slices. We have clarified this in the methods section of the updated manuscript.

      Reviewer #3 (Recommendations For The Authors):

      The authors should provide a more thorough description of the methodology, and there are a few minor concerns about experimental details, data presentation, and description that need to be addressed. In the next few lines, I will highlight a few important aspects that need to be addressed, propose some changes to the main manuscript, and suggest some additional experiments that, if successful, could confirm/support/further strengthen the conclusions that are at this point purely based on transcriptomic data. 

      Major comments/suggestions: 

      • The main gene expression changes between the control and EAE groups obtained via spatial transcriptomics need to be validated with another technique, at least partially. I suggest performing RNAscope or immunofluorescence imaging using brain sections from a new and independent cohort of animals, where cell-specific markers can also be tested. This type of assessment would work as a validation method and could also inform about the cell-specific contribution to the observed transcriptomic changes. 

      (3.1) Please refer to response 2.2 

      • The representative qualitative spatial expression heatmaps for each gene in Fig. 1F should be accompanied by corresponding graphs with quantitative measurements. Similar to what is done regarding the data in Fig. 2B and D. 

      (3.2) We agree with the reviewer that quantitative graphs were missing, and we have included them in the updated Supplementary Figure 1. 

      • A supplementary table discriminating all the DEGs (132 up and 70 downregulated) between cluster 11 and the other clusters has to be provided. What is the contribution of recruited encephalitogenic adaptive immune cells to this cluster 11 gene signature? 

      (3.3) These unfiltered results are provided in Supplementary Table 2, and to view the up and down regulated genes the reader can sort the table based on fold change and adjusted P value. We believe providing the complete table is more useful to the reader, since the fold change and P value thresholds used to determine “significance” are arbitrary. Since the spatial transcriptomics method used in this work does not have single cell resolution, we cannot accurately estimate the contribution of encephalitogenic adaptive immune cells in cluster 11. However, given previously published work of lymphocyte infiltration into the subarachnoid space in SJL EAE (Gupta et al., 2023, J. Neuroinflammation) and the enrichment of Cd3e in cluster 11 (Log2FC 0.31, adjusted P-val 0.005) we assume some contribution of peripheral lymphocytes.
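      The sorting and thresholding described in response 3.3 can be sketched as follows. This is a minimal illustration, assuming the table is exported as rows with a gene name, a log2 fold change, and an adjusted P value. The field names, the thresholds, and every row other than Cd3e (whose values are quoted in the response) are hypothetical placeholders, not entries from Supplementary Table 2.

```python
# Each row: gene symbol, log2 fold change, adjusted P value.
# Only the Cd3e values come from the text; the rest are placeholders.
rows = [
    {"gene": "Cd3e", "log2fc": 0.31, "padj": 0.005},
    {"gene": "GeneA", "log2fc": 1.20, "padj": 0.001},
    {"gene": "GeneB", "log2fc": -0.80, "padj": 0.020},
    {"gene": "GeneC", "log2fc": 0.10, "padj": 0.600},
]


def split_degs(rows, min_abs_log2fc=0.25, max_padj=0.05):
    """Split rows into up- and downregulated lists under chosen thresholds.

    The thresholds are arbitrary user choices, which is why an
    unfiltered table lets each reader apply their own.
    """
    passing = [
        r for r in rows
        if r["padj"] <= max_padj and abs(r["log2fc"]) >= min_abs_log2fc
    ]
    # Strongest upregulation first, strongest downregulation first.
    up = sorted((r for r in passing if r["log2fc"] > 0),
                key=lambda r: r["log2fc"], reverse=True)
    down = sorted((r for r in passing if r["log2fc"] < 0),
                  key=lambda r: r["log2fc"])
    return up, down


up, down = split_degs(rows)
up_genes = [r["gene"] for r in up]
down_genes = [r["gene"] for r in down]

# A stricter fold-change cutoff drops Cd3e from the upregulated list.
strict_up, _ = split_degs(rows, min_abs_log2fc=0.5)
strict_up_genes = [r["gene"] for r in strict_up]
```

      Changing the cutoff changes which genes count as differentially expressed, which illustrates why the size of a DEG list depends on the thresholds chosen.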

      • The authors mention that there is grey matter pathology in this relapse model, and this has been shown in a previous publication (Bhargava et al., 2021). However, the regions analyzed in the present study are different from the ones shown in the referenced paper. Is there an overexpression of genes involved in, or gene modules indicative of, neuronal stress and/or death that spatially overlap with clusters 1 and 2? If so, it would be important to provide information about those gene modules in the main figures. It would also be quite relevant to show the levels of cell stress/death proteins and of axonal stress/damage, by APP and/or nonphosphorylated SMI-32 staining, in the deep brain regions (like the thalamus), to corroborate the link between these phenomena and the gene signatures of subclusters 1_3, 1_4, and 2_6. 

      (3.4) We thank the reviewer for this insightful comment. We have recently published a manuscript that histologically analyzes leptomeningeal inflammation in the SJL EAE model, specifically assessing the areas looked at in our submitted manuscript (Gupta et al., 2023, J. Neuroinflammation). In that manuscript, IHC is used to show accumulation of B cells and T cells in the leptomeningeal space, increased microglial and astrocyte reactivity adjacent to leptomeningeal inflammation, and reduction of neuronal markers adjacent to leptomeningeal inflammation. To further describe the gene modules in the inflammatory subclusters 1_3/1_4/2_6, we have now provided heatmaps of the selected genesets and their constituent genes (Supplementary Figure 5). 

      • It would be important to provide heatmaps discriminating the DEGs that make up the gene modules that are significantly altered in subclusters 1_3, 1_4, and 2_6. The gene ontology terms are sometimes ambiguous. For instance, it would be very informative to the reader (and to the field) to know which altered genes compose the "lysosome", "immune response", "response to stress", or "B cell mediated immunity" pathways that are altered in the EAE subcluster 1_3 (Fig. 4E). The same applies to the gene modules altered in the other subclusters of interest. Authors should also consider generating a Venn diagram with the DEGs from subclusters 1_3, 1_4, and 2_6, to complement the GO term Venn presented in Fig. 4H. Having these pieces of information readily available, either as main or supplementary figures, would be a great addition. 

      (3.5) We agree with the reviewer on this point and have included these heatmaps in Supplementary Figure 5. 

      • The role of IFN-gamma as well as B cells (and Igs) in myelination/remyelination is mentioned in the discussion. However, there is very little evidence that these cells or their cytokines/Igs are mediating the described transcriptomic signatures at the level of the brain parenchyma of EAE mice undergoing relapse. Do the "antigen processing and presentation, cell killing, interleukin 6 production, and interferon gamma response" GO terms, which better fitted the trajectory analysis, in fact include genes expressed almost exclusively by T and/or B cells? Are there genes that are downstream of IFN type I or II signaling? 

      (3.6) Pathways including antigen processing / presentation, humoral inflammation, complement, among others were enriched in areas of meningeal inflammation and adjacent areas of parenchyma. These signaling pathways are mediated by effector molecules, many of which are produced by lymphocytes, but that can act on cells within the CNS parenchyma. The heatmaps in Supplementary Figure 5 demonstrate the significant role of MHC and complement genes, which could be expressed by leukocytes as well as glia, on many of the pathways.

      • Is the transcriptomic overlap between meningeal and brain parenchymal regions, or the appearance of signatures similar to the parenchymal subclusters 1_3, 1_4, and 2_6, prevented if the mice are treated with the murine versions of natalizumab or rituximab prior to relapse? 

      (3.6) We appreciate the reviewer's suggestion. Our future directions for this work include testing the effects of disease modifying therapies on spatial and single-cell transcriptomic readouts of disease in SJL EAE.

      • Please clarify what control group was used in this study. Naïve mice are mentioned in the Results section, does this mean that control animals were not injected with CFA? Authors should also elaborate on the descriptive methodology employed for the analysis of the spatial transcriptomics data - especially regarding the trajectory analysis. As is, overall, the methodology description might not favor reproducibility. 

      (3.7) We appreciate the need for clarification here. Our control group in this study was naïve, not having received any CFA or pertussis toxin. While often used as the control in EAE studies focused on mechanisms of autoimmunity, CFA and pertussis toxin independently induce systemic inflammation. Since in this study we were interested in neuroinflammation broadly, we chose to use a naïve comparison group to maximize our ability to find genes enriched in neuroinflammation. We have elaborated our methods section, including methods related to trajectory analysis. 

      Minor comments/suggestions: 

      In Fig. 1D the indication of the rostral to ventral axis needs to be inverted. 

      Addressed.

      In Fig. 1E the authors should also include a representative H&E staining of the same region in a control animal. 

      Addressed.

      There is inconsistency in the number of clusters obtained after UMAP unbiased clustering of the spatial transcriptomic data: 

      • Fig. 3A-E - twelve clusters are shown (cluster 0 to 11). 

      • In the Results section eleven clusters are mentioned - "we performed unbiased UMAP clustering on the spatial transcriptomic dataset and identified 11 distinct clusters".

      The text was incorrect, there were 12 distinct clusters. This has been corrected.

      Considering the mouse strain used was SJL/J, the peptide used to induce EAE should be PLP139-151, as mentioned in the Methods section "Induction of SJL EAE". However, the legend of Fig. 1 mentions "post immunization with MOG 35-55". Please correct this. 

      Corrected.

      In the Methods section it is mentioned "At 12 weeks post-immunization, animals were euthanized", however the Results section mentions that tissues were harvested at 11 weeks post-immunization - "Brain slices were collected from four naïve mice and four EAE mice 11 weeks postimmunization". Please correct this. 

      The Methods were incorrect, this has now been fixed. 

      Please clarify the number of animals used for spatial transcriptomic analysis: 

      • Legend of Fig. 1 mentions "Red arrows indicate MRI time points, black arrow indicates time of tissue harvesting (N = 6)." Whilst in the Results section it states "Brain slices were collected from four naïve mice and four EAE mice". 

      The Figure 1 legend has now been corrected (N = 4). Additionally, we have added clarification about the number of animals / slices used in the Methods section (see response 2.9).

      Please be consistent in the way of representing DEGs in the MA plots: 

      • Fig. 3F shows the upregulated genes (in red) on the right and the downregulated genes (in blue) on the left. 

      • Supplemental Fig. 2K shows the upregulated genes (in red) on the left and the downregulated genes (in blue) on the right. 

      • Supplemental Fig. 4 shows the upregulated genes on the right in blue, while the downregulated genes are in red. 

      This has been fixed.

      The letters attributed to each subcluster in panels E-G of Fig. 4 are different from the respective figure legend. 

      This has been fixed.

      Correct the legend of supplemental figure 2: "(G-H) Representative spatial feature plots of read count (F) and UMI (G) demonstrate expected anatomic variability in transcript amount and diversity.". 

      This has been fixed.

      In Supplemental Fig. 4G there is probably an error with the XX axis, since the significantly up and down-regulated genes are not visible. 

      This has been fixed.

    2. eLife assessment

      Brain inflammation is a hallmark of multiple sclerosis. Using novel spatial transcriptomics methods, the authors provide solid evidence for a gradient of immune genes and inflammatory markers from the meninges toward the adjacent brain parenchyma in a mouse model. This important study advances our understanding of the mechanisms of brain damage in this autoimmune disease. However, the control mouse groups are not well designed to rule out confounding effects, a limitation that needs to be acknowledged and addressed.

    3. Reviewer 1 (Public Review):

      Multiple sclerosis (MS) is a debilitating autoimmune disease that causes loss of myelin in neurons of the central nervous system. MS is characterized by the presence of inflammatory immune cells in several brain regions as well as the brain barriers (meninges). This study aims to understand the local immune hallmarks in regions of the brain parenchyma that are adjacent to the leptomeninges in a mouse model of MS. The leptomeninges are known to be a focus of inflammation in MS and perhaps "bleed" inflammatory cells and molecules to adjacent brain parenchyma regions. To do so, they use novel technology called spatial transcriptomics so that the spatial relationships between the two regions remain intact. The study identifies canonical inflammatory genes and gene sets such as complement and B cells enriched in the parenchyma in close proximity to the leptomeninges in the mouse model of MS but not control. The manuscript is very well written and easy to follow. The results will become a useful resource to others working in the field and can be followed by time series experiments where the same technology can be applied to the different stages of the disease.

      Comments on revised version:

      I agree that the authors successfully addressed most of my comments/critiques.

      However, the fact that the control mice were not injected with CFA is somewhat concerning, because it will be hard to interpret the cause of the transcriptomic readouts described in this study. Some of the described effects might be due to CFA (which was used in the EAE but not the "naive" group), and not necessarily to the relapsing-remitting EAE immune features recapitulated in this mouse model. Moreover, this caveat associated with the "naive" control group is not being clearly stated throughout the manuscript and might go unnoticed to readers.

      The authors should clearly state, in the methods section (in the section "Induction of SJL EAE"), that the naive control group was not injected with CFA.

      Additionally, this potential confounder, of not using a control group injected with the same CFA regimen of the EAE group, should be mentioned in paragraph two of the discussion alongside the other limitations of the study already highlighted by the authors (or in another section of the discussion).

    4. Reviewer 2 (Public Review):

      Accumulating data suggests that the presence of immune cell infiltrates in the meninges of the multiple sclerosis brain contributes to the tissue damage in the underlying cortical grey matter by the release of inflammatory and cytotoxic factors that diffuse into the brain parenchyma. However, little is known about the identity and direct and indirect effects of these mediators at a molecular level. This study addresses the vital link between an adaptive immune response in the CSF space and the molecular mechanisms of tissue damage that drive clinical progression. In this short report the authors use a spatial transcriptomics approach using Visium Gene Expression technology from 10x Genomics, to identify gene expression signatures in the meninges and the underlying brain parenchyma, and their interrelationship, in the PLP-induced EAE model of MS in the SJL mouse. MRI imaging using a high field strength (11.7T) scanner was used to identify areas of meningeal infiltration for further study. They report, as might be expected, the upregulation of genes associated with the complement cascade, immune cell infiltration, antigen presentation, and astrocyte activation. Pathway analysis revealed the presence of TNF, JAK-STAT and NFkB signaling, amongst others, close to sites of meningeal inflammation in the EAE animals, although the spatial resolution is insufficient to indicate whether this is in the meninges, grey matter, or both.

      UMAP clustering illuminated a major distinct cluster of upregulated genes in the meninges and smaller clusters associated with the grey matter parenchyma underlying the infiltrates. The meningeal cluster contained genes associated with immune cell functions and interactions, cytokine production, and action. The parenchymal clusters included genes and pathways related to glial activation, but also adaptive/B-cell mediated immunity and antigen presentation. This again suggests a technical inability to resolve fully between the compartments as immune cells do not penetrate the pial surface in this model or in MS. Finally, a trajectory analysis based on distance from the meningeal gene cluster successfully demonstrated descending and ascending gradients of gene expression, in particular a decline in pathway enrichment for immune processes with distance from the meninges.

      Comments on revised version:

      The authors have addressed all of my comments regarding the lack of spatial resolution between the grey matter and the overlying meninges and also concerning the difficulties in extrapolating from this mouse model to MS itself.

      I am however very concerned about the lack of the correct control group. Immunization of rodents with complete Freund's adjuvant (albeit with pertussis toxin) gives rise to widespread microglial activation, some immune cell infiltration and also structural changes to axons, particularly at nodes of Ranvier (https://doi.org/10.1097/NEN.0b013e3181f3a5b1). This will inevitably make it difficult to interpret the transcriptomics results, depending on whether these changes are reversible or not and the time frame of the reversal. In the C57BL/6 EAE models adjuvant-induced microglial activation becomes chronic, whereas the axonal changes do reverse by 10 weeks. Whether this is the same in the SJL EAE model using CFA alone is not clear.

    1. eLife assessment

      This study provides important insight into the mechanisms of proton-coupled oligopeptide transporters. It uses enhanced-sampling molecular dynamics (MD), backed by cell-based assays, revealing the importance of protonation of selected residues for PepT2 function. The simulation approaches are convincing, using long MD simulations, constant-pH MD and free energy calculations. Overall, the work has led to findings that will appeal to structural biologists, biochemists, and biophysicists studying membrane transporters.

    2. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study provides valuable information on the mechanism of PepT2 through enhanced-sampling molecular dynamics, backed by cell-based assays, highlighting the importance of protonation of selected residues for the function of a proton-coupled oligopeptide transporter (hsPepT2). The molecular dynamics approaches are convincing, but with limitations that could be addressed in the manuscript, including lack of incorporation of a protonation coordinate in the free energy landscape, possibility of protonation of the substrate, errors with the chosen constant pH MD method for membrane proteins, dismissal of hysteresis emerging from the MEMENTO method, and the likelihood of other residues being affected by peptide binding. Some changes to the presentation could be considered, including a better description of pKa calculations and the inclusion of error bars in all PMFs. Overall, the findings will appeal to structural biologists, biochemists, and biophysicists studying membrane transporters.

      We would like to express our gratitude to the reviewers for providing their feedback on our manuscript, and also for recognising the variety of computational methods employed, the amount of sampling collected and the experimental validation undertaken. Following the individual reviewer comments, as addressed point-by-point below, we have prepared a revised manuscript, but before that we address some of the comments made above in the general assessment:

      • “lack of incorporation of a protonation coordinate in the free energy landscape”.

      We acknowledge that of course it would be highly desirable to treat protonation state changes explicitly and fully coupled to conformational changes. However, at this point in time, evaluating such a free energy landscape is not computationally feasible (especially considering that the non-reactive approach taken here already amounts to almost 1ms of total sampling time).  Previous reports in the literature tend to focus on either simpler systems or a reduced subset of a larger problem.  As we were trying to obtain information on the whole transport cycle, we decided to focus here on non-reactive methods.

      • “possibility of protonation of the substrate”.

      The reviewers are correct in pointing out this possibility, which we had not discussed explicitly in our manuscript.  Briefly, while we describe a mechanism in which protonation of only protein residues (with an unprotonated ligand) can account for driving all the necessary conformational changes of the transport cycle, there is some evidence for a further intermediate protonation site in our data (as we commented on in the first version of the manuscript as well), which may or may not be the substrate itself. A future explicit treatment of the proton movements through the transporter, when it will become computationally tractable to do so, will have to include the substrate as a possible protonation site; for the present moment, we have amended our discussion to alert the reader to the possibility that the substrate could be an intermediate to proton transport. This has repercussions for our study of the E56 pKa value, where – if protons reside with a significant population at the substrate C-terminus – our calculated shift in pKa upon substrate binding could be an overestimate, although we would qualitatively expect the direction of shift to be unaffected. However, we also anticipate that treating this potential coupling explicitly would make convergence of any CpHMD calculation impractical to achieve and thus it may be the case that for now only a semi-quantitative conclusion is all that can be obtained.

      • “errors with the chosen constant pH MD method for membrane proteins”.

      We acknowledge that – as reviewer #1 has reminded us – the AMBER implementation of hybrid-solvent CpHMD is not rigorous for membrane proteins, and as such added a cautionary note to our paper.  We also explain how the use of the ABFE thermodynamic cycle calculations helps to validate the CpHMD results in a completely orthogonal manner (we have promoted this validation, which was in the supplementary figures, into the main text in the revised version).   We therefore remain reasonably confident in the results presented with regards to the reported pKa shift of E56 upon substrate binding, and suggest that if the impact of neglecting the membrane in the implicit-solvent stage of CpHMD is significant, then there is likely an error cancellation when considering shifts induced by the incoming substrate.

      • “dismissal of hysteresis emerging from the MEMENTO method”.

      We have shown in our method design paper how the use of the MEMENTO method drastically reduces hysteresis compared to steered MD for path generation, and find this improvement again for PepT2 in this study. We address reviewer #3’s concern about our presentation on this point by revising our introduction of the MEMENTO method, as detailed in the response below.

      • “the likelihood of other residues being affected by peptide binding”.

      In this study, we have investigated in detail the involvement of several residues in proton-coupled di-peptide transport by PepT2. Short of the potential intermediate protonation site mentioned above, the set of residues we investigate form a minimal set of sorts within which the important driving forces of alternating access can be rationalised.  We have not investigated in substantial detail here the residues involved in holding the peptide in the binding site, as they are well studied in the literature and ligand promiscuity is not the problem of interest here. It remains entirely possible that further processes contribute to the mechanism of driving conformational changes by involving other residues not considered in this paper. We have now made our speculation that an ensemble of different processes may be contributing simultaneously more explicit in our revision, but do not believe any of our conclusions would be affected by this.

      As for the additional suggested changes in presentation, we provide the requested details on the CpHMD analysis. Furthermore, we use the convergence data presented separately in figures S12 and S16 to include error bars on our 1D-reprojections of the 2D-PMFs in figures 3, 4 and 5. (Note that we have opted to not do so in figures S10 and S15 which collate all 1D PMF reprojections for the OCC ↔ OF and OCC ↔ IF transitions in single reference plots, respectively, to avoid overcrowding those necessarily busy figures). We have also changed the colour schemes of these plots in our revision to improve accessibility. We have additionally taken the opportunity to fix some typos and further clarify some other statements throughout the manuscript, besides the requests from the reviewers.
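
      The 1D reprojection of a 2D PMF mentioned above can be illustrated with a minimal sketch. It is not the authors' analysis code: the grid layout, the temperature of 300 K, the function names, and the use of independent blocks or replicates to obtain per-bin error bars are all assumptions made for illustration.

```python
import math
import statistics

# Thermal energy in kJ/mol at an assumed temperature of 300 K.
KT = 8.314462618e-3 * 300.0


def reproject_pmf(pmf_2d, kT=KT):
    """Boltzmann-integrate a 2D PMF F(x, y) over y to obtain F(x).

    pmf_2d[i][j] is the free energy (kJ/mol) at x-bin i and y-bin j.
    The returned 1D profile is shifted so that its minimum is zero.
    """
    profile = []
    for row in pmf_2d:
        f_min = min(row)
        # Log-sum-exp trick: subtract the row minimum before exponentiating.
        weight = sum(math.exp(-(f - f_min) / kT) for f in row)
        profile.append(f_min - kT * math.log(weight))
    lowest = min(profile)
    return [f - lowest for f in profile]


def profile_with_error(pmf_blocks, kT=KT):
    """Per-bin mean and sample standard deviation over several 2D PMFs.

    pmf_blocks is a list of 2D PMFs on the same grid, for example from
    independent blocks of sampling (hypothetical input; needs >= 2 blocks).
    """
    profiles = [reproject_pmf(p, kT) for p in pmf_blocks]
    n_bins = len(profiles[0])
    means = [statistics.mean([p[i] for p in profiles]) for i in range(n_bins)]
    sds = [statistics.stdev([p[i] for p in profiles]) for i in range(n_bins)]
    return means, sds
```

      Because each profile is shifted to its own minimum, profiles from different protonation states can be overlaid to compare their shapes, but their relative vertical offsets carry no meaning.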

      Reviewer #1 (Public Review):

      The authors have performed all-atom MD simulations to study the working mechanism of hsPepT2. It is widely accepted that conformational transitions of proton-coupled oligopeptide transporters (POTs) are linked with gating hydrogen bonds and salt bridges involving protonatable residues, whose protonation triggers gate openings. Through unbiased MD simulations, the authors identified extra-cellular (H87 and D342) and intra-cellular (E53 and E622) triggers. The authors then validated these triggers using free energy calculations (FECs) and assessed the engagement of the substrate (Ala-Phe dipeptide). The linkage of substrate release with the protonation of the ExxER motif (E53 and E56) was confirmed using constant-pH molecular dynamics (CpHMD) simulations and cell-based transport assays. An alternating-access mechanism was proposed. The study was largely conducted properly, and the paper was well-organized. However, I have a couple of concerns for the authors to consider addressing.

      We would like to note here that it may be slightly misleading to the reader to state that “The linkage of substrate release with the protonation of the ExxER motif (E53 and E56) was confirmed using constant-pH molecular dynamics (CpHMD) simulations and cell-based transport assays.” The cell-based transport assays confirmed the importance of the extracellular gating trigger residues H87, S321 and D342 (as mentioned in the preceding sentence), not of the substrate-protonation link as this line might be understood to suggest.

      (1) As a proton-coupled membrane protein, the conformational dynamics of hsPepT2 are closely coupled to protonation events of gating residues. Instead of using semi-reactive methods like CpHMD or reactive methods such as reactive MD, where the coupling is accounted for, the authors opted for extensive non-reactive regular MD simulations to explore this coupling. Note that I am not criticizing the choice of methods, and I think those regular MD simulations were well-designed and conducted. But I do have two concerns.

      a) Ideally, proton-coupled conformational transitions should be modelled using a free energy landscape with two or more reaction coordinates (or CVs), with one describing the protonation event and the other describing the conformational transitions. The minimum free energy path then illustrates the reaction progress, such as OCC/H87D342-  →  OCC/H87HD342H →  OF/H87HD342H as displayed in Figure 3.

      We concur with the reviewer that the ideal way of describing the processes studied in our paper would be as higher-dimensional free energy landscapes obtained from a simulation method that can explicitly model proton-transfer processes. Indeed, it would have been particularly interesting and potentially informative with regards to the movement of protons down into the transporter in the OF → OCC → IF sequence of transitions. As we note in our discussion on the H87→E56 proton transfer: 

      “This could be investigated using reactive MD or QM/MM simulations (both approaches have been employed for other protonation steps of prokaryotic peptide transporters, see Parker et al. (2017) and Li et al. (2022)).  However, the putative path is very long (≈ 1.7 nm between H87 and E56) and may or may not involve a large number of intermediate protonatable residues, in addition to binding site water. While such an investigation is possible in principle, it is beyond the scope of the present study.” 

      Given that even sampling the proton transfer step itself in an essentially static protein conformation would be pushing the boundaries of what has been achieved in the field, we believe that, considering the current state of the art, a fully coupled investigation of large-scale conformational changes and proton-transfer reactions is not yet feasible in a realistic time frame. We also note this limitation already when we say that:

      “The question of whether proton binding happens in OCC or OF warrants further investigation, and indeed the co-existence of several mechanisms may be plausible here”. 

      Nonetheless, we are actively exploring approaches to treat uptake and movement of protons explicitly for future work.

      In our revision, we have expanded on our discussion of the reasoning behind employing a non-reactive approach and the limitations that imposes on what questions can be answered in this study.

      Without including the protonation as a CV, the authors tried to model the free energy changes from multiple FECs using different charge states of H87 and D342. This is a practical workaround, and the conclusion drawn (the OCC→ OF transition is downhill with protonated H87 and D342) seems valid. However, I don't think the OF states with different charge states (OF/H87D342-, OF/H87HD342-, OF/H87D342H, and OF/H87HD342H) are equally stable, as plotted in Figure 3b. The concern extends to other cases like Figures 4b, S7, S10, S12, S15, and S16. While it may be appropriate to match all four OF states in the free energy plot for comparison purposes, the authors should clarify this to ensure readers are not misled.

      The reviewer is correct in their assessment that the aligning of PMFs in these figures is arbitrary; no relative free energies of the PMFs to each other can be estimated without explicit free energy calculations at least of protonation events at the end state basins. The PMFs in our figures are merely superimposed for illustrating the differences in shape between the obtained profiles in each condition, as discussed in the text, and we now make this clear in the appropriate figure captions.

      b) Regarding the substrate impact, it appears that the authors assumed fixed protonation states. I am afraid this is not necessarily the case. Variations in PepT2 stoichiometry suggest that substrates likely participate in proton transport, like the Phe-Ala (2:1) and Phe-Gln (1:1) dipeptides mentioned in the introduction. And it is not rigorous to assume that the N- and C-termini of a peptide do not protonate/deprotonate when transported. I think the authors should explicitly state that the current work and the proposed mechanism (Figure 8) are based on the assumption that the substrates do not uptake/release proton(s).

      This is indeed an assumption inherent in the current work. While we do “speculate that the proton movement processes may happen as an ensemble of different mechanisms, and potentially occur contemporaneously with the conformational change”, we did not indicate explicitly in the previous version that this may involve the substrate. We make clear the assumption and this possibility in the revised version of our paper. Indeed, as we discuss, there is some evidence in our PMFs of an additional protonation site not considered thus far, which may or may not be the substrate. We now make note of this point in the revised manuscript.

      As for what information can be drawn from the given experimental stoichiometries, we note in our paper that “a 2:1 stoichiometry was reported for the neutral di-peptide D-Phe-L-Ala and 3:1 for anionic D-Phe-L-Glu. (Chen et al., 1999) Alternatively, Fei et al. (1999) have found 1:1 stoichiometries for either of D-Phe-L-Gln (neutral), D-Phe-L-Glu (anionic), and D-Phe-L-Lys (cationic).” 

      We do not assume that it is our place to arbitrate among the apparent discrepancies in the experimental data here, although we believe that our assumed 2:1 stoichiometry is additionally “motivated also by our computational results that indicate distinct and additive roles played by two protons in the conformational cycle mechanism”.

      (2) I have more serious concerns about the CpHMD employed in the study.

      a) The CpHMD in AMBER is not rigorous for membrane simulations. The underlying generalized Born model fails to consider the membrane environment when updating charge states. In other words, the CpHMD places a membrane protein in a water environment to judge if changes in charge states are energetically favorable. While this might not be a big issue for peripheral residues of membrane proteins, it is likely unphysical for internal residues like the ExxER motif. As I recall, the developers have never used the method to study membrane proteins themselves. The only CpHMD variant suitable for membrane proteins is the membrane-enabled hybrid-solvent CpHMD in CHARMM. While I do not expect the authors to redo their CpHMD simulations, I do hope the authors recognize the limitations of their method.

      We discuss the limitations of the AMBER CpHMD implementation in the revised version. However, despite that, we believe we have in fact provided sufficient grounds for our conclusion that substrate binding affects ExxER motif protonation in the following way.

      In addition to CpHMD simulations, we establish the same effect via ABFE calculations, where the substrate affinity is different at the E56 deprotonated vs protonated protein. This was figure S20 before, though in the revised version we have moved this piece of validation into a new panel of figure 6 in the main text, since it becomes more important with the CpHMD membrane problem in mind. Since the ABFE calculations are conducted with an all-atom representation of the lipids and the thermodynamic cycle closes well, it would appear that if the chosen CpHMD method has a systematic error of significant magnitude for this particular membrane protein system, there may be the benefit of error cancellation. While the calculated absolute pKa values may not be reliable, the difference made by substrate binding appears to be so, as judged by the orthogonal ABFE technique.
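
      The reason ABFE calculations can cross-check a CpHMD-derived pKa shift follows from closing the thermodynamic cycle between binding and protonation. The relation below is a generic sketch, not taken from the manuscript: the labels are illustrative, the cycle is assumed to involve only the protonation of E56 and the binding of an unprotonated substrate, and the numerical value assumes 300 K.

```latex
% Cycle: apo/deprot -> holo/deprot -> holo/prot  vs.  apo/deprot -> apo/prot -> holo/prot
\Delta G_{\mathrm{bind}}^{\mathrm{prot}} - \Delta G_{\mathrm{bind}}^{\mathrm{deprot}}
  = \Delta G_{\mathrm{prot}}^{\mathrm{holo}} - \Delta G_{\mathrm{prot}}^{\mathrm{apo}}
  = -RT \ln 10 \,\bigl(\mathrm{p}K_a^{\mathrm{holo}} - \mathrm{p}K_a^{\mathrm{apo}}\bigr)

% Equivalently:
\Delta \mathrm{p}K_a
  = \mathrm{p}K_a^{\mathrm{holo}} - \mathrm{p}K_a^{\mathrm{apo}}
  = \frac{\Delta G_{\mathrm{bind}}^{\mathrm{deprot}} - \Delta G_{\mathrm{bind}}^{\mathrm{prot}}}{RT \ln 10},
\qquad RT \ln 10 \approx 5.74\ \mathrm{kJ\,mol^{-1}} \text{ at } 300\ \mathrm{K}
```

      Under these assumptions, a difference of roughly 5.7 kJ/mol between the two binding free energies corresponds to a shift of one pKa unit, so the two methods can be compared on the same scale even if the absolute pKa values from CpHMD are unreliable.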

      Although the reviewer does “not expect the authors to redo their CpHMD simulations”, we consider that it may be helpful to the reader to share in this response some results from trials using the continuous, all-atom constant pH implementation that has recently become available in GROMACS (Aho et al 2022, https://pubs.acs.org/doi/10.1021/acs.jctc.2c00516) and can be used rigorously with membrane proteins, given its all-atom lipid representation.

      Unfortunately, when trying to titrate E56 in this CpHMD implementation, we found few protonation-state transitions taking place, and the system often got stuck in protonation state–local conformation coupled minima (which need to interconvert through rearrangements of the salt bridge network involving slow side-chain dihedral rotations in E53, E56 and R57). Author response image 1 shows this for the apo OF state, and Author response image 2 shows how noisy attempts at pKa estimation from this data turn out to be, necessitating the use of a hybrid-solvent method.

      Author response image 1.

      All-atom CpHMD simulations of apo-OF PepT2. Red indicates protonated E56, blue is deprotonated.

      Author response image 2.

      Difficulty in calculating the E56 pKa value from the noisy all-atom CpHMD data shown in Author response image 1.

      b) It appears that the authors did not make the substrate (Ala-Phe dipeptide) protonatable in holo-simulations. This oversight prevents a complete representation of ligand-induced protonation events, particularly given that the substrate ion pairs with hsPepT2 through its N- & C-termini. I believe it would be valuable for the authors to acknowledge this potential limitation. 

      In this study, we implicitly assumed from the outset that the substrate does not get protonated, which – as noted in our response to the comment above – we now acknowledge explicitly. This potential limitation for the available mechanisms for proton transfer also applies to our investigation of the ExxER protonation states. In particular, a semi-grand canonical ensemble that takes into account the possibility of substrate C-terminus protonation may also sample states in which the substrate is protonated and oriented away from R57, thus leaving the ExxER salt bridge network in an apo-like state. The consequence would be that while the direction of shift in E56 pKa value will be the same, our CpHMD may overestimate its magnitude. It would thus be interesting to make the C-terminus protonatable for obtaining better quantitative estimates of the E56 pKa shift (as is indeed true in general for any other protein protonatable residue, though the effects are usually assumed to be negligible). We do note, however, that convergence of the CpHMD simulations would be much harder if the slow degree of freedom of substrate reorientation (which in our experience takes 10s to 100s of nanoseconds in this binding pocket) needs to be implicitly equilibrated upon protonation state transitions. We discuss such considerations in the revised paper.

      Reviewer #2 (Public Review):

      This is an interesting manuscript that describes a series of molecular dynamics studies on the peptide transporter PepT2 (SLC15A2). They examine, in particular, the effect on the transport cycle of protonation of various charged amino acids within the protein. They then validate their conclusions by mutating two of the residues that they predict to be critical for transport in cell-based transport assays. The study suggests a series of protonation steps that are necessary for transport to occur in PepT2. Comparison with bacterial proteins from the same family shows that while the overall architecture of the proteins and likely mechanism are similar, the residues involved in the mechanism may differ. 

      Strengths: 

      This is an interesting and rigorous study that uses various state-of-the-art molecular dynamics techniques to dissect the transport cycle of PepT2 with nearly 1ms of sampling. It gives insight into the transport mechanism, investigating how the protonation of selected residues can alter the energetic barriers between various states of the transport cycle. The authors have, in general, been very careful in their interpretation of the data. 

      Weaknesses: 

      Interestingly, they suggest that there is an additional protonation event that may take place as the protein goes from occluded to inward-facing but they have not identified this residue.

      We have indeed suggested that there may be an additional protonation site involved in the conformational cycle that we have not been able to capture, which – as we discuss in our paper – might be indicated by the shapes of the OCC ↔ IF PMFs given in Figure S15. One possibility is for this to be the substrate itself (see the response to reviewer #1 above) though within the scope of this study the precise pathway by which protons move down the transporter and the exact ordering of conformational change and proton transfer reactions remains a (partially) open question. We acknowledge this, denote it with question marks in the mechanistic overview we give in Figure 8 and also “speculate that the proton movement processes may happen as an ensemble of different mechanisms, and potentially occur contemporaneously with the conformational change”.

      Some things are a little unclear. For instance, where does the state that they have defined as occluded sit on the diagram in Figure 1a? - is it truly the occluded state as shown on the diagram or does it tend to inward- or outward-facing?

      Figure 1a is a simple schematic overview intended to show which structures of PepT2 homologues are available to use in simulations. This was not meant to be a quantitative classification of states. Nonetheless, we can note that the OCC state we derived has extra- and intracellular gate opening distances (as measured by the simple CVs defined in the methods and illustrated in Figure 2a) that indicate full gate closure at both sides. In particular, although it was derived from the IF state via biased sampling, the intracellular gate opening distance in the OCC state used for our conformational change enhanced sampling was comparable to that of the OF state (i.e., full closure of the gate), see Figure S2b and the grey bars therein. Therefore, we would schematically classify the OCC state to lie at the center of the diagram in Figure 1a. Furthermore, it is largely stable over triplicates of 1 μs-long unbiased MD, where in 2/3 replicates the gates remain stable, and in the remaining replicate there is partial opening of the intracellular gate (as shown in Figure 2 b/c under the “apo standard” condition). We comment on this in the main text by saying that “The intracellular gate, by contrast, is more flexible than the extracellular gate even in the apo, standard protonation state”, and link it to the lower barrier for transition to IF than to OF. We did this by saying that “As for the OCC↔OF transitions, these results explain the behaviour we had previously observed in the unbiased MD of Figure 2c.” We acknowledge this was not sufficiently clear and have added details to the latter sentence to help clarify better the nature of the occluded state.

      The pKa calculations and their interpretation are a bit unclear. Firstly, it is unclear whether they are using all the data in the calculations of the histograms, or just selected data and if so on what basis was this selection done. Secondly, they dismiss the pKa calculations of E53 in the outward-facing form as not being affected by peptide binding but say that E56 is when there seems to be a similar change in profile in the histograms.

      In our manuscript, we have provided two distinct analyses of the raw CpHMD data. Firstly, we analysed the data by the replicates in which our simulations were conducted (Figure 6, shown as bar plots with mean from triplicates +/- standard deviation), where we found that only the effect on E56 protonation was distinct as lying beyond the combined error bars. This analysis uses the full amount of sampling conducted for each replicate. However, since we found that the range of pKa values estimated from 10ns/window chunks was larger than the error bars obtained from the replicate analysis (Figures S17 and S18), we sought to verify our conclusion by pooling all chunk estimates and plotting histograms (Figure S19). We recover from those the effect of substrate binding on the E56 protonation state on both the OF and OCC states. However, as the reviewer has pointed out (something we did not discuss in our original manuscript), there is a shift in the pKa of E53 of the OF state only. In fact, the trend is also apparent in the replicate-based analysis of Figure 6, though here the larger error bars overlap. In our revision, we added more details of these analyses for clarity (including more detailed figure captions regarding the data used in Figure 6) as well as a discussion of the partial effect on the E53 pKa value. 
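      The replicate-based criterion described above can be illustrated with a minimal sketch. It assumes that "lying beyond the combined error bars" means the difference of the replicate means exceeds the sum of the two standard deviations; that reading, the function names, and the pKa values below are illustrative placeholders and are not taken from the manuscript.

```python
from statistics import mean, stdev

def summarize(pka_replicates):
    """Return (mean, sample standard deviation) over replicate pKa estimates."""
    return mean(pka_replicates), stdev(pka_replicates)

def beyond_combined_error(pka_a, pka_b):
    """True if the two means differ by more than the sum of their standard deviations.

    This is one possible reading of "beyond the combined error bars";
    it is an assumption, not the authors' stated definition.
    """
    mean_a, sd_a = summarize(pka_a)
    mean_b, sd_b = summarize(pka_b)
    return abs(mean_a - mean_b) > sd_a + sd_b

# Placeholder triplicate estimates (NOT data from the study).
clear_shift_apo, clear_shift_holo = [6.0, 6.2, 6.4], [7.0, 7.3, 7.6]
weak_shift_apo, weak_shift_holo = [6.0, 6.2, 6.4], [6.3, 6.5, 6.7]

print(beyond_combined_error(clear_shift_apo, clear_shift_holo))
print(beyond_combined_error(weak_shift_apo, weak_shift_holo))
```

      Under this reading, a shift whose error bars overlap is not flagged as distinct even when the trend is visible, which is the situation described for E53 in the OF state.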

      We do not believe, however, that our key conclusions are negatively affected. If anything, a further effect on the E53 pKa which we had not previously commented on (since we saw the evidence as weaker, pertaining to only one conformational state) would strengthen the case for an involvement of the ExxER motif in ligand coupling.

      Reviewer #3 (Public Review):

      Summary: 

      Lichtinger et al. have used an extensive set of molecular dynamics (MD) simulations to study the conformational dynamics and transport cycle of an important member of the proton-coupled oligopeptide transporters (POTs), namely SLC15A2 or PepT2. This protein is one of the most well-studied mammalian POT transporters that provides a good model with enough insight and structural information to be studied computationally using advanced enhanced sampling methods employed in this work. The authors have used microsecond-level MD simulations, constant-pH MD, and alchemical binding free energy calculations along with cell-based transport assay measurements; however, the most important part of this work is the use of enhanced sampling techniques to study the conformational dynamics of PepT2 under different conditions. 

      The study attempts to identify links between conformational dynamics and chemical events such as proton binding, ligand-protein interactions, and intramolecular interactions. The ultimate goal is of course to understand the proton-coupled peptide and drug transport by PepT2 and homologous transporters in the solute carrier family. 

      Some of the key results include:

      (1) Protonation of H87 and D342 initiate the occluded (Occ) to the outward-facing (OF) state transition. 

      (2) In the OF state, through engaging R57, substrate entry increases the pKa value of E56 and thermodynamically facilitates the movement of protons further down. 

      (3) E622 is not only essential for peptide recognition but also its protonation facilitates substrate release and contributes to the intracellular gate opening. In addition, cell-based transport assays show that mutation of residues such as H87 and D342 significantly decreases transport activity as expected from simulations. 

      Strengths: 

      (1) This is an extensive MD-based study of PepT2, which is beyond the typical MD studies both in terms of the sheer volume of simulations as well as the advanced methodology used. The authors have not limited themselves to one approach and have appropriately combined equilibrium MD with alchemical free energy calculations, constant-pH MD, and geometry-based free energy calculations. Each of these 4 methods provides a unique insight regarding the transport mechanism of PepT2.

      (2) The authors have not limited themselves to computational work and have performed experiments as well. The cell-based transport assays clearly establish the importance of the residues that have been identified as significant contributors to the transport mechanism using simulations.

      (3) The conclusions made based on the simulations are mostly convincing and provide useful information regarding the proton pathway and the role of important residues in proton binding, protein-ligand interaction, and conformational changes.

      Weaknesses: 

      (1) Some of the statements made in the manuscript are not convincing and do not abide by the standards that are mostly followed in the manuscript. For instance, on page 4, it is stated that "the K64-D317 interaction is formed in only ≈ 70% of MD frames and therefore is unlikely to contribute much to extracellular gate stability." I do not agree that 70% is negligible. Particularly, Figure S3 does not include the time series so it is not clear whether the 30% of the time where the salt bridge is broken is in the beginning or the end of simulations. For instance, it is likely that the salt bridge is not initially present and then it forms very strongly. Of course, this is just one possible scenario but the point is that Figure S3 does not rule out the possibility of a significant role for the K64-D317 salt bridge. 

      The reviewer is right to point out that the statement and Figure S3 as they were do not adequately support our decision to exclude the K64-D317 salt-bridge in our further investigations. The violin plot shown in Figure S3, visualised as pooled data from unbiased 1 μs triplicates, did indeed not rule out a scenario where the salt bridge only formed late in our simulations (or only in some replicates), but then is stable. Therefore, in our revision, we include the appropriate time-series of the salt bridge distances, showing how K64-D317 is initially stable but then falls apart in replicate 1, and is transiently formed and disengaged across the trajectories in replicates 2 and 3. We have also remade the data for this plot as we discovered a bug in the relevant analysis script that meant the D170-K642 distance was not calculated accurately. The results are however almost identical, and our conclusions remain.
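      The point at issue, that a pooled occupancy figure cannot distinguish between different time courses, can be sketched as follows. The distance cutoff, the block count, and both distance traces are synthetic assumptions chosen for illustration only; they are not the simulation data.

```python
def occupancy(distances, cutoff):
    """Fraction of frames in which the distance is below the cutoff."""
    return sum(d < cutoff for d in distances) / len(distances)

def block_occupancy(distances, cutoff, n_blocks):
    """Occupancy in consecutive, equally sized blocks of the trajectory."""
    size = len(distances) // n_blocks
    return [occupancy(distances[i * size:(i + 1) * size], cutoff)
            for i in range(n_blocks)]

CUTOFF = 4.0               # hypothetical salt-bridge cutoff (angstrom)
FORMED, BROKEN = 3.0, 8.0  # synthetic "formed" and "broken" distances

# Scenario A: the salt bridge is absent at first, then forms and stays formed.
late_forming = [BROKEN] * 30 + [FORMED] * 70
# Scenario B: the salt bridge forms and breaks repeatedly throughout.
flickering = ([FORMED] * 7 + [BROKEN] * 3) * 10

for name, trace in (("late_forming", late_forming), ("flickering", flickering)):
    print(name, occupancy(trace, CUTOFF), block_occupancy(trace, CUTOFF, 5))
```

      Both synthetic traces give the same pooled occupancy, while the per-block values differ, which is why a time series is needed in addition to a pooled violin plot.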

      (2) Similarly, on page 4, it is stated that "whether by protonation or mutation - the extracellular gate only opens spontaneously when both the H87 interaction network and D342-R206 are perturbed (Figure S5)." I do not agree with this assessment. The authors need to be aware of the limitations of this approach. Consider "WT H87-prot" and "D342A H87-prot": when D342 residue is mutated, in one out of 3 simulations, we see the opening of the gate within 1 us. When D342 residue is not mutated we do not see the opening in any of the 3 simulations within 1 us. It is quite likely that if rather than 3 we have 10 simulations or rather than 1 us we have 10 us simulations, the 0/3 to 1/3 changes significantly. I do not find this argument and conclusion compelling at all.

      If the conclusions were based on that alone, then we would agree.  However, this section of work covers merely the observations of the initial unbiased simulations which we go on to test/explore with enhanced sampling in the rest of the paper, and which then lead us to the eventual conclusions.

      Figure S5 shows the results from triplicate 1 μs-long trajectories as violin-plot histograms of the extracellular gate opening distance, also indicating the first and final frames of the trajectories as connected by an arrow for orientation – a format we chose for intuitively comparing 48 trajectories in one plot. The reviewer reads the plot correctly when they analyse the “WT H87-prot” vs “D342A H87-prot” conditions. In the former case, no spontaneous opening in unbiased MD is taking place, whereas when D342 is mutated to alanine in addition to H87 protonation, we see spontaneous transition in 1 out of 3 replicates. However, the reviewer does not seem to interpret the statement in question in our paper (“the extracellular gate only opens spontaneously when both the H87 interaction network and D342-R206 are perturbed”) in the way we intended it to be understood. We merely want to note here a correlation in the unbiased dataset we collected at this stage, and indeed the one spontaneous opening in the case comparison picked out by the reviewer is in the condition where both the H87 interaction network and D342-R206 are perturbed. In noting this we do not intend to make statistically significant statements from the limited dataset. Instead, we write that “these simulations show a large amount of stochasticity and drawing clean conclusions from the data is difficult”. We do however stand by our assessment that from this limited data we can “already appreciate a possible mechanism where protons move down the transporter pore” – a hypothesis we investigate more rigorously with enhanced sampling in the rest of the paper. We have revised the section in question to make clearer that the unbiased MD is only meant to give an initial hypothesis here to be investigated in more detail in the following sections. In doing so, we also incorporate, as we had not done before, the case (not picked out by the reviewer here but concerning the same figure) of S321A & H87 prot. In the third replicate, this shows partial gate opening towards the end of the unbiased trajectory (despite D342 not being affected), highlighting further the stochastic nature that makes even clear correlative conclusions difficult to draw.

      (3) While the MEMENTO methodology is novel and interesting, the method is presented as flawless in the manuscript, which is not true at all. It is stated on Page 5 with regards to the path generated by MEMENTO that "These paths are then by definition non-hysteretic." I think this is too big of a claim to say the paths generated by MEMENTO are non-hysteretic by definition. This claim is not even mentioned in the original MEMENTO paper. What is mentioned is that linear interpolation generates a hysteresis-free path by definition. There are two important problems here: (a) MEMENTO uses the linear interpolation as an initial step but modifies the intermediates significantly later so they are no longer linearly interpolated structures and thus the path is no longer hysteresis-free; (b) a more serious problem is the attribution of by-definition hysteresis-free features to the linearly interpolated states. This is based on conflating the hysteresis-free and unique concepts. The hysteresis in MD-based enhanced sampling is related to the presence of barriers in orthogonal space. For instance, one may use a non-linear interpolation of any type and get a unique pathway, which could be substantially different from the one coming from the linear interpolation. None of these paths will be hysteresis-free necessarily once subjected to MD-based enhanced sampling techniques.

      We certainly do not intend to claim that the MEMENTO method is flawless. The concern the reviewer raises around the statement "These paths are then by definition non-hysteretic" is perhaps best addressed by a clarification of the language used and considering how MEMENTO is applied in this work. 

      Hysteresis in the most general sense denotes the dependence of a system on its history, or – more specifically – the lagging behind of the system state with regards to some physical driver (for example the external field in magnetism, whence the term originates). In the context of biased MD and enhanced sampling, hysteresis commonly denotes the phenomenon where a path created by a biased dynamics method along a certain collective variable lags behind in phase space in slow orthogonal degrees of freedom (see Figure 1 in Lichtinger and Biggin 2023, https://doi.org/10.1021/acs.jctc.3c00140). When used to generate free energy profiles, this can manifest as starting state bias, where the conformational state that was used to seed the biased dynamics appears lower in free energy than alternative states. Figure S6 shows this effect on the PepT2 system for both steered MD (heavy atom RMSD CV) + umbrella sampling (tip CV) and metadynamics (tip CV). There is, in essence, a coupled problem: without an appropriate CV (which we did not have to start with here), path generation that is required for enhanced sampling displays hysteresis, but the refinement of CVs is only feasible when paths connecting the true phase space basins of the two conformations are available. MEMENTO helps solve this issue by reconstructing protein conformations along morphing paths which perform much better than steered MD paths with respect to giving consistent free energy profiles (see Figure S7 and the validation cases in the MEMENTO paper), even if the same CV is used in umbrella sampling. 

      There are still differences between replicates in those PMFs, indicating slow conformational flexibility propagated from end-state sampling through MEMENTO. We use this to refine the CVs further with dimensionality reduction (see the Methods section and Figure S8), before moving to 2D-umbrella sampling (Figure 3). Here, we think, the reviewer’s point is pertinent. The MEMENTO paths are ‘non-hysteretic by definition’ with respect to given end states in the sense that they connect (by definition) the correct conformations at both end-states (unlike steered MD), which in enhanced sampling manifests as the absence of the strong starting-state bias we had previously observed (Figure S7 vs S6). They are not, however, hysteresis-free with regards to how representative of the end-state conformational flexibility the structures given to MEMENTO really were, which is where the iterative CV design and combination of several MEMENTO paths in 2D-PMFs comes in. 

      We also cannot make a direct claim about whether in the transition region the MEMENTO paths might be separated from the true (lower free energy) transition paths by slow orthogonal degrees of freedom, which may conceivably result in overestimated barrier heights separating two free energy basins. We cannot guarantee that this is not the case, but neither in our MEMENTO validation examples nor in this work have we encountered any indications of a problem here.

      We hope that the reviewer will be satisfied by our revision, where we replace the wording in question by a statement that the MEMENTO paths do not suffer from hysteresis that is otherwise incurred as a consequence of not reaching the correct target state in the biased run (in some orthogonal degrees of freedom).

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors): 

      Figure S1: it would be useful to label the panels.

      We have now done this.

      At the bottom of page 4, it is written that "the extracellular gate only opens spontaneously when both the H87 interaction network and D342-R206 are perturbed (Figure S5)." But it is hard to interpret that from the figure.  

      See also our response to reviewer #3. We have revised the wording of this statement, and also highlight in Figure S5 the crucial runs we are referring to, in order to make them easier to discern.

      At the bottom of page 5, and top of page 6, there is a lot of "other" information shown, which is inserted for the record - this is a bit glossed over and hard to follow.

      The “other” information refers to further conditions we had calculated PMFs for and that gave some insight, but which were secondary for drawing our key conclusions. We thank the reviewer for their feedback that this section needs clarification. We have revised this paragraph to make it easier to follow and highlight better the conclusions we draw from the data.

      In Figure 7 it looks as though the asterisks have shifted.

      We are indebted to the reviewer for spotting this error: the asterisks are indeed shifted one bar to the right of their intended position. The revised version fixes this issue.

      Reviewer #3 (Recommendations For The Authors):

      Minor points: In Figure 1a, The 7PMY label and arrow are slightly misplaced.

      Figure 1a is a schematic diagram to show the available structures of PepT2 homologues (see also the response to reviewer #2 above). The 7PMY label placement is intentional to indicate a partially occluded inwards-facing state. As we write in the figure caption: “Intermediate positions between states indicate partial gate opening”.

    3. Reviewer #1 (Public Review):

      The authors have performed all-atom MD simulations to study the working mechanism of hsPepT2. It is widely accepted that conformational transitions of proton-coupled oligopeptide transporters (POTs) are linked with gating hydrogen bonds and salt bridges involving protonatable residues, whose protonation triggers gate openings. Through unbiased MD simulations, the authors identified extra-cellular (H87 and D342) and intra-cellular (E53 and E622) triggers. The authors then validated these triggers using free energy calculations (FECs) and assessed the engagement of the substrate (Ala-Phe dipeptide). The linkage of substrate release with the protonation of the ExxER motif (E53 and E56) was confirmed using constant-pH molecular dynamics (CpHMD) simulations and cell-based transport assays. An alternating-access mechanism was proposed. The study was largely conducted properly, and the paper was well-organized. However, I have a couple of concerns for the authors to consider addressing.

      (1) As a proton-coupled membrane protein, the conformational dynamics of hsPepT2 is closely coupled to protonation events of gating residues. Instead of using semi-reactive methods like CpHMD or reactive methods such as reactive MD, where the coupling is accounted for, the authors opted for extensive non-reactive regular MD simulations to explore this coupling. Note that I am not criticizing the choice of methods, and I think those regular MD simulations were well designed and conducted. But I do have two concerns.

      (a) Ideally, proton-coupled conformational transitions should be modelled using a free energy landscape with two or more reaction coordinates (or CVs), with one describing the protonation event and the others describing the conformational transitions. The minimum free energy path then illustrates the reaction progress, such as OCC/H87D342- ↔ OCC/H87HD342H ↔ OF/H87HD342H as displayed in Figure 3. Without including the protonation as a CV, the authors tried to model the free energy changes from multiple FECs using different charge states of H87 and D342. This is a practical workaround, and the conclusion drawn (the OCC↔OF transition is downhill with protonated H87 and D342) seems valid. However, I don't think the OF states with different charge states (OF/H87D342-, OF/H87HD342-, OF/H87D342H, and OF/H87HD342H) are equally stable, as plotted in Figure 3b. The concern extends to other cases like Figures 4b, S7, S10, S12, S15, and S16. While it may be appropriate to match all four OF states in the free energy plot for comparison purposes, the authors should clarify this to ensure readers are not misled.

      (b) Regarding the substrate impact, it appears that the authors assumed fixed protonation states. I am afraid this is not necessarily the case. Variations in PepT2 stoichiometry suggest that substrates likely participate in proton transport, like the Phe-Ala (2:1) and Phe-Gln (1:1) dipeptides mentioned in the introduction. And it is not rigorous to assume that the N- and C-termini of a peptide do not protonate/deprotonate when transported. I think the authors should explicitly state that the current work and the proposed mechanism (Figure 8) are based on the assumption that the substrates do not take up/release proton(s).

      (2) I have more serious concerns about the CpHMD employed in the study.

      (a) The CpHMD in AMBER is not rigorous for membrane simulations. The underlying generalized Born model fails to consider the membrane environment when updating charge states. In other words, the CpHMD places a membrane protein in a water environment to judge if changes in charge states are energetically favorable. While this might not be a big issue for peripheral residues of membrane proteins, it is likely unphysical for internal residues like the ExxER motif. As I recall, the developers have never used the method to study membrane proteins themselves. The only CpHMD variant suitable for membrane proteins is the membrane-enabled hybrid-solvent CpHMD in CHARMM. While I do not expect the authors to redo their CpHMD simulations, I do hope the authors recognize the limitation of their method.

      (b) It appears that the authors did not make the substrate (Ala-Phe dipeptide) protonatable in holo-simulations. This oversight prevents a complete representation of ligand-induced protonation events, particularly given that the substrate ion-pairs with hsPepT2 through its N- & C-termini. I believe it would be valuable for the authors to acknowledge this potential limitation.

    4. Reviewer #2 (Public Review):

      Summary:

      This is an interesting manuscript that describes a series of molecular dynamics studies on the peptide transporter PepT2 (SLC15A2). They examine, in particular, the effect on the transport cycle of protonation of various charged amino acids within the protein. They then validate their conclusions by mutating two of the residues that they predict to be critical for transport in cell-based transport assays. The study suggests a series of protonation steps that are necessary for transport to occur in PepT2. Comparison with bacterial proteins from the same family shows that while the overall architecture of the proteins and likely mechanism are similar, the residues involved in the mechanism may differ.

      Strengths:

      This is an interesting and rigorous study that uses various state-of-the-art molecular dynamics techniques to dissect the transport cycle of PepT2 with nearly 1 ms of sampling. It gives insight into the transport mechanism, investigating how protonation of selected residues can alter the energetic barriers between various states of the transport cycle. The authors have, in general, been very careful in their interpretation of the data.

      Weaknesses:

      Interestingly, they suggest that there is an additional protonation event that may take place as the protein goes from occluded to inward-facing (clear from Figure 8) but as the authors comment they have not identified this residue(s).

    5. Reviewer #3 (Public Review):

      Summary:

      Lichtinger et al. have used an extensive set of molecular dynamics (MD) simulations to study the conformational dynamics and transport cycle of an important member of the proton-coupled oligopeptide transporters (POTs), namely SLC15A2 or PepT2. This protein is one of the most well-studied mammalian POT transporters that provides a good model with enough insight and structural information to be studied computationally using advanced enhanced sampling methods employed in this work. The authors have used microsecond-level MD simulations, constant-pH MD, and alchemical binding free energy calculations along with cell-based transport assay measurements; however, the most important part of this work is the use of enhanced sampling techniques to study the conformational dynamics of PepT2 under different conditions.

      The study attempts to identify links between conformational dynamics and chemical events such as proton binding, ligand-protein interactions, and intramolecular interactions. The ultimate goal is of course to understand the proton-coupled peptide and drug transport by PepT2 and homologous transporters in the solute carrier family.

      Some of the key results include (1) Protonation of H87 and D342 initiates the occluded (Occ) to the outward-facing (OF) state transition; (2) In the OF state, through engaging R57, substrate entry increases the pKa value of E56 and thermodynamically facilitates the movement of protons further down; (3) E622 is not only essential for peptide recognition but also its protonation facilitates substrate release and contributes to the intracellular gate opening. In addition, cell-based transport assays show that mutation of residues such as H87 and D342 significantly decreases transport activity as expected from simulations.

      Strengths:

      (1) This is an extensive MD-based study of PepT2, which is beyond the typical MD studies both in terms of the sheer volume of simulations as well as the advanced methodology used. The authors have not limited themselves to one approach and have appropriately combined equilibrium MD with alchemical free energy calculations, constant-pH MD and geometry-based free energy calculations. Each of these 4 methods provides a unique insight regarding the transport mechanism of PepT2.

      (2) The authors have not limited themselves to computational work and have performed experiments as well. The cell-based transport assays clearly establish the importance of the residues that have been identified as significant contributors to the transport mechanism using simulations.

      (3) The conclusions made based on the simulations are mostly convincing and provide useful information regarding the proton pathway and the role of important residues in proton binding, protein-ligand interaction, and conformational changes.

      Weaknesses:

      There are inherent limitations with the methodology used such as the MEMENTO and constant pH MD that have been briefly noted in the manuscript.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      In the manuscript by Su et al., the authors present a massively parallel reporter assay (MPRA) measuring the stability of in vitro transcribed mRNAs carrying wild-type or mutant 5' or 3' UTRs transfected into two different human cell lines. The goal presented at the beginning of the manuscript was to screen for effects of disease-associated point mutations on the stability of the reporter RNAs carrying partial human 5' or 3' UTRs. However, the majority of the manuscript is dedicated to identifying sequence components underlying the differential stability of reporter constructs. This shows that TA dinucleotides are the most predictive feature of RNA stability in both cell lines and both UTRs.

      The effect of AU-rich elements (AREs) on RNA stability is well established in multiple systems, and the present study confirms this general trend but points out variability in the consequence of seemingly similar motifs on RNA stability. For example, the authors report that a long stretch of Us has extreme opposite effects on RNA stability depending on whether it is preceded by an A (strongly destabilizing) or followed by an A (strongly stabilizing). While the authors' interpretation of a context dependence of the effect is certainly well-founded, it seems counterintuitive that the preceding or following A would be the (only) determining factor. This points to a generally reductionist approach taken by the authors in the analysis of the data and in their attempt to dissect the contribution of "AU rich sequences" to RNA stability, with a general tendency to reduce the size and complexity of the features (e.g. to dinucleotides). While this certainly increases the statistical power of the analysis due to the number of occurrences of these motifs, it limits the interpretability of the results. How do TA dinucleotides per se contribute to destabilizing the RNA, both in 5' and 3' UTRs, but (according to limited data presented) not in coding sequences? What is the mechanism? RBPs binding to TA dinucleotide containing sequences are suggested to "mask" the destabilizing effect, thereby leading to a more stable RNA. Gain of TA dinucleotides is reported to have a destabilizing effect, but again no hypothesis is provided as to the underlying molecular mechanism. In addition to reducing the motif length to dinucleotides, the notion of "context dependence" is used in a very narrow sense; especially when focusing on simple and short motifs, a more extensive analysis of the interdependence of these features (beyond the existing analysis of the relationship between TA-diNTs and GC content) could potentially reveal more of the context dependence underlying the seemingly opposite behavior of very similar motifs.

      The contribution of coding region sequence to RNA stability has been extensively discussed (For example: doi.org/10.1016/j.molcel.2022.03.032; doi.org/10.1186/s13059-020-02251-5; doi.org/10.15252/embr.201948220; doi.org/10.1371/journal.pone.0228730; doi.org/10.7554/eLife.45396). While TA content at the third codon position (wobble position) has been implicated as a pro-degradation signal, codon optimality has emerged as the most prominent determinant for RNA stability. This indicates that the role of coding regions in RNA stability differs from that of UTRs due to the involvement of translation elongation. We did not intend to suggest that TA-dinucleotides in UTRs and coding regions have the same effect.

      We hypothesize that TA dinucleotides may recruit endonucleases of the RNase A family, whose catalytic pockets exhibit a strong bias for TA dinucleotides (doi.org/10.1016/j.febslet.2010.04.018). Structures or protein binding that blocks this recognition might stabilize RNAs. To gain further insight into the motif interactions, we plan to investigate the interactions between TA and the other 15 dinucleotides through more detailed analyses.

      The present MPRA measures the effect of UTR sequences in one specific reporter context and using one experimental approach (following the decay of in vitro transcribed and transfected RNAs). While this approach certainly has its merits compared to other approaches, it also comes with some caveats: RNA is delivered naked, without bound RBPs and no nuclear history, e.g. of splicing (no EJCs), editing and modifications. One way to assess the generalizability of the results as well as the context dependence of the effects is to perform the same analysis on existing datasets of RNA stability measurements obtained through other methods (e.g. transcription inhibition). Are TA dinucleotides universally the most predictive feature of RNA half-lives?

      Our system studies the stability control of RNA synthesized in vitro and delivered into human cells. While we did not intend to generalize our conclusions to endogenous RNAs, our approach contributes to the understanding of in vitro synthesized RNA used for cellular expression, such as in vaccines. It is known that endogenous RNAs undergo very different regulation. The most prominent factors controlling endogenous RNA stability are the density of splice junctions and the length of UTRs (doi.org/10.1186/s13059-022-02811-x; doi.org/10.1186/s12915-021-00949-x). To decipher the sequence regulation, these factors are controlled in our experiments. Therefore we do not expect the dinucleotide features found by our approach to be generalized as the most predictive feature of RNA half-life in vivo.

      The authors conclude their study with a meta-analysis of genes with increased TA dinucleotides in 5' and 3'UTRs, showing that specific functional groups are overrepresented among these genes. In addition, they provide evidence for an effect of disease-associated UTR mutations on endogenous RNA stability. While these elements link back to the original motivation of the study (screening for effects of point mutations in 5' and 3' UTRs), they provide only a limited amount of additional insights.

      We utilized the Taiwan Biobank to investigate whether mutations significantly affecting RNA stability also impact human biochemical measurements. Our findings indicate that these mutations indeed have a significant effect on various biochemical indices. This highlights the importance of our study, as it bridges basic science with potential applications in precision medicine. By linking specific UTR mutations with measurable changes in biochemical indices, our research underscores the potential for these findings to inform targeted medical interventions in the future.

      In summary, this manuscript presents an interesting addition to the long-standing attempts at dissecting the sequence basis of RNA stability in human cells. The analysis is in general very comprehensive and sound; however, at times the goal of the authors to find novelty and specificity in the data overshadows some analyses. One example is the case where the authors try to show that TA-dinucleotides and GC content are decoupled and not merely two sides of the same coin. They claim that the effect of TA dinucleotides is different between high- and low-GC content contexts but do not control for the fact that low GC-content regions naturally will contain more TA dinucleotides and therefore the effect sizes and the resulting correlation between TA-diNT rate and stability will be stronger (Fig. 5A). A more thorough analysis and greater caution in some of the claims could further improve the credibility of the conclusions.

      Low GC content implies a higher TA content but does not directly equate to a high TA-diNT rate. For instance, the sequence ATTGAACCTT has a lower GC content (0.3) compared to TATAGGCCGC (0.6), yet it also has a lower TA-diNT rate (0 vs. 0.22). To address this concern more rigorously, we performed a stratified analysis based on TA-diNT rate. As shown in our Fig. S7C, even after stratifying by TA-diNT rate (upper panel high TA-diNT rate / lower panel low TA-diNT rate), we still observe that the destabilizing effect of TA is stronger in the low GC content group.
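
      The two example sequences above can be checked with a short script. This is a minimal sketch, not the authors' code: the function names are hypothetical, and the TA-diNT rate is assumed to be the number of TA dinucleotides divided by the number of dinucleotide positions (sequence length minus one), which is the definition that matches the quoted values of 0 and 0.22.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def ta_dint_rate(seq: str) -> float:
    """TA dinucleotides per dinucleotide position (assumed denominator: len - 1)."""
    seq = seq.upper()
    positions = len(seq) - 1
    hits = sum(seq[i:i + 2] == "TA" for i in range(positions))
    return hits / positions


for s in ("ATTGAACCTT", "TATAGGCCGC"):
    # Prints GC content and TA-diNT rate rounded to two decimals.
    print(s, round(gc_content(s), 2), round(ta_dint_rate(s), 2))
```

      Under these assumptions the first sequence gives a GC content of 0.3 with no TA dinucleotides, and the second gives 0.6 with a TA-diNT rate of about 0.22, so the two measures can indeed move independently.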

      Reviewer #2 (Public Review):

      Summary of goals:

      Untranslated regions are key cis-regulatory elements that control mRNA stability, translation, and translocation. Through interactions with small RNAs and RNA binding proteins, UTRs form complex transcriptional circuitry that allows cells to fine-tune gene expression. Functional annotation of UTR variants has been very limited, and improvements could offer insights into disease relevant regulatory mechanisms. The goals were to advance our understanding of the determinants of UTR regulatory elements and characterize the effects of a set of "disease-relevant" UTR variants.

      Strengths:

      The use of a massively parallel reporter assay allowed for analysis of a substantial set (6,555 pairs) of 5' and 3' UTR fragments compiled from known disease associated variants. Two cell types were used.

      The findings confirm previous work about the importance of AREs, which helps show validity and adds some detailed comparisons of specific AU-rich motif effects in these two cell types.

      Using a Lasso regression, TA-dinucleotide content is identified as a strong regulator of RNA stability in a context dependent manner based on GC content and presence of RNA binding protein binding motifs. The findings have potential importance, drawing attention to a UTR feature that is not well characterized.

      The use of complementary datasets, including from half-life analyses of RNAs and from random sequence library MPRAs, is a useful addition and supports several important findings. The finding that TA dinucleotides have explanatory power separate from (and in some cases interacting with) GC content is valuable.

      The functional enrichment analysis suggests some new ideas about how UTRs may contribute to regulation of certain classes of genes.

      Weaknesses:

      It is difficult to understand how the calculations for half-life were performed. The sequencing approach measures the relative frequency of each sequence at each time point (less stable sequences become relatively less frequent after time 0, whereas more stable sequences become relatively more frequent after time 0). Since there is no discussion of whether the abundance of the transfected RNA population is referenced to some external standard (e.g., housekeeping RNAs), it is not clear how absolute (rather than relative) half-lives were determined.

      We estimated the decay constant λ and the half-life (t1/2) by the following equations:

      $$\ln\frac{C_i(t)}{C_i(t=0)} = -\lambda t, \qquad t_{1/2} = \frac{\ln 2}{\lambda}$$

      where Ci(t) and Ci(t=0) are read count values of the ith replicate at time points t and 0 (see also Methods). The absolute abundance was not required for the half-life calculation.

      Fig. S1A and B are used to assess reproducibility. They show that read counts at a given time point correlate well across replicate experiments. However, this is not a good way to assess the reproducibility or accuracy of the t1/2 measurements. (The major source of variability in read counts in these plots - especially at early time points - is likely the starting abundance of each RNA sequence, not stability.) This creates concerns about how well the method is measuring t1/2. Also creating concern is the observation that many RNAs are associated with half-lives that are much longer than the time points analyzed in the study. For example, if Figure S1 and Table S1 are read correctly, the median t1/2 for the 5' UTR library in HEK cells appears to be >700 minutes. Given that RNA was collected at 30, 75, and 120 minutes, accurate measurements of RNAs with such long half-lives would seem to be very difficult.

      We estimated the half-life (t1/2) based on the following equations:

      $$\ln\frac{C_i(t)}{C_i(t=0)} = -\lambda t, \qquad t_{1/2} = \frac{\ln 2}{\lambda}$$

      where Ci(t) and Ci(t=0) are read count values of the ith replicate at time points t and 0 (see also Methods). The calculation of the half-life involves first determining the decay constant λ, which represents a constant rate of decay. Since λ is a constant, it is possible to accurately calculate it without needing data over the entire decay range. Our experimental design considers this by selecting appropriate time points to ensure a reliable estimation of λ, and thus, the half-life. To determine the most suitable time points, we conducted preliminary experiments using RT-PCR. These experiments indicated that 30, 75, and 120 minutes provided an effective range for capturing the decay dynamics of the transcripts.
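
      The estimation described in this response can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: it assumes simple exponential decay, C(t) = C(0)·exp(−λt), read counts that are comparable across time points, and a least-squares fit through the origin; the function names and the synthetic counts are hypothetical.

```python
import math


def decay_constant(times, counts, count_t0):
    """Estimate lambda from ln(C(t)/C(0)) = -lambda * t (fit through the origin)."""
    log_ratios = [math.log(c / count_t0) for c in counts]
    numerator = sum(t * y for t, y in zip(times, log_ratios))
    denominator = sum(t * t for t in times)
    return -numerator / denominator


def half_life(lam):
    """Half-life implied by a constant decay rate lambda."""
    return math.log(2) / lam


# Synthetic, noise-free counts at the sampling times named in the response.
times = [30, 75, 120]
true_lambda = 0.01  # per minute, chosen only for illustration
count_t0 = 1000.0
counts = [count_t0 * math.exp(-true_lambda * t) for t in times]

lam = decay_constant(times, counts, count_t0)
print(round(lam, 4), round(half_life(lam), 1))
```

      With noise-free input the fit recovers the decay constant exactly. With real read counts, the uncertainty of λ grows as the half-life becomes long relative to the 120-minute sampling window, which is the concern the reviewer raises.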

      There is no direct comparison of t1/2 between the two cell types studied for the full set of sequences studied. This would be helpful in understanding whether the regulatory effects of UTRs are generally similar across cell lines (as has been shown in some previous studies) or whether there are fundamental differences. The distribution of t1/2's is clearly quite different in the two cell lines, but it is important to know if this reflects generally slow RNA turnover in HEK cells or whether there are a large number of sequence-specific effects on stability between cell lines. A related issue is that it is not clear whether the relatively small number of significant variant effects detected in HEK cells versus SH-SY5Y cells is attributable to real biological differences between cell types or to technical issues (many fewer read counts and much longer half lives in HEK cells).

      For both cell lines, we selected oligonucleotides with R2 > 0.5 and mean squared error (MSE) < 1 for analysis when estimating the decay constant (λ), and hence the half-life, by linear regression. This selection criterion was implemented to minimize the effect of experimental noise. Additionally, we will further analyze the MSE distribution to determine if the two cell lines exhibit significantly different levels of experimental noise. We will also provide a direct comparison of half-lives between the two cell lines to assess the similarity in stability regulation.
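
      The selection criterion above can be illustrated with a small quality filter. This is a hedged sketch: it assumes an ordinary least-squares line with intercept fitted to log read-count ratios against time, with MSE taken as the mean squared residual; the helper names and the example values are hypothetical and do not come from the study.

```python
def fit_quality(x, y):
    """Ordinary least-squares line through (x, y); returns slope, R^2 and MSE."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    mse = ss_res / n
    return slope, r2, mse


def passes_filter(x, y, min_r2=0.5, max_mse=1.0):
    """Keep a sequence only if its decay fit has R^2 > min_r2 and MSE < max_mse."""
    _, r2, mse = fit_quality(x, y)
    return r2 > min_r2 and mse < max_mse


times = [0, 30, 75, 120]
clean = [0.0, -0.3, -0.75, -1.2]  # log ratios lying on a straight line
noisy = [0.0, 1.5, -2.0, 0.5]     # log ratios with no clear trend
print(passes_filter(times, clean), passes_filter(times, noisy))
```

      Comparing how many sequences pass this filter in each cell line would be one simple way to see whether the two datasets differ in measurement noise.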

      The general assertion is made in many places that TA dinucleotides are the most prominent destabilizing element in UTRs (e.g., in the title, the abstract, Fig. 4 legend, and on p. 12). This appears to be true for only one of the two cell lines tested based on Fig. 3.

      TA-dinucleotides and other TA-rich sequences exhibit similar effects on RNA stability, as illustrated in Fig. S5A-C. In two cell lines, TA-dinucleotide and WWWWWW sequences were representatives of the same stability-affecting cluster. While the impact of TA-dinucleotides can be generalized, we will rephrase some statements for clarification to avoid any potential misunderstanding.

      Appraisal and impact:

      The work adds to existing studies that previously identified sequence features, including AREs and other RNA binding protein motifs, that regulate stability and puts a new emphasis on the role of "TA" (better "UA") dinucleotides. It is not clear how potential problems with the RNA stability measurements discussed above might influence the overall conclusions, which may limit the impact unless these can be addressed.

      It is difficult to understand whether the importance of TA dinucleotides is best explained by their occurrence in a related set of longer RBP binding motifs (see Fig 5J, these motifs may be encompassed by the "WWWWWW cluster") or whether some other explanation applies. Further discussion of this would be helpful. Does the LASSO method tend to collapse a more diverse set of longer motifs that are each relatively rare compared to the dinucleotide? It remains unclear whether TA dinucleotides are associated with less stability independent of the presence of the known larger WWWWWWW motif. As noted above, the importance of TA dinucleotides in the HEK experiments appears to be less than is implied in the text.

      To ensure the representativeness of the features entered into the LASSO model, we pre-selected those with an occurrence greater than 10% among all UTRs. There is no evidence to support a preference for dinucleotides by LASSO. To address whether the destabilizing effect of TA dinucleotides is part of the broader WWWWWW motif, we will divide TA dinucleotides into two groups: those within the WWWWWW motif and those outside of it. We will then examine whether TA dinucleotides in these two groups exhibit the same destabilizing effect.
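
      The proposed split of TA dinucleotides could be sketched as below. This is only one possible reading of the planned analysis: it assumes W denotes A or T (IUPAC), treats any run of six or more consecutive W bases as covering a WWWWWW motif, and counts a TA dinucleotide as "within" the motif when both of its bases lie inside such a run. The function name is hypothetical.

```python
import re

# Maximal runs of at least six consecutive A/T bases (W = A or T).
W_RUN = re.compile(r"[AT]{6,}")


def split_ta_counts(seq):
    """Return (TA count inside W runs, TA count outside W runs)."""
    seq = seq.upper().replace("U", "T")
    runs = [match.span() for match in W_RUN.finditer(seq)]
    inside = outside = 0
    for i in range(len(seq) - 1):
        if seq[i:i + 2] != "TA":
            continue
        if any(start <= i and i + 2 <= end for start, end in runs):
            inside += 1
        else:
            outside += 1
    return inside, outside


print(split_ta_counts("GGTATATAAGGCTAGG"))
```

      In this toy sequence, three TA dinucleotides fall inside the run TATATAA and one lies outside it, so the two groups can be regressed against stability separately.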

      The inclusion of more than a single cell type is an acknowledgement of the importance of evaluating cell type-specific effects. The work suggests a number of cell type-specific differences, but due to technical issues (especially with the HEK data, as outlined above) and the use of only two cell lines, it is difficult to understand cell type effects from the work.

      The inclusion of both 3' and 5' UTR sequences distinguishes this work from most prior studies in the field. Contrasting the effects of these regions on stability is of interest, although the role of these UTRs (especially the 5' UTR) in translational regulation is not assessed here.

      We examined the role of UTR and UTR variants in translation regulation using polysome profiling. By both univariate analysis and an elastic regression model, we identified motifs of short repeated sequences, including SRSF2 binding sites, as mutation hotspots that lead to aberrant translation. Furthermore, these polysome-shifting mutations had a considerable impact on RNA secondary structures, particularly in upstream AUG-containing 5’ UTRs. Integrating these features, our model achieved high accuracy (AUROC > 0.8) in predicting polysome-shifting mutations in the test dataset. Additionally, metagene analysis indicated that pathogenic variants were enriched at the upstream open reading frame (uORF) translation start site, suggesting changes in uORF usage underlie the translation deficiencies caused by these mutations. Illustrating this, we demonstrated that a pathogenic mutation in the IRF6 5’ UTR suppresses translation of the primary open reading frame by creating a uORF. Remarkably, site-directed ADAR editing of the mutant mRNA rescued this translation deficiency. Because the regulation of translation and stability does not converge, we illustrate these two mechanisms in two separate manuscripts (this one and doi.org/10.1101/2024.04.11.589132).

      Reviewer #3 (Public Review):

      Summary:

      In their manuscript titled "Multiplexed Assays of Human Disease‐relevant Mutations Reveal UTR Dinucleotide Composition as a Major Determinant of RNA Stability" the authors aim to investigate the effect of sequence variations in 3'UTR and 5'UTRs on the stability of mRNAs in two different human cell lines.

      To do so, the authors use a massively parallel reporter assay (MPRA). They transfect cells with a set of mRNA reporters that contain sequence variants in their 3' or 5' UTRs, which were previously reported in human diseases. They follow their clearance from cells over time relative to the matching non-variant sequence. To analyze their results, they define a set of factors (RBP and miRNA binding sites, sequence features, secondary structure etc.) and test their association with differences in mRNA stability. For features with a significant association, they use clustering to select a subset of factors for LASSO regression and identify factors that affect mRNA stability.

      They conclude that the TA dinucleotide content of UTRs is the strongest destabilizing sequence feature. Within that context, elevated GC content and protein binding can protect susceptible mRNAs from degradation. They also show that TA dinucleotide content of UTRs affects native mRNA stability, and that it is associated with specific functional groups. Finally, they link disease associated sequence variants with differences in mRNA stability of reporters.

      Strengths:

      (1) This work introduces a different MPRA approach to analyze the effect of genetic variants. While previous works in tissue culture use DNA transfections that require normalization for transcription efficiency, here the mRNA is directly introduced into cells at fixed amounts, allowing a more direct view of the mRNA regulation.

      (2) The authors also introduce a unique analysis approach, which takes into account multiple factors that might affect mRNA stability. This approach allows them to identify general sequence features that affect mRNA stability beyond specific genetic variants, and reach important insights on mRNA stability regulation. Indeed, while the conclusions on genetic variants identified in this work are interesting, the main strength of the work involves the general effects of sequence features rather than specific variants.

      (3) The authors provide adequate supports for their claims, and validate their analysis using both their reporter data and native genes. For the main feature identified, TA di-nucleotides, they perform follow-up experiments with modified reporters that further strengthen their claims, and also validate the effect on native cellular transcripts (beyond reporters), demonstrating its validity also within native scenarios.

      (4) The work provides a broad analysis of mRNA stability, across two mRNA regulatory segments (3'UTR and 5'UTR) and is performed in two separate cell-types. Comparison between two different cell-types is adequate, and the results demonstrate, as expected, the dependence of mRNA stability on the cellular context. Analysis of 3'UTR and 5'UTR regulatory effects also shows interesting differences and similarities between these two regulatory regions.

      Weaknesses:

      (1) The authors fail to acknowledge several possible confounding factors of their MPRA approach in the discussion.

      First, while transfecting mRNA directly into cells avoids the need to normalize for differences in transcription, naked mRNA molecules differ from native cellular mRNAs, and this could introduce biases due to differences in mRNA modifications, protein associations etc. that may occur co-transcriptionally.

      Second, along those lines, the authors also use in-vitro polyadenylation. The length of the polyA tail of the transfected transcripts could potentially be very different than that of native mRNAs and also affect stability.

      The transcripts used in our study were polyadenylated in vitro with approximately 100 nucleotides (Fig. S1C), similar to the polyA tail lengths typically observed in vivo (dx.doi.org/10.1016/j.molcel.2014.02.007). Additionally, these transcripts were capped to emulate essential mRNA characteristics and to minimize immune responses in recipient cells. This design allows us to study RNA decay for in vitro-synthesized RNA delivered into human cells, akin to RNA vaccines, but it does not necessarily extend to endogenous RNAs. As mentioned, endogenous RNAs undergo nuclear processing and are decorated by numerous trans factors, resulting in distinct regulatory mechanisms. We will provide a more in-depth discussion on these differences and their implications in the revised manuscript.

      (2) The analysis approach used in this work for identifying regulatory features in UTRs has not been used previously. As such, the lack of in-depth methodological detail, and possibly also of more general validation of the approach, makes it harder to convince the reader of the validity of this approach and its results.

      In particular, a main point that is not addressed is how the authors decided on the set of "factors" used in their analysis, given that choosing different sets of factors might affect the results of the analysis.

      In our study, we employed the calculation of the Variance Inflation Factor (VIF) as a basis for selecting variables. This well-established method is widely used to detect variables with high collinearity, thus ensuring the robustness and reliability of our analysis. By identifying and excluding highly collinear variables, we aimed to minimize multicollinearity and improve the accuracy of our regression models. For more detailed information on the use of VIF in regression analysis, please refer to Akinwande, M., Dikko, H., and Samson, A. (2015). Variance Inflation Factor: As a Condition for the Inclusion of Suppressor Variable(s) in Regression Analysis. Open Journal of Statistics, 5, 754-767. doi: 10.4236/ojs.2015.57075. We will include the method details in the revised manuscript.
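
      For readers unfamiliar with the Variance Inflation Factor, the idea can be shown in its simplest form. The sketch below covers only the special case of two predictors, where VIF = 1 / (1 − r²) and r is their Pearson correlation; with more predictors, r² is replaced by the R² from regressing each predictor on all the others. The function names and example vectors are hypothetical and are not taken from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x, y):
    """VIF for a pair of predictors; undefined when they are perfectly collinear."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r * r)


feature_a = [1, 2, 3, 4]
uncorrelated = [1, -1, -1, 1]  # zero correlation with feature_a
collinear = [1, 2, 3, 5]       # nearly a linear function of feature_a
print(vif_two_predictors(feature_a, uncorrelated))
print(round(vif_two_predictors(feature_a, collinear), 1))
```

      A VIF near 1 indicates an essentially independent predictor, while large values flag predictors that carry redundant information; the cutoff used for exclusion is a choice that would need to be stated in the Methods.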

      For example, the choice to use 7-mer sequences within the factors set is not explained, particularly when almost all motifs that are eventually identified (Figure 3B-E) are shorter.

      The known RBP motifs are primarily 6-mer. To explore the possibility of discovering novel motifs that could significantly impact our model, we started with 7-mer sequences. However, our analysis revealed that including these additional variables did not improve the explanatory power of the model; instead, it reduced it. Consequently, our final model focuses on motifs shorter than 7-mer. We will explain the motif selections in the revised manuscript.

      In addition, the authors do not validate their approach on simulated data or well-established control datasets. Such an analysis would help to further convince the reader of the usefulness and robustness of the analysis.

      We acknowledge the importance of validating our approach on simulated data or well-established control datasets to demonstrate its robustness and reliability. However, to the best of our knowledge, there are currently no well-established control datasets available that perfectly correspond to our specific study context. Despite this, we will continue to search for any relevant datasets that could be utilized for this purpose in future work. This effort will help to further reinforce the confidence in our methodology and its findings.

      (3) The analysis and regression models built in this work are not thoroughly investigated relative to native genes within cells. The effect of sequence "factors" on native cellular transcripts' stability is not investigated beyond TA di-nucleotides, and it is unclear to what degree other predicted factors also affect native transcripts.

      Our system studies the stability control of RNA synthesized in vitro and delivered into human cells. While we validated the UTR TA-dinucleotide effect in vivo, we did not intend to conclude that this is the most influential regulation for endogenous RNAs. It is known that endogenous RNAs undergo very different regulation. The most prominent factors controlling endogenous RNA stability are the density of splice junctions and the length of UTRs (doi.org/10.1186/s13059-022-02811-x; doi.org/10.1186/s12915-021-00949-x). To decipher the sequence regulation, we controlled for these factors in our experiments. Therefore, we acknowledge that several endogenous features, which were excluded by our approach, may serve as predictive features of RNA half-life in vivo.

    2. eLife assessment

      This valuable study combines massively parallel reporter assays and regression analysis to identify sequence features in untranslated regions that contribute to mRNA stability. The strength of evidence presented is generally solid, but providing more details about how half lives are calculated and explaining some aspects of the subsequent choices made for analysis would clarify and strengthen the overall approach. Taken together, this study will be of interest to researchers broadly studying post-transcriptional gene regulation and also to scientists using massively parallel reporter assays.

    3. Reviewer #1 (Public Review):

      In the manuscript by Su et al., the authors present a massively parallel reporter assay (MPRA) measuring the stability of in vitro transcribed mRNAs carrying wild-type or mutant 5' or 3' UTRs transfected into two different human cell lines. The goal presented at the beginning of the manuscript was to screen for effects of disease-associated point mutations on the stability of the reporter RNAs carrying partial human 5' or 3' UTRs. However, the majority of the manuscript is dedicated to identifying sequence components underlying the differential stability of reporter constructs. This shows that TA dinucleotides are the most predictive feature of RNA stability in both cell lines and both UTRs.

      The effect of AU rich elements (AREs) on RNA stability is well established in multiple systems, and the present study confirms this general trend but points out variability in the consequence of seemingly similar motifs on RNA stability. For example, the authors report that a long stretch of Us has extreme opposite effects on RNA stability depending on whether it is preceded by an A (strongly destabilizing) or followed by an A (strongly stabilizing). While the authors' interpretation of a context-dependence of the effect is certainly well-founded, it seems counterintuitive that the preceding or following A would be the (only) determining factor. This points to a generally reductionist approach taken by the authors in the analysis of the data and in their attempt to dissect the contribution of "AU rich sequences" to RNA stability, with a general tendency to reduce the size and complexity of the features (e.g. to dinucleotides). While this certainly increases the statistical power of the analysis due to the number of occurrences of these motifs, it limits the interpretability of the results. How do TA dinucleotides per se contribute to destabilizing the RNA, both in 5' and 3' UTRs, but (according to limited data presented) not in coding sequences? What is the mechanism? RBPs binding to TA dinucleotide containing sequences are suggested to "mask" the destabilizing effect, thereby leading to a more stable RNA. Gain of TA dinucleotides is reported to have a destabilizing effect, but again no hypothesis is provided as to the underlying molecular mechanism. In addition to reducing the motif length to dinucleotides, the notion of "context dependence" is used in a very narrow sense; especially when focusing on simple and short motifs, a more extensive analysis of the interdependence of these features (beyond the existing analysis of the relationship between TA-diNTs and GC content) could potentially reveal more of the context dependence underlying the seemingly opposite behavior of very similar motifs.

      The present MPRA measures the effect of UTR sequences in one specific reporter context and using one experimental approach (following the decay of in vitro transcribed and transfected RNAs). While this approach certainly has its merits compared to other approaches, it also comes with some caveats: RNA is delivered naked, without bound RBPs and with no nuclear history, e.g. of splicing (no EJCs), editing and modifications. One way to assess the generalizability of the results as well as the context dependence of the effects is to perform the same analysis on existing datasets of RNA stability measurements obtained through other methods (e.g. transcription inhibition). Are TA dinucleotides universally the most predictive feature of RNA half-lives?

      The authors conclude their study with a meta-analysis of genes with increased TA dinucleotides in 5' and 3'UTRs, showing that specific functional groups are overrepresented among these genes. In addition, they provide evidence for an effect of disease-associated UTR mutations on endogenous RNA stability. While these elements link back to the original motivation of the study (screening for effects of point mutations in 5' and 3' UTRs), they provide only a limited amount of additional insights.

      In summary, this manuscript presents an interesting addition to the long-standing attempts at dissecting the sequence basis of RNA stability in human cells. The analysis is in general very comprehensive and sound; however, at times the goal of the authors to find novelty and specificity in the data overshadows some analyses. One example is the case where the authors try to show that TA-dinucleotides and GC content are decoupled and not merely two sides of the same coin. They claim that the effect of TA dinucleotides is different between high- and low-GC content contexts but do not control for the fact that low GC-content regions naturally will contain more TA dinucleotides and therefore the effect sizes and the resulting correlation between TA-diNT rate and stability will be stronger (Fig. 5A). A more thorough analysis and greater caution in some of the claims could further improve the credibility of the conclusions.

    4. Reviewer #2 (Public Review):

      Summary of goals:

      Untranslated regions are key cis-regulatory elements that control mRNA stability, translation, and translocation. Through interactions with small RNAs and RNA binding proteins, UTRs form complex transcriptional circuitry that allows cells to fine-tune gene expression. Functional annotation of UTR variants has been very limited, and improvements could offer insights into disease relevant regulatory mechanisms. The goals were to advance our understanding of the determinants of UTR regulatory elements and characterize the effects of a set of "disease-relevant" UTR variants.

      Strengths:

      The use of a massively parallel reporter assay allowed for analysis of a substantial set (6,555 pairs) of 5' and 3' UTR fragments compiled from known disease associated variants. Two cell types were used.

      The findings confirm previous work about the importance of AREs, which helps show validity and adds some detailed comparisons of specific AU-rich motif effects in these two cell types.

      Using a Lasso regression, TA-dinucleotide content is identified as a strong regulator of RNA stability in a context dependent manner based on GC content and presence of RNA binding protein binding motifs. The findings have potential importance, drawing attention to a UTR feature that is not well characterized.

      The use of complementary datasets, including from half-life analyses of RNAs and from random sequence library MPRAs, is a useful addition and supports several important findings. The finding that TA dinucleotides have explanatory power separate from (and in some cases interacting with) GC content is valuable.

      The functional enrichment analysis suggests some new ideas about how UTRs may contribute to regulation of certain classes of genes.

      Weaknesses:

      It is difficult to understand how the calculations for half-life were performed. The sequencing approach measures the relative frequency of each sequence at each time point (less stable sequences become relatively less frequent after time 0, whereas more stable sequences become relatively more frequent after time 0). Since there is no discussion of whether the abundance of the transfected RNA population is referenced to some external standard (e.g., housekeeping RNAs), it is not clear how absolute (rather than relative) half-lives were determined.

      Fig. S1A and B are used to assess reproducibility. They show that read counts at a given time point correlate well across replicate experiments. However, this is not a good way to assess the reproducibility or accuracy of the t1/2 measurements. (The major source of variability in read counts in these plots - especially at early time points - is likely the starting abundance of each RNA sequence, not stability.) This creates concerns about how well the method is measuring t1/2. Also creating concern is the observation that many RNAs are associated with half-lives that are much longer than the time points analyzed in the study. For example, if Figure S1 and Table S1 are read correctly, the median t1/2 for the 5' UTR library in HEK cells appears to be >700 minutes. Given that RNA was collected at 30, 75, and 120 minutes, accurate measurements of RNAs with such long half-lives would seem to be very difficult.

      There is no direct comparison of t1/2 between the two cell types studied for the full set of sequences studied. This would be helpful in understanding whether the regulatory effects of UTRs are generally similar across cell lines (as has been shown in some previous studies) or whether there are fundamental differences. The distribution of t1/2's is clearly quite different in the two cell lines, but it is important to know if this reflects generally slow RNA turnover in HEK cells or whether there are a large number of sequence-specific effects on stability between cell lines. A related issue is that it is not clear whether the relatively small number of significant variant effects detected in HEK cells versus SH-SY5Y cells is attributable to real biological differences between cell types or to technical issues (many fewer read counts and much longer half lives in HEK cells).

      The general assertion is made in many places that TA dinucleotides are the most prominent destabilizing element in UTRs (e.g., in the title, the abstract, Fig. 4 legend, and on p. 12). This appears to be true for only one of the two cell lines tested based on Fig. 3.

      Appraisal and impact:

      The work adds to existing studies that previously identified sequence features, including AREs and other RNA binding protein motifs, that regulate stability and puts a new emphasis on the role of "TA" (better "UA") dinucleotides. It is not clear how potential problems with the RNA stability measurements discussed above might influence the overall conclusions, which may limit the impact unless these can be addressed.

      It is difficult to understand whether the importance of TA dinucleotides is best explained by their occurrence in a related set of longer RBP binding motifs (see Fig 5J, these motifs may be encompassed by the "WWWWWW cluster") or whether some other explanation applies. Further discussion of this would be helpful. Does the LASSO method tend to collapse a more diverse set of longer motifs that are each relatively rare compared to the dinucleotide? It remains unclear whether TA dinucleotides are associated with less stability independent of the presence of the known larger WWWWWWW motif. As noted above, the importance of TA dinucleotides in the HEK experiments appears to be less than is implied in the text.

      The inclusion of more than a single cell type is an acknowledgement of the importance of evaluating cell type-specific effects. The work suggests a number of cell type-specific differences, but due to technical issues (especially with the HEK data, as outlined above) and the use of only two cell lines, it is difficult to understand cell type effects from the work.

      The inclusion of both 3' and 5' UTR sequences distinguishes this work from most prior studies in the field. Contrasting the effects of these regions on stability is of interest, although the role of these UTRs (especially the 5' UTR) in translational regulation is not assessed here.

    5. Reviewer #3 (Public Review):

      Summary:

      In their manuscript titled "Multiplexed Assays of Human Disease‐relevant Mutations Reveal UTR Dinucleotide Composition as a Major Determinant of RNA Stability" the authors aim to investigate the effect of sequence variations in 3' and 5' UTRs on the stability of mRNAs in two different human cell lines.

      To do so, the authors use a massively parallel reporter assay (MPRA). They transfect cells with a set of mRNA reporters that contain sequence variants in their 3' or 5' UTRs, which were previously reported in human diseases. They follow the clearance of these reporters from cells over time relative to the matching non-variant sequence. To analyze their results, they define a set of factors (RBP and miRNA binding sites, sequence features, secondary structure etc.) and test their association with differences in mRNA stability. For features with a significant association, they use clustering to select a subset of factors for LASSO regression and identify factors that affect mRNA stability.

      They conclude that the TA dinucleotide content of UTRs is the strongest destabilizing sequence feature. Within that context, elevated GC content and protein binding can protect susceptible mRNAs from degradation. They also show that TA dinucleotide content of UTRs affects native mRNA stability, and that it is associated with specific functional groups. Finally, they link disease-associated sequence variants with differences in mRNA stability of reporters.

      Strengths:

      (1) This work introduces a different MPRA approach to analyze the effect of genetic variants. While previous works in tissue culture use DNA transfections that require normalization for transcription efficiency, here the mRNA is directly introduced into cells at fixed amounts, allowing a more direct view of the mRNA regulation.

      (2) The authors also introduce a unique analysis approach, which takes into account multiple factors that might affect mRNA stability. This approach allows them to identify general sequence features that affect mRNA stability beyond specific genetic variants, and reach important insights on mRNA stability regulation. Indeed, while the conclusions regarding the genetic variants identified in this work are interesting, the main strength of the work lies in the general effects of sequence features rather than in specific variants.

      (3) The authors provide adequate support for their claims, and validate their analysis using both their reporter data and native genes. For the main feature identified, TA dinucleotides, they perform follow-up experiments with modified reporters that further strengthen their claims, and also validate the effect on native cellular transcripts (beyond reporters), demonstrating its validity also within native scenarios.

      (4) The work provides a broad analysis of mRNA stability, across two mRNA regulatory segments (3'UTR and 5'UTR) and is performed in two separate cell-types. Comparison between two different cell-types is adequate, and the results demonstrate, as expected, the dependence of mRNA stability on the cellular context. Analysis of 3'UTR and 5'UTR regulatory effects also shows interesting differences and similarities between these two regulatory regions.

      Weaknesses:

      (1) The authors fail to acknowledge several possible confounding factors of their MPRA approach in the discussion.

      First, while transfecting mRNA directly into cells avoids the need to normalize for differences in transcription, naked mRNA molecules introduced into cells differ from native cellular mRNAs, and this could introduce biases due to differences in mRNA modifications, protein associations etc. that may occur co-transcriptionally.

      Second, along those lines, the authors also use in vitro polyadenylation. The length of the polyA tail of the transfected transcripts could potentially be very different from that of native mRNAs and could also affect stability.

      (2) The analysis approach used in this work for identifying regulatory features in UTRs was not previously used. As such, the lack of in-depth details of the methodology, and possibly also of more general validation of the approach, is a drawback in convincing the reader of the validity of this approach and its results.

      In particular, a main point that is not addressed is how the authors decided on the set of "factors" used in their analysis, as choosing different sets of factors might affect the results of the analysis. For example, the choice to use 7-mer sequences within the factor set is not explained, particularly when almost all motifs that are eventually identified (Figure 3B-E) are shorter.

      In addition, the authors do not perform validations to demonstrate the validity of their approach on simulated data or well-established control datasets. Such analysis would help to further convince the reader of the usefulness and robustness of the analysis.

      (3) The analysis and regression models built in this work are not thoroughly investigated relative to native genes within cells. The effect of sequence "factors" on native cellular transcripts' stability is not investigated beyond TA dinucleotides, and it is unclear to what degree other predicted factors also affect native transcripts.

    1. eLife assessment

      This fundamental study investigates the transcriptional changes in neurons that underlie loss of learning and memory with age in C. elegans, and how cognition is maintained in insulin/IGF-1-like signaling mutants. The presented evidence is compelling, utilizing a cutting-edge method to isolate neurons from worms for genomics that is clearly conveyed with a rigorous experimental approach. Overall, this study supports that older daf-2 worms maintain cognitive function via mechanisms that are distinct from those of younger wild-type worms, which will be of great interest to neuroscientists and researchers studying ageing.

    1. eLife assessment

      This important study reports a novel mechanism linking DHODH inhibition and subsequent pyrimidine nucleotide depletion with upregulation of cell surface MHC I in cancer cells. The in vitro mechanistic data are compelling, with rigorous methodology and validation across multiple cell lines. The authors also provide in vivo evidence for additive effects of DHODH inhibitors and immune checkpoint blockade. However, the in vivo assessments of the functional relevance of this mechanism remain incomplete, requiring additional analyses to fully substantiate the conclusions made.

    1. Reviewer #1 (Public Review):

      Summary:

      This study offers a new perspective: ACTL7A and ACTL7B play roles in epigenetic regulation during spermiogenesis. Actin-like 7 A (ACTL7A) is essential for acrosome formation, fertilization, and early embryo development. ACTL7A variants cause acrosome detachment responsible for male infertility and early embryonic arrest. It has been reported that ACTL7A is localized on the acrosome in mouse sperm (Boëda et al., 2011). Previous studies have identified ACTL7A mutations (c.1118G>A:p.R373H; c.1204G>A:p.G402S; c.1117C>T:p.R373C). All these variants were located in the actin domain and were predicted to be pathogenic, affecting the number of hydrogen bonds or the arrangement of nearby protein structures (Wang et al., 2023; Xin et al., 2020; Zhao et al., 2023; Zhou et al., 2023). This work used AI to model the role of ACTL7A/B in the nucleosome remodeling complex and proposed a testis-specific conformation of the SRCAP complex. This is different from previous studies.

      Strengths:

      This study provides a new perspective to reveal the additional roles of these proteins.

      Weaknesses:

      The results section contains a substantial background description. However, the results and discussion sections require streamlining. There is a lack of mutual support for data between the sections, and direct data to support the authors' conclusions are missing.

    2. eLife assessment

      This valuable study reports that actin-related proteins may be involved in transcriptional regulation during spermatogenesis. The supporting data remain incomplete, and more extensive disentanglement from the canonical role of these actin-related proteins and the experimental validation of in silico predictions are required. This work will be of interest to reproductive biologists and other researchers working on non-canonical roles of actin and actin-related proteins.

    3. Reviewer #2 (Public Review):

      Summary:

      How the dynamics of gene expression accompany cell fate and cellular morphological changes is important for our understanding of the molecular mechanisms that govern development and disease. The phenomenon is particularly prominent during spermatogenesis, the process by which spermatogonial stem cells develop into sperm through a series of steps of cell division, differentiation, meiosis, and cellular morphogenesis. The intricacy of various aspects of cellular processes and gene expression during spermatogenesis remains to be fully understood. In this study, the authors found that the testis-specific actin-related proteins ACTL7A and ACTL7B (which usually participate in modifying cells' cytoskeletal systems) were expressed and localized in the nuclei of mouse spermatocytes and spermatids. Based on this observation, the authors analyzed the protein sequence conservation of ACTL7B across dozens of species and identified a putative nuclear localization sequence (NLS), a type of motif that is often responsible for the nuclear import of the proteins that carry it. Using molecular biology experiments in a heterologous cell system, the authors verified the potential role of this internal NLS and found that it could indeed facilitate the nuclear localization of marker proteins when expressed in cells. Using gene-deleted mouse models they generated previously, the authors showed that deletion of Actl7b caused changes in gene expression and mis-localization of nucleosomal histone H3 and the chromatin regulators histone deacetylase HDAC1 and 2, supporting their proposed role of ACTL7B in regulating gene expression. The authors further used AlphaFold 2 to model the potential protein complexes that could be formed between the ARPs (ACTL7A and ACTL7B) and known chromatin modifiers, such as the INO80 and SWI/SNF complexes, and found that, consistent with previous findings, ACTL7A and ACTL7B are likely to interact cooperatively with the chromatin-modifying complexes by binding to their alpha-helical HSA domain.

      These results suggest that ACTL7B possesses novel functions in regulating chromatin structure and thus gene expression beyond conventional roles of cytoskeleton regulation, providing alternative pathways for understanding how gene expression is regulated during spermatogenesis and the etiology of relevant infertility diseases.

      Strengths:

      The authors provided sufficient background to the study and discussion of the results. Building on their previous research, this study utilized numerous methods, including the protein complex structural modeling method AlphaFold 2 Multimer, to further investigate the functional roles of ACTL7B. The results presented here are in general of good quality. The identification of a potential internal NLS in ACTL7B is mostly convincing, in line with the phenotypes presented in the gene deletion model.

      Weaknesses:

      While the study offered an interesting new look at the functions of ARP proteins during spermatogenesis, parts of the study are mainly theoretical speculation, including the protein complex formation. Some of the results may need further experimental verification, for example, the differentially expressed genes that were found in potentially spermatogenic cells at different developmental stages, in order to support the conclusions and avoid undermining the significance of the study.

    4. Reviewer #3 (Public Review):

      In this manuscript, Pierre Ferrer and colleagues explore the exciting possibility that, in the male germ line, the composition and function of deeply conserved chromatin remodeling complexes is fine-tuned by the addition of testis-specific actin-related proteins (ARPs). In this regard, the Authors aim to extend previously reported non-canonical (transcriptional) roles of ARPs in somatic cells to the unique developmental context of the germ line. The manuscript is focused on the potential regulatory role in post-meiotic transcription of two ARPs: ACTL7A and ACTL7B (particularly the latter). The canonical function of both testis-specific ARPs in spermatogenesis is well established, as they have been previously shown to be required for the extensive cellular morphogenesis program driving post-meiotic development (spermiogenesis). Disentangling the actual functions of ACTL7A and ACTL7B as transcriptional regulators from their canonical role in the profound morphological reshaping of post-meiotic cells (a process that also deeply impacts nuclear architecture and regulation) represents a key challenge in terms of interpreting the reported findings (see below).

      The authors begin by documenting, via fluorescence microscopy, the intranuclear localization of ACTL7B. This ARP is convincingly shown to accumulate in the nucleus of spermatocytes and spermatids. Using a series of elegant reporter-based experiments in a somatic cell line, the authors map the driver of this nuclear accumulation to a potential NLS sequence in the ACTL7B actin-like body domain. Ferrer and colleagues then performed a testicular RNA-seq analysis in ACTL7B KO mice to define the putative role of ACTL7B in male germ cell transcription. They report substantial changes to the testicular transcriptome - particularly the upregulation of several classes of genes - in ACTL7B KO mice. However, wild-type testes were used as controls for this experiment, thus introducing a clear confounding effect to the analysis (ACTL7B KO testes have extensive post-meiotic defects due to the canonical role of ACTL7B in spermatid development). Then, the authors employ cutting-edge AI-driven approaches to predict that both ACTL7A and ACTL7B are likely to bind to four key chromatin remodeling complexes. Although these predictions are based on a robust methodology, they would certainly benefit from experimental validation. Finally, the authors associate the loss of ACTL7B with decreased lysine acetylation and lower levels of the HDAC1 and HDAC3 chromatin remodelers in the nucleus of developing spermatids.

      Globally, these data may provide important insight into the unique processes male germ cells employ to sustain their extraordinarily complex transcriptional program. Furthermore, the concept that (comparably younger) testis-specific proteins can be incorporated into ancient chromatin remodeling complexes to modulate their function in the germ line is timely and exciting.

      It is my opinion that the manuscript would benefit from additional experimental validation to better support the authors' conclusions. In particular, I believe that addressing two critical points would substantially strengthen the message of the manuscript:

      (1) The proposed role of ACTL7B in post-meiotic transcriptional regulation temporally overlaps with the protein's previously reported canonical functions in spermiogenesis (PMID: 36617158 and 37800308). Indeed, the canonical functions of ACTL7B have been shown to have a profound effect at the level of spermatid morphology and to impact nuclear organization. This potentially renders the observed transcriptional deregulation in ACTL7B KO testes an indirect consequence of spermatid morphology defects. I acknowledge that it is experimentally difficult to disentangle the proposed intranuclear roles of ACTL7B from the protein's well-documented cytoplasmic function. Perhaps the generation of a NLS-scrambled ACTL7B variant could offer some insight. In light of the substantial investment this approach would represent, I would suggest, as an alternative, that instead of using wild-type testes as controls for the transcriptome and chromatin localization assays, the authors consider the possibility of using testicular tissue from a mutant with similarly abnormal spermiogenesis but due to transcription-independent defects. This would, in my opinion, offer a more suitable baseline to compare ACTL7B KO testes with.

      (2) The manuscript would greatly benefit from experimental validation of the AI-driven predictions (in terms of the binding capacity of ACTL7A and ACTL7B to key chromatin remodeling complexes), all the more so because the authors appear to have the technical expertise and mass spectrometry data required for this purpose (lines 664-665). Still on this topic, given the predicted interactions of ACTL7A and ACTL7B with the SRCAP, EP400, SMARCA2 and SMARCA4 complexes (Figure 7), it is rather counter-intuitive that the authors chose, for their immunofluorescence assays in ACTL7B KO testes, to determine the chromatin localization of HDAC1 and HDAC3 rather than that of any of the above four complexes.

    1. eLife assessment

      This valuable study presents the design of a new device to use high-density electrophysiological probes ("Neuropixels") in freely moving rodents. The evidence showing that the system is versatile and capable of recording high-quality extracellular data in both mice and rats is compelling. This study will be of interest to neuroscientists performing chronic electrophysiological recordings.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Bimbard et al. provide a new method to perform stable recordings over long periods of time with Neuropixels, as well as the technical details on how the electrodes can be explanted for follow-up reuse. I think the description of all parts of the method is very clear, and the validation analyses (n of units per day over time, RMS over recording days...) are very convincing. However, I missed a stronger emphasis on why this could have a big impact on the ephys community, by enabling new analyses, new behavior correlation studies, or the study of neurophysiological mechanisms across temporal scales that were previously inaccessible with high temporal resolution (i.e. not with imaging).

      Strengths:

      Open source method. Validation across laboratories. Across species (mice and rats) demonstration of its use and in different behavioral conditions (head-fixed and freely moving).

      Weaknesses:

      Weak emphasis on what can be enabled with this new method that didn't exist before.

    3. Reviewer #2 (Public Review):

      Summary:

      This work by Bimbard et al. introduces a new implant for Neuropixels probes. While Neuropixels probes have critically improved and extended our ability to record the activity of a large number of neurons with high temporal resolution, the use of these expensive devices in chronic experiments has so far been hampered by the difficulty of safely implanting them and, importantly, of explanting and reusing them after conclusion of the experiment. The authors present a newly designed two-part implant, consisting of a docking and a payload module, that allows for secure implantation and straightforward recovery of the probes. The implant is lightweight, making it amenable for use in mice and rats, and customizable. The authors provide schematics and files for printing of the implants, which can be easily modified and adapted to custom experiments by researchers with little to no design experience. Importantly, the authors demonstrate the successful use of this implant across multiple use cases, in head-fixed and freely moving experiments, in mice and rats, with different versions of Neuropixels probes, and across 8 different labs. Taken together, the presented implants promise to make chronic Neuropixels recordings and long-term studies of neuronal activity significantly easier and attainable for both current and future Neuropixels users.

      Strengths:

      - The implants have been successfully tested across 8 different laboratories, in mice and rats, in head-fixed and freely moving conditions, and have been adapted in multiple ways for a number of distinct experiments.

      - Implants are easily customizable and the authors provide a straightforward approach for customization across multiple design dimensions even for researchers not experienced in design.

      - The authors provide clear and straightforward descriptions of the construction, implantation, and explant of the described implants.

      - The split of the implant into a docking and payload module makes reuse even in different experiments (using different docking modules) easy.

      - The authors demonstrate that implants can be re-used multiple times and still allow for high-quality recordings.

      - The authors show that the chronic implantations allow for the tracking of individual neurons across days and weeks (using additional software tracking solutions), which is critical for a large number of experiments requiring the description of neuronal activity, e.g. throughout learning processes.

      - The authors show that implanted animals can even perform complex behavioral tasks, with no apparent reduction in their performance.

      Weaknesses:

      - While implanted animals can still perform complex behavioral tasks, the authors describe that the implants may reduce the animals' mobility, as measured by prolonged reaction times. However, the presented data does not allow us to judge whether this effect is specifically due to the presented implant or whether any implant or just tethering of the animals per se would have the same effects.

      - While the authors make certain comparisons to other, previously published approaches for chronic implantation and re-use of Neuropixels probes, it is hard to make conclusive comparisons and judge the advantages of the current implant. For example, while the authors emphasize that the lower weight of their implant allows them to perform recordings in mice (and is surely advantageous), the previously described, heavier implants they mention (Steinmetz et al., 2021; van Daal et al., 2021), have also been used in mice. Whether the weight difference makes a difference in practice therefore remains somewhat unclear.

      - The non-permanent integration of the headstages into the implant, while allowing for the use of the same headstage for multiple animals in parallel, requires repeated connections and does not provide strong protection for the implant. This may especially be an issue for the use in rats, requiring additional protective components as in the presented rat experiments.

    4. Reviewer #3 (Public Review):

      Summary:

      In this manuscript, Bimbard and colleagues describe a new implant apparatus called "Apollo Implant", which should facilitate recording in freely moving rodents (mice and rats) using Neuropixels probes. The authors collected data from both mice and rats and used 3 different versions of Neuropixels. Multiple labs have already adopted this method, which is impressive. They openly share their CAD designs and surgery protocol to further facilitate the adaptation of their method.

      Strengths:

      Overall, the "Apollo Implant" is easy to use and adapt, as it has been used in other laboratories successfully and custom modifications are already available. The device is reproducible using common 3D printing services and can be easily modified thanks to its CAD design (the video explaining this is extremely helpful). The weight and price are amazing compared to other systems for rigid silicon probes allowing a wide range of use of the "Apollo Implant".

      Weaknesses:

      The "Apollo Implant" can only handle Neuropixels probes. It cannot hold other widely used and commercially available silicon probes. Certain angles and distances are not possible in their current form (distance between probes 1.8 to 4mm, implantation depth 2-6.5 mm, or angle of insertion up to 20 degrees).

    5. Author response:

      Reviewer 1:

      Summary:

      In this manuscript, Bimbard et al. provide a new method to perform stable recordings over long periods of time with Neuropixels, as well as the technical details on how the electrodes can be explanted for follow-up reuse. I think the description of all parts of the method is very clear, and the validation analyses (n of units per day over time, RMS over recording days...) are very convincing. However, I missed a stronger emphasis on why this could have a big impact on the ephys community, by enabling new analyses, new behavior correlation studies, or the study of neurophysiological mechanisms across temporal scales that were previously inaccessible with high temporal resolution (i.e. not with imaging).

      Strengths:

      Open source method. Validation across laboratories. Across species (mice and rats) demonstration of its use and in different behavioral conditions (head-fixed and freely moving).

      Weaknesses:

      Weak emphasis on what can be enabled with this new method that didn't exist before.

      We thank the reviewer for highlighting the limited discussion around scientific impact. Our implant has several advantages which combine to make it much more accessible than previous solutions. This enables a variety of recording configurations that would not have been possible with previous designs, facilitating recordings from a wider range of brain regions, animals, and experimental setups. In short, there are three key advances:

      (1) Adaptability: The CAD files can be readily adapted to a wide range of configurations (implantation depth, angle, position of headstage, etc.). Labs have already modified the design to optimise it for their needs, and re-shared it with the community.

      (2) Weight: Because of the lightweight design, experimenters can i) perform complex and demanding freely moving tasks, as we exemplify in the manuscript, and ii) implant female and water-restricted mice while respecting animal welfare weight limitations.

      (3) Cost: At ~$10, our implant is significantly cheaper than published alternatives, which makes it affordable to more labs and means that testing modifications is cost-effective.

      We will make these features clearer in the manuscript.

      Reviewer 2:

      Summary:

      This work by Bimbard et al. introduces a new implant for Neuropixels probes. While Neuropixels probes have critically improved and extended our ability to record the activity of a large number of neurons with high temporal resolution, the use of these expensive devices in chronic experiments has so far been hampered by the difficulty of safely implanting them and, importantly, of explanting and reusing them after conclusion of the experiment. The authors present a newly designed two-part implant, consisting of a docking and a payload module, that allows for secure implantation and straightforward recovery of the probes. The implant is lightweight, making it amenable for use in mice and rats, and customizable. The authors provide schematics and files for printing of the implants, which can be easily modified and adapted to custom experiments by researchers with little to no design experience. Importantly, the authors demonstrate the successful use of this implant across multiple use cases, in head-fixed and freely moving experiments, in mice and rats, with different versions of Neuropixels probes, and across 8 different labs. Taken together, the presented implants promise to make chronic Neuropixels recordings and long-term studies of neuronal activity significantly easier and attainable for both current and future Neuropixels users.

      Strengths:

      - The implants have been successfully tested across 8 different laboratories, in mice and rats, in head-fixed and freely moving conditions, and have been adapted in multiple ways for a number of distinct experiments.

      - Implants are easily customizable and the authors provide a straightforward approach for customization across multiple design dimensions even for researchers not experienced in design.

      - The authors provide clear and straightforward descriptions of the construction, implantation, and explant of the described implants.

      - The split of the implant into a docking and payload module makes reuse even in different experiments (using different docking modules) easy.

      - The authors demonstrate that implants can be re-used multiple times and still allow for high-quality recordings.

      - The authors show that the chronic implantations allow for the tracking of individual neurons across days and weeks (using additional software tracking solutions), which is critical for a large number of experiments requiring the description of neuronal activity, e.g. throughout learning processes.

      - The authors show that implanted animals can even perform complex behavioral tasks, with no apparent reduction in their performance.

      Weaknesses:

      - While implanted animals can still perform complex behavioral tasks, the authors describe that the implants may reduce the animals' mobility, as measured by prolonged reaction times. However, the presented data does not allow us to judge whether this effect is specifically due to the presented implant or whether any implant or just tethering of the animals per se would have the same effects.

      The reviewer is correct: some of the differences in mouse reaction time could be due to the tether rather than the implant. As these experiments were also performed in water-restricted female mice with the heavier Neuropixels 1.0 implant, our data represent the maximal impact of the implant, and we will highlight this in the revision.

      - While the authors make certain comparisons to other, previously published approaches for chronic implantation and re-use of Neuropixels probes, it is hard to make conclusive comparisons and judge the advantages of the current implant. For example, while the authors emphasize that the lower weight of their implant allows them to perform recordings in mice (and is surely advantageous), the previously described, heavier implants they mention (Steinmetz et al., 2021; van Daal et al., 2021), have also been used in mice. Whether the weight difference makes a difference in practice therefore remains somewhat unclear.

      The reviewer is correct: without a direct comparison, we cannot be certain that our smaller, lighter implant improves behavioural results (although this is supported by the literature, e.g. Newman et al., 2023). However, the reduced weight of our implant is critical for several laboratories represented in this manuscript due to animal welfare requirements. Indeed, in van Daal et al. (2021) the authors “recommend a [mouse] weight of >25 g for implanting Neuropixels 1.0 probes.” This limit precludes using (the vast majority of) female mice, or water-restricted animals. Conversely, our implant can be routinely used with lighter, water-restricted male and female mice. We will emphasise this point in the revision.

      - The non-permanent integration of the headstages into the implant, while allowing for the use of the same headstage for multiple animals in parallel, requires repeated connections and does not provide strong protection for the implant. This may especially be an issue for the use in rats, requiring additional protective components as in the presented rat experiments.

      We apologise for not clarifying the various headstage options in the manuscript and we will address this in the revision. Our repository has headplate holder designs (in the XtraModifications/Mouse_FreelyMoving folder). These allow the headstage to be left on the implant, minimizing the number of connections (albeit increasing the weight for the mouse). Indeed, mice recorded while performing the task described in our manuscript had the headstage semi-permanently integrated into the implant, and we will highlight this in the revision.

      Reviewer 3:

      Summary:

      In this manuscript, Bimbard and colleagues describe a new implant apparatus called "Apollo Implant", which should facilitate recording in freely moving rodents (mice and rats) using Neuropixels probes. The authors collected data from both mice and rats and used 3 different versions of Neuropixels; multiple labs have already adopted this method, which is impressive. They openly share their CAD designs and surgery protocol to further facilitate the adoption of their method.

      Strengths:

      Overall, the "Apollo Implant" is easy to use and adapt, as it has been used in other laboratories successfully and custom modifications are already available. The device is reproducible using common 3D printing services and can be easily modified thanks to its CAD design (the video explaining this is extremely helpful). The weight and price are amazing compared to other systems for rigid silicon probes allowing a wide range of use of the "Apollo Implant".

      Weaknesses:

      The "Apollo Implant" can only handle Neuropixels probes. It cannot hold other widely used and commercially available silicon probes. Certain angles and distances are not possible in their current form (distance between probes 1.8 to 4mm, implantation depth 2-6.5 mm, or angle of insertion up to 20 degrees).

      We appreciate the reviewer’s points, but as we will discuss in the revised manuscript, one implant accommodating the diversity of the existing probes is beyond the scope of this project. However, because the design is adaptable, groups should be able to modify the current version of the implant to adapt to their electrodes’ size and format (and can highlight any issues in the Github “Discussions” area).

      With Neuropixels, the current range of depths covers practically all trajectories in the mouse brain. In rats, where deeper penetrations may be useful, the experimenter can attach the probe at a lower point in the payload module to increase the length of exposed shank. We now specify this in the Github repository.

      We have now extended the range of inter-probe distances from a maximum of 4 mm to 6.5 mm, and this will be reflected in the revised manuscript. Distances beyond this may be better served by 2 implants, and smaller distances could be achieved by attaching two probes on the same side of the docking module. In the next revision, we will add these points to the discussion.

    1. Author response:

      eLife assessment

      This study is a detailed investigation of how chromatin structure influences replication origin function in yeast ribosomal DNA, with focus on the role of the histone deacetylase Sir2 and the chromatin remodeler Fun30. Convincing evidence shows that Sir2 does not affect origin licensing but rather affects local transcription and nucleosome positioning which correlates with increased origin firing. However, the evidence remains incomplete as the methods employed do not rigorously establish a key aspect of the mechanism, fully address some alternative models, or sufficiently relate to prior results. Overall, this is a valuable advance for the field that could be improved to establish a more robust paradigm.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This paper presents a mechanistic study of rDNA origin regulation in yeast by SIR2. Each of the ~180 tandemly repeated rDNA gene copies contains a potential replication origin. Early-efficient initiation of these origins is suppressed by Sir2, reducing competition with origins distributed throughout the genome for rate-limiting initiation factors. Previous studies by these authors showed that SIR2 deletion advances replication timing of rDNA origins by a complex mechanism of transcriptional de-repression of a local PolII promoter causing licensed origin proteins (MCM complexes) to re-localize (slide along the DNA) to a different (and altered) chromatin environment. In this study, they identify a chromatin remodeler, FUN30, that suppresses the sir2∆ effect, and remarkably, results in a contraction of the rDNA to about one-quarter its normal length/number of repeats, implicating replication defects of the rDNA. Through examination of replication timing, MCM occupancy and nucleosome occupancy on the chromatin in sir2, fun30, and double mutants, they propose a model where nucleosome position relative to the licensed origin (MCM complexes) intrinsically determines origin timing/efficiency. While their interpretations of the data are largely reasonable and can be interpreted to support their model, a key weakness is the connection between Mcm ChEC signal disappearance and origin firing. While the cyclical chromatin association-dissociation of MCM proteins with potential origin sequences may be generally interpreted as licensing followed by firing, dissociation may also result from passive replication and as shown here, displacement by transcription and/or chromatin remodeling.

      While it is true that both transcription and passive replication can cause the signal of MCM-ChEC to disappear, neither can cause selective disappearance of the displaced complex without affecting the non-displaced complex.  Indeed, in the case of transcription, RNA polymerase transcribing C-pro would have to first dislodge the normally positioned MCM complex before even reaching the displaced complex.  Furthermore, deletion of FUN30 leads to both more C-pro transcription and less disappearance of the displaced MCM complex.  It is important to keep in mind that this cannot somehow reflect continuous replenishment of displaced MCMs with newly loaded MCMs, since the cells are in S phase and licensing is restricted to G1. 

      Moreover, linking its disappearance from chromatin in the ChEC method with such precise resolution needs to be validated against an independent method to determine the initiation site(s). Differences in rDNA copy number and relative transcription levels also are not directly accounted for, obscuring a clearer interpretation of the results.

      Copy number reduction of the magnitude caused by deletion of SIR2 and FUN30 does not suppress the sir2∆ effect (i.e. early replication of the rDNA), but rather exacerbates it.  In particular, deletion of SIR2 and FUN30 causes the rDNA to shrink to approximately 35 copies.  Kwan et al., 2023 (PMID: 36842087) have shown that reduction of rDNA copy number to 35 causes a dramatic acceleration of rDNA replication in a SIR2 strain.  Thus, the effect of rDNA size on replication timing reinforces our conclusion that deletion of FUN30 suppresses rDNA replication.

      However, to address this concern directly, in the revision we will include 2D gels in fob1 strains with equal numbers of repeats, which allow us to conclude that the effect of FUN30 deletion in suppressing rDNA origin firing is independent of both rDNA size and FOB1. The figure of the critical 2D gels is shown below in the reply to reviewer 2.

      Nevertheless, this paper makes a valuable advance with the finding of Fun30 involvement, which substantially reduces rDNA repeat number in sir2∆ background. The model they develop is compelling and I am inclined to agree, but I think the evidence on this specific point is purely correlative and a better method is needed to address the initiation site question. The authors deserve credit for their efforts to elucidate our obscure understanding of the intricacies of chromatin regulation. At a minimum, I suggest their conclusions on these points of concern should be softened and caveats discussed. Statistical analysis is lacking for some claims.

      Strengths are the identification of FUN30 as suppressor, examination of specific mutants of FUN30 to distinguish likely functional involvement. Use of multiple methods to analyze replication and protein occupancies on chromatin. Development of a coherent model.

      Weaknesses are failure to address copy number as a variable; insufficient validation of ChEC method relationship to exact initiation locus; lack of statistical analysis in some cases. 

      The two potential initiation sites that one would monitor (non-displaced and displaced) are separated by less than 150 base pairs, and other techniques simply do not have the resolution necessary to distinguish such differences.  Furthermore, as we suggest in the manuscript, our results are consistent with a model in which it is only the displaced MCM complex that is activated, whether in sir2 or WT.  If no genotype-dependent difference in initiation sites is even expected, it would be hard to interpret even the most precise replication-based assays.  However, the reviewer is correct that this is a novel technique and that confirmation with a well-established technique is comforting; therefore, we are performing ChIP experiments to corroborate, to the extent possible, the conclusions that we reached with ChEC. 

      We appreciate the reviewer pointing out that some statistical analyses were lacking, and we will correct this in a revised manuscript.

      Additional background and discussion for public review:

      This paper broadly addresses the mechanism(s) that regulate replication origin firing in different chromatin contexts. The rDNA origin is present in each of ~180 tandem repeats of the rDNA sequence, representing a high potential origin density per length of DNA (9.1kb repeat unit). However, the average origin efficiency of rDNA origins is relatively low (~20% in wild-type cells), which reduces the replication load on the overall genome by reducing competition with origins throughout the genome for limiting replication initiation factors. Deletion of histone deacetylase SIR2, which silences PolII transcription within the rDNA, results in increased early activation of the rDNA origins (and reduced rate of overall genome replication). Previous work by the authors showed that MCM complexes loaded onto the rDNA origins (origin licensing) were laterally displaced (sliding) along the rDNA, away from a well-positioned nucleosome on one side. The authors' major hypothesis throughout this work is that the new MCM location(s) are intrinsically more efficient configurations for origin firing. The authors identify a chromatin remodeling enzyme, FUN30, whose deletion appears to suppress the earlier activation of rDNA origins in sir2∆ cells. Indeed, it appears that the reduction of rDNA origin activity in sir2∆ fun30∆ cells is severe enough to result in a substantial reduction in the rDNA array repeat length (number of repeats); the reduced rDNA length presumably facilitates its more stable replication and maintenance.

      Analysis of replication by 2D gels is marginally convincing; using 2D gels for this purpose is very challenging and tricky to quantify. The more quantitative analysis by EdU incorporation is more convincing of the suppression of the earlier replication caused by SIR2 deletion.

      To address the mechanism of suppression, they analyze MCM positioning using ChEC, which in G1 cells shows partial displacement of MCM from normal position A to positions B and C in sir2∆ cells and similar but more complete displacement away from A to positions B and C in sir2fun30 cells. During S-phase in the presence of hydroxyurea, which slows replication progression considerably (and blocks later origin firing), MCM signals redistribute, which is interpreted to represent origin firing and bidirectional movement of MCMs (only one direction is shown), some of which accumulate near the replication fork barrier, consistent with their interpretation. They observe that MCMs displaced (in G1) to sites B or C in sir2∆ cells disappear more rapidly during S-phase, whereas a similar dynamic is not observed in sir2∆fun30∆. This is the main basis for their conclusion that the B and C sites are more permissive than A. While this may be the simplest interpretation, there are limitations with this assay that undermine a rigorous conclusion (additional points below). The main problem is that we know the MCM complexes are mobile so disappearance may reflect displacement by other means including transcription, which is high in the sir2∆ background. Indeed, the double mutant has a greater level of transcription per repeat unit, which might explain more displacement from A in G1. Thus, displacement might not always represent origin firing. Because the sir2 background profoundly changes transcription, and the double mutant has a much smaller array length associated with higher transcription, how can we rule out greater accessibility at site A, for example in sir2∆, leading to more firing, which is suppressed in sir2 fun30 due to greater MCM displacement away from A?

      I think the critical missing data to solidly support their conclusions is a definitive determination of the site(s) of initiation using a more direct method, such as strand-specific sequencing of EdU or nascent strand analysis. More direct comparisons of the strains with lower copy number are also needed to rule out this facet. As discussed in detail below, copy number reduction is known to suppress at least part of the sir2∆ effect so this looms over the interpretations. I think they are probably correct in their overall model based on the simplest interpretation of the data but I think it remains to be rigorously established. I think they should soften their conclusions in this respect.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors follow up on their previous work showing that in the absence of the Sir2 deacetylase the MCM replicative helicase at the rDNA spacer region is repositioned to a region of low nucleosome occupancy. Here they show that the repositioned displaced MCMs have increased firing propensity relative to non-displaced MCMs. In addition, they show that activation of the repositioned MCMs and low nucleosome occupancy in the adjacent region depend on the chromatin remodeling activity of Fun30.

      Strengths:

      The paper provides new information on the role of a conserved chromatin remodeling protein in the regulation of origin firing and in addition provides evidence that not all loaded MCMs fire and that origin firing is regulated at a step downstream of MCM loading.

      Weaknesses:

      The relationship between the authors' results and prior work on the role of Sir2 (and Fob1) in regulation of rDNA recombination and copy number maintenance is not explored, making it difficult to place the results in a broader context. Sir2 has previously been shown to be recruited by Fob1, which is also required for DSB formation and recombination-mediated changes in rDNA copy number. Are the changes that the authors observe specifically in fun30 sir2 cells related to this pathway? Is Fob1 required for the reduced rDNA copy number in fun30 sir2 double mutant cells? 

      Strains lacking SIR2 have unstable rDNA size, and FOB1 deletion stabilizes rDNA size in the sir2 background. Likewise, FOB1 deletion influences the kinetics of rDNA size reduction in sir2 fun30 cells. However, the main effect of Fun30 in sir2 cells that we were interested in, suppression of rDNA replication, is preserved in the fob1 background, arguing that the observed effect is independent of Fob1 (see figure below). Given that the main focus of the paper is regulation of rDNA origin activity and that these changes were independent of Fob1, we had elected not to include these results in the original manuscript but will gladly include them in the revision.

      Besides refuting the possible role of Fob1 in the FUN30-mediated activation of rDNA origin firing in sir2 cells, the use of the fob1 background enabled us to compare the activation of rDNA origins in the sir2 and sir2 fun30 strains with equally short rDNA size. The 2-D gels demonstrate a dramatic suppression of rDNA origin activity upon deletion of FUN30 in the sir2 fob1 strains with 35 rDNA copies.

      Author response image 1.

      The deletion of FUN30 diminishes the replication bubble signal in a fob1 sir2 strain with 35 rDNA copies by more than tenfold. The single rARS signal, marked with the arrow, originates from the rightmost rDNA repeat. This specific rightmost rDNA NheI fragment is approximately 25 kb in size, distinctly larger than the 4.7 kb NheI 1N rARS-containing fragments that originate from the internal rDNA repeats.

      Reviewer #3 (Public Review):

      Summary:

      Heterochromatin is characterized by low transcription activity and late replication timing, both dependent on the NAD-dependent protein deacetylase Sir2, the founding member of the sirtuins. This manuscript addresses the mechanism by which Sir2 delays replication timing at the rDNA in budding yeast. Previous work from the same laboratory (Foss et al. PLoS Genetics 15, e1008138) showed that Sir2 represses transcription-dependent displacement of the Mcm helicase in the rDNA. In this manuscript, the authors show convincingly that the repositioned Mcms fire earlier and that this early firing partly depends on the ATPase activity of the nucleosome remodeler Fun30. Using read-depth analysis of sorted G1/S cells, fun30 was the only chromatin remodeler mutant that somewhat delayed replication timing in sir2 mutants, while nhp10, chd1, isw1, htl1, swr1, isw2, and irc5 had no effect. The conclusion was corroborated with orthogonal assays including two-dimensional gel electrophoresis and analysis of EdU incorporation at early origins. Using an insightful analysis with an Mcm-MNase fusion (Mcm-ChEC), the authors show that the repositioned Mcms in sir2 mutants fire earlier than the Mcm at the normal position in wild type. This early firing at the repositioned Mcms is partially suppressed by Fun30. In addition, the authors show Fun30 affects nucleosome occupancy at the sites of the repositioned Mcm, providing a plausible mechanism for the effect of Fun30 on Mcm firing at that position. However, the results from the MNase-seq and ChEC-seq assays are not fully congruent for the fun30 single mutant. Overall, the results support the conclusions, providing a much better mechanistic understanding of how Sir2 affects replication timing at rDNA.

      The reason that the results for the fun30 single mutant appear incongruent, with a larger signal of the +2 nucleosome in the MNase-seq plot but a negligible signal in the ChEC-seq plot, is the paucity of displaced Mcm in the fun30 single mutant. Given the relative absence of displaced MCMs, the MCM-MNase fusion protein can't "light up" the +2 nucleosome.  We will clarify this in the revision. 

      Strengths

      (1) The data clearly show that the repositioned Mcm helicase fires earlier than the Mcm in the wild type position.

      (2) The study identifies a specific role for Fun30 in replication timing and an effect on nucleosome occupancy around the newly positioned Mcm helicase in sir2 cells.

      Weaknesses

      (1) It is unclear which strains were used in each experiment.

      (2) The relevance of the fun30 phospho-site mutant (S20AS28A) is unclear.

      (3) For some experiments (Figs. 3, 4, 6) it is unclear whether the data are reproducible and the differences significant. Information about the number of independent experiments and quantitation is lacking. This affects the interpretation, as fun30 seems to affect the +3 nucleosome much more than let on in the description.

      We appreciate the reviewer pointing out places in which our manuscript omitted key pieces of information (items 1 and 3), and we will fix these oversights in our revision. 

      With regard to point 2, we had written: 

      “Fun30 is also known to play a role in the DNA damage response; specifically, phosphorylation of Fun30 on S20 and S28 by CDK1 targets Fun30 to sites of DNA damage, where it promotes DNA resection (Chen et al. 2016; Bantele et al. 2017). To determine whether the replication phenotype that we observed might be a consequence of Fun30's role in the DNA damage response, we tested non-phosphorylatable mutants for the ability to suppress early replication of the rDNA in sir2; these mutations had no effect on the replication phenotype (Figure 2B), arguing against a primary role for Fun30 in DNA damage repair that somehow manifests itself in replication.”

      We will expand on this to clarify our point in the revision.

    2. eLife assessment

      This study is a detailed investigation of how chromatin structure influences replication origin function in yeast ribosomal DNA, with focus on the role of the histone deacetylase Sir2 and the chromatin remodeler Fun30. Convincing evidence shows that Sir2 does not affect origin licensing but rather affects local transcription and nucleosome positioning which correlates with increased origin firing. However, the evidence remains incomplete as the methods employed do not rigorously establish a key aspect of the mechanism, fully address some alternative models, or sufficiently relate to prior results. Overall, this is a valuable advance for the field that could be improved to establish a more robust paradigm.

    3. Reviewer #1 (Public Review):

      Summary:

      This paper presents a mechanistic study of rDNA origin regulation in yeast by SIR2. Each of the ~180 tandemly repeated rDNA gene copies contains a potential replication origin. Early-efficient initiation of these origins is suppressed by Sir2, reducing competition with origins distributed throughout the genome for rate-limiting initiation factors. Previous studies by these authors showed that SIR2 deletion advances replication timing of rDNA origins by a complex mechanism of transcriptional de-repression of a local PolII promoter causing licensed origin proteins (MCM complexes) to re-localize (slide along the DNA) to a different (and altered) chromatin environment. In this study, they identify a chromatin remodeler, FUN30, that suppresses the sir2∆ effect, and remarkably, results in a contraction of the rDNA to about one-quarter its normal length/number of repeats, implicating replication defects of the rDNA. Through examination of replication timing, MCM occupancy and nucleosome occupancy on the chromatin in sir2, fun30, and double mutants, they propose a model where nucleosome position relative to the licensed origin (MCM complexes) intrinsically determines origin timing/efficiency. While their interpretations of the data are largely reasonable and can be interpreted to support their model, a key weakness is the connection between Mcm ChEC signal disappearance and origin firing. While the cyclical chromatin association-dissociation of MCM proteins with potential origin sequences may be generally interpreted as licensing followed by firing, dissociation may also result from passive replication and as shown here, displacement by transcription and/or chromatin remodeling. Moreover, linking its disappearance from chromatin in the ChEC method with such precise resolution needs to be validated against an independent method to determine the initiation site(s). Differences in rDNA copy number and relative transcription levels also are not directly accounted for, obscuring a clearer interpretation of the results. Nevertheless, this paper makes a valuable advance with the finding of Fun30 involvement, which substantially reduces rDNA repeat number in sir2∆ background. The model they develop is compelling and I am inclined to agree, but I think the evidence on this specific point is purely correlative and a better method is needed to address the initiation site question. The authors deserve credit for their efforts to elucidate our obscure understanding of the intricacies of chromatin regulation. At a minimum, I suggest their conclusions on these points of concern should be softened and caveats discussed. Statistical analysis is lacking for some claims.

      Strengths are the identification of FUN30 as suppressor, examination of specific mutants of FUN30 to distinguish likely functional involvement. Use of multiple methods to analyze replication and protein occupancies on chromatin. Development of a coherent model.

      Weaknesses are failure to address copy number as a variable; insufficient validation of ChEC method relationship to exact initiation locus; lack of statistical analysis in some cases.

      Additional background and discussion for public review:

      This paper broadly addresses the mechanism(s) that regulate replication origin firing in different chromatin contexts. The rDNA origin is present in each of ~180 tandem repeats of the rDNA sequence, representing a high potential origin density per length of DNA (9.1kb repeat unit). However, the average origin efficiency of rDNA origins is relatively low (~20% in wild-type cells), which reduces the replication load on the overall genome by reducing competition with origins throughout the genome for limiting replication initiation factors. Deletion of histone deacetylase SIR2, which silences PolII transcription within the rDNA, results in increased early activation of the rDNA origins (and reduced rate of overall genome replication). Previous work by the authors showed that MCM complexes loaded onto the rDNA origins (origin licensing) were laterally displaced (sliding) along the rDNA, away from a well-positioned nucleosome on one side. The authors' major hypothesis throughout this work is that the new MCM location(s) are intrinsically more efficient configurations for origin firing. The authors identify a chromatin remodeling enzyme, FUN30, whose deletion appears to suppress the earlier activation of rDNA origins in sir2∆ cells. Indeed, it appears that the reduction of rDNA origin activity in sir2∆ fun30∆ cells is severe enough to result in a substantial reduction in the rDNA array repeat length (number of repeats); the reduced rDNA length presumably facilitates its more stable replication and maintenance.

      Analysis of replication by 2D gels is marginally convincing; using 2D gels for this purpose is very challenging and tricky to quantify. The more quantitative analysis by EdU incorporation is more convincing of the suppression of the earlier replication caused by SIR2 deletion.

      To address the mechanism of suppression, they analyze MCM positioning using ChEC, which in G1 cells shows partial displacement of MCM from normal position A to positions B and C in sir2∆ cells and similar but more complete displacement away from A to positions B and C in sir2fun30 cells. During S-phase in the presence of hydroxyurea, which slows replication progression considerably (and blocks later origin firing), MCM signals redistribute, which is interpreted to represent origin firing and bidirectional movement of MCMs (only one direction is shown), some of which accumulate near the replication fork barrier, consistent with their interpretation. They observe that MCMs displaced (in G1) to sites B or C in sir2∆ cells disappear more rapidly during S-phase, whereas a similar dynamic is not observed in sir2∆fun30∆. This is the main basis for their conclusion that the B and C sites are more permissive than A. While this may be the simplest interpretation, there are limitations with this assay that undermine a rigorous conclusion (additional points below). The main problem is that we know the MCM complexes are mobile so disappearance may reflect displacement by other means including transcription, which is high in the sir2∆ background. Indeed, the double mutant has a greater level of transcription per repeat unit, which might explain more displacement from A in G1. Thus, displacement might not always represent origin firing. Because the sir2 background profoundly changes transcription, and the double mutant has a much smaller array length associated with higher transcription, how can we rule out greater accessibility at site A, for example in sir2∆, leading to more firing, which is suppressed in sir2 fun30 due to greater MCM displacement away from A?

      I think the critical missing data to solidly support their conclusions is a definitive determination of the site(s) of initiation using a more direct method, such as strand-specific sequencing of EdU or nascent strand analysis. More direct comparisons of the strains with lower copy number are also needed to rule out this facet. As discussed in detail below, copy number reduction is known to suppress at least part of the sir2∆ effect so this looms over the interpretations. I think they are probably correct in their overall model based on the simplest interpretation of the data but I think it remains to be rigorously established. I think they should soften their conclusions in this respect.

    4. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors follow up on their previous work showing that in the absence of the Sir2 deacetylase the MCM replicative helicase at the rDNA spacer region is repositioned to a region of low nucleosome occupancy. Here they show that the repositioned displaced MCMs have increased firing propensity relative to non-displaced MCMs. In addition, they show that activation of the repositioned MCMs and low nucleosome occupancy in the adjacent region depend on the chromatin remodeling activity of Fun30.

      Strengths:

      The paper provides new information on the role of a conserved chromatin remodeling protein in the regulation of origin firing and in addition provides evidence that not all loaded MCMs fire and that origin firing is regulated at a step downstream of MCM loading.

      Weaknesses:

      The relationship between the authors' results and prior work on the role of Sir2 (and Fob1) in regulation of rDNA recombination and copy number maintenance is not explored, making it difficult to place the results in a broader context. Sir2 has previously been shown to be recruited by Fob1, which is also required for DSB formation and recombination-mediated changes in rDNA copy number. Are the changes that the authors observe specifically in fun30 sir2 cells related to this pathway? Is Fob1 required for the reduced rDNA copy number in fun30 sir2 double mutant cells?

    5. Reviewer #3 (Public Review):

      Summary:

      Heterochromatin is characterized by low transcription activity and late replication timing, both dependent on the NAD-dependent protein deacetylase Sir2, the founding member of the sirtuins. This manuscript addresses the mechanism by which Sir2 delays replication timing at the rDNA in budding yeast. Previous work from the same laboratory (Foss et al. PLoS Genetics 15, e1008138) showed that Sir2 represses transcription-dependent displacement of the Mcm helicase in the rDNA. In this manuscript, the authors show convincingly that the repositioned Mcms fire earlier and that this early firing partly depends on the ATPase activity of the nucleosome remodeler Fun30. Using read-depth analysis of sorted G1/S cells, fun30 was the only chromatin remodeler mutant that somewhat delayed replication timing in sir2 mutants, while nhp10, chd1, isw1, htl1, swr1, isw2, and irc5 had no effect. The conclusion was corroborated with orthogonal assays including two-dimensional gel electrophoresis and analysis of EdU incorporation at early origins. Using an insightful analysis with an Mcm-MNase fusion (Mcm-ChEC), the authors show that the repositioned Mcms in sir2 mutants fire earlier than the Mcm at the normal position in wild type. This early firing at the repositioned Mcms is partially suppressed by Fun30. In addition, the authors show Fun30 affects nucleosome occupancy at the sites of the repositioned Mcm, providing a plausible mechanism for the effect of Fun30 on Mcm firing at that position. However, the results from the MNase-seq and ChEC-seq assays are not fully congruent for the fun30 single mutant. Overall, the results support the conclusions, providing a much better mechanistic understanding of how Sir2 affects replication timing at rDNA.

      Strengths

      (1) The data clearly show that the repositioned Mcm helicase fires earlier than the Mcm in the wild type position.
      (2) The study identifies a specific role for Fun30 in replication timing and an effect on nucleosome occupancy around the newly positioned Mcm helicase in sir2 cells.

      Weaknesses

      (1) It is unclear which strains were used in each experiment.
      (2) The relevance of the fun30 phospho-site mutant (S20AS28A) is unclear.
      (3) For some experiments (Figs. 3, 4, 6) it is unclear whether the data are reproducible and the differences significant. Information about the number of independent experiments and quantitation is lacking. This affects the interpretation, as fun30 seems to affect the +3 nucleosome much more than let on in the description.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The authors report that optogenetic inhibition of hippocampal axon terminals in retrosplenial cortex impairs the performance of a delayed non-match to place task. The significance of findings elucidating the role of hippocampal projections to the retrosplenial cortex in memory and decision-making behaviors is important. However, the strength of evidence for the paper's claims is currently incomplete.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This is a study on the role of the retrosplenial cortex (RSC) and the hippocampus in working memory. Working memory is a critical cognitive function that allows temporary retention of information for task execution. The RSC, which is functionally and anatomically connected to both primary sensory (especially visual) and higher cognitive areas, plays a key role in integrating spatial-temporal context and in goal-directed behaviors. However, the specific contributions of the RSC and the hippocampus in working memory-guided behaviors are not fully understood due to a lack of studies that experimentally disrupt the connection between these two regions during such behaviors.

      In this study, researchers employed eArch3.0 to silence hippocampal axon terminals in the RSC, aiming to explore the roles of these brain regions in working memory. Experiments were conducted where animals with silenced hippocampal axon terminals in the RSC performed a delayed non-match to place (DNMP) task. The results indicated that this manipulation impaired memory retrieval, leading to decreased performance and quicker decision-making in the animals. Notably, the authors observed that the effects of this impairment persisted beyond the light-activation period of the opsin, affecting up to three subsequent trials. They suggest that disrupting the hippocampal-RSC connection has a significant and lasting impact on working memory performance.

      Strengths:

      They conducted a study exploring the impact of direct hippocampal inputs into the RSC, a region involved in encoding spatial-temporal context and transferring contextual information, on spatial working memory tasks. Utilizing eArch3.0 expressed in hippocampal neurons via the viral vector AAV5-hSyn1-eArch3.0, they aimed to bilaterally silence hippocampal terminals located at the RSC in rats pre-trained in a DNMP task. They discovered that silencing hippocampal terminals in the RSC significantly decreased working memory performance in eArch+ animals, especially during task interleaving sessions (TI) that alternated between trials with and without light delivery. This effect persisted even in non-illuminated trials, indicating a lasting impact beyond the periods of direct manipulation. Additionally, they observed a decreased likelihood of correct responses following TI trials and an increased error rate in eArch+ animals, even after incorrect responses, suggesting an impairment in error-corrective behavior. This contrasted with baseline sessions where no light was delivered, and both eArch+ and control animals showed low error rates.

      Weaknesses:

      While I agree with the authors that the role of hippocampal inputs to the RSC in spatial working memory is understudied and merits further investigation, I find that the optogenetic experiment, a core part of this manuscript that includes viral injections, could be improved. The effects were rather subtle, rendering some of the results barely significant and possibly too weak to support major conclusions.

      We thank Reviewer#1 for carefully and critically reading our manuscript, and for the valuable comments provided. The judged “subtlety” of the effects stems from a perspective according to which a quantitatively lower effect bears less biological significance for cognition. We disagree with this perspective and find it rather reductive for several reasons.

      Once seen in the context of the animal’s ecology, subtle impairments can be life-threatening precisely because of their subtlety, leading the animal to confidently rely on a defective capacity, for such events as remembering the habitual location of a predator, or food source.

      Also, studies in animal cognition often undertake complete, rather than graded, suppression of a given mechanism (in the same sense as that of “knocking out” a gene that is relevant for behaviour), leading to a gravely, rather than gradually, impaired model system, to the point of not allowing a hypothetical causal link to be mechanistically revealed beyond its mere presence. This often hinders a thorough interpretation of the perturbed factor’s role. If a caricatural analogy is allowed, it would be as if we were to study the role of an animal’s legs by chopping them both off and observing the resulting behaviour.

      In our study we conclude that silencing HIPP inputs in RSC perturbs cognition enough to impair behaviour while not disabling the animal entirely, as such allowing for behaviour to proceed, and for our observation of graded, decreased (not absent), proficiency under optogenetic silencing. So rather than weak, we would say the results are statistically significant, and biologically realistic.

      Additionally, no mechanistic investigation was conducted beyond referencing previous reports to interpret the core behavioral phenotypes.

      We fully agree with this being a weakness, as we wish we could have done more mechanistic studies to find out exactly what Arch activation is doing to HIPP-RSC transmission, which neurons are being affected, and perhaps in the future dissect its circuit determinants. We have all these goals very present and hope we can address them soon.

      Reviewer #2 (Public Review):

      The authors examine the impact of optogenetic inhibition of hippocampal axon terminals in the retrosplenial cortex (RSP) during the performance of a working memory T-maze task. Performance on a delayed non-match-to-place task was impaired by such inhibition. The authors also report that inhibition is associated with faster decision-making and that the effects of inhibition can be observed over several subsequent trials. The work seems reasonably well done and the role of hippocampal projections to retrosplenial cortex in memory and decision-making is very relevant to multiple fields. However, the work should be expanded in several ways before one can make firm conclusions on the role of this projection in memory and behavior.

      We thank Reviewer#2 for carefully and critically reading our manuscript, and for the valuable comments provided.

      (1) The work is very singular in its message and the experimentation. Further, the impact of the inhibition on behaviour is very moderate. In this sense, the results do not support the conclusion that the hippocampal projection to retrosplenial cortex is key to working memory in a navigational setting.

      As we have mentioned in response to Reviewer#1, the judged “very moderate” effect stems from a perspective according to which a quantitatively lower effect bears less biological significance for cognition, precluding its consideration as “key” for behaviour. We disagree with this perspective and find it rather reductive for several reasons. Once seen in the context of the animal’s ecology, quantitatively lower impairments in working memory are no less key for this cognitive capacity, and can be life-threatening precisely because of their subtlety, leading the animal to confidently rely on a defective capacity, for such events as remembering the habitual location of a predator, or food source. Furthermore, studies in animal cognition often undertake complete, rather than graded, suppression of a given mechanism (in the same sense as “knocking out” a gene that is relevant for behaviour), leading to a gravely, rather than gradually, impaired model system, to the point of not allowing a hypothetical causal link to be mechanistically revealed beyond its mere presence. This often hinders a thorough interpretation of its role.

      In our study we conclude that silencing HIPP inputs in RSC perturbs cognition enough to impair behaviour while not disabling the animal entirely, as such allowing for behaviour to proceed, and for our observation of graded, decreased (not absent), proficiency under optogenetic silencing. So rather than weak, we would say the results are statistically significant, and biologically realistic.

      (2) There are no experiments examining other types of behavior or working memory. Given that the animals used in the studies could be put through a large number of different tasks, this is surprising. There is no control navigational task. There is no working memory test that is non-spatial. Such results should be presented in order to put the main finding in context.

      It is hard to gainsay this point. The more thorough and complete a behavioural characterization is, the more informative is the study, from every angle you look at it. While we agree that other forms of WM would be quite interesting in this context, we also cannot ignore the fact that DNMP is widely tested as a WM task, one that is biologically plausible, sensitive to perturbations of neural circuitry known to be at play therein, and fully accepted in the field. Faced with the impossibility of running further studies, for lack of additional funding and human resources, we chose to run this task.

      A control navigational task would, in our understanding, be used to assess whether silencing HIPP projections to RSC would affect (spatial?) navigation, rather than WM, thus explaining the observed impairment. To this we have the following to say: Spatial Navigation is a very basic cognitive function, one that relies on body orientation relative to spatial context, on keeping an updated representation of such spatial context, (“alas”, as memory), and on guiding behaviour according to acquired knowledge about spatial context. Some of these functions are integral to spatial working memory, as such, they might indeed be affected.

      Dissecting the determinants of spatial WM is indeed an ongoing effort, one that was not the intention of the current study, but also one that we have very present, in hope we can address in the future.

      A non-spatial WM task would indeed vastly solidify our claims beyond spatial WM, onto WM. We have, for this reason, changed the title of the manuscript which now reads “spatial working memory”.

      (3) The actual impact of the inhibition on activity in RSP is not provided. While this may not be strictly necessary, it is relevant that the hippocampal projection to RSP includes, and is perhaps dominated by inhibitory inputs. I wonder why the authors chose to manipulate hippocampal inputs to RSP when the subiculum stands as a much stronger source of afferents to RSP and has been shown to exhibit spatial and directional tuning of activity. The points here are that we cannot be sure what the manipulation is really accomplishing in terms of inhibiting RSP activity (perhaps this explains the moderate impact on behavior) and that the effect of inhibiting hippocampal inputs is not an effective means by which to study how RSP is responsive to inputs that reflect environmental locations.

      We fully agree that neural recordings addressing the effect of silencing on RSC neural activity are relevant. We do wish we could have provided more mechanistic studies, to find out exactly what Arch activation is doing to HIPP-RSC transmission, which neurons are being affected, and thus to dissect its circuit determinants. We have all these goals very present and hope we can address them soon. Subiculum, which we mention in the Introduction, is indeed a key player in this complex circuitry, one whose hypothetical influence is the subject of experimental studies which will certainly reveal many other key elements.

      (4) The impact of inhibition on trials subsequent to the trial during which optical stimulation was actually supplied seems trivial. The authors themselves point to evidence that activation of the hyperpolarizing proton pump is rather long-lasting in its action. Further, each sample-test trial pairing is independent of the prior or subsequent trials. This finding is presented as a major finding of the work, but would normally be relegated to supplemental data as an expected outcome given the dynamics of the pump when activated.

      We disagree that this finding is “trivial”, and object to the considerations of “normalcy”, which we are left wondering about.

      In the absence of neurophysiological experiments (for the reasons stated above) to address this interesting finding, we chose to interpret it in light of (the few) published observations, such being the logical course of action in scientific reporting, given the present circumstances.

      Evidence for such a prolonged effect in the context of behaviour is scarce (to our knowledge only the one we cite in the manuscript). As such, it is highly relevant to report it, and give it the relevance we do in our manuscript, rather than “relegating it to supplementary data”, as the reviewer considers being “normal”.

      In the DNMP task the consecutive sample-test pairs are explicitly not independent, as they are part of the same behavioural session. This is illustrated by the simple phenomenon of learning, namely the intra-session learning curves, and the well-known behavioral trial-history effects. The brain does not simply erase such information during the ITI.

      (5) In the middle of the first paragraph of the discussion, the authors make reference to work showing RSP responses to "contextual information in egocentric and allocentric reference frames". The citations here are clearly deficient. How is the Nitzan 2020 paper at all relevant here?

      Nitzan 2020 reports the propagation of information from HIPP to CTX via SUB and RSC, thus providing a conduit for mnemonic information between the two structures, alternative to the one we target, thus providing thorough information concerning the HIPP-RSC circuitry at play during behaviour.

      Alexander and Nitz 2015 precisely cite the encoding, and conjunction, of two types of contextual information, internal (ego-) and external (allocentric).

      The subsequent reference is indeed superfluous here.

      We thank Reviewer#2 for calling our attention to the fact that references for this information are inadequate and lacking. We have now cited (Gill et al., 2011; Miller et al., 2019; Vedder et al., 2017) and refer readers to the review (Alexander et al., 2023) for the purpose of illustrating the encoding of information in the two reference frames. In addition, we have substantially edited the Introduction and Discussion sections, and suppressed unnecessary passages.

      (6) The manuscript is deficient in referencing and discussing data from the Smith laboratory that is similar. The discussion reads mainly like a repeat of the results section.

      Please see above. We thank Reviewer#2 for this comment, we have now re-written the Discussion such that it is less of a summary of the Results and more focused on their implications and future directions.

      Response to recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major

      Line 101: Even with the tapered lambda fibre optic stub, if the fibre optics were longitudinally staggered by 2 millimetres, they would deliver light to diagonal regions in the horizontal plane rather than covering the full length of the RSC. Is this staggering pattern randomized or fixed? Additionally, Figure 1C is a bit misleading, as the light distribution pattern from the tapered fibre optic is likely to be more concentrated near the surface of the fibre, rather than spreading widely in a large spherical pattern.

      The staggering is fixed. The elliptical (not spherical) contour in Fig 1C is not meant to convey any quantitative information, but rather to visually orient the reader towards the directions into which light will likely propagate, the effects of which we do not attempt to estimate here. We have made this contour smaller.

      Line 119: The authors demonstrate the viral expression pattern of a representative animal and the overall expression patterns of all other animals in Figure 1 and the Supplementary Figures. However, numerous cases in the Supplementary Figures exhibit viral leakages and strong expressions in adjacent cortical and thalamic areas. Although there is a magnified view of the RSC's expression pattern in Figure 1, authors should show the same way in the supplemental data as well. Additionally, the degree of viral expression in the hippocampal subregions varies substantially across animals. This variation is concerning and impacts the interpretation of the results.

      The viral construct was injected in the HIPP at coordinates based on our previous work (Ferreira-Fernandes et al., 2019) wherein injections of a similar vector in mid-dorsal HIPP resulted in widespread expression throughout the medial mesocortex AP extent, RSC through CG, as well as other areas in which HIPP establishes synapses. These were studied in detail then, by estimating the density of axon terminals. In the present work we did not acquire high-mag images of all slices, since they were too expensive, and we had this information from the study above. Still, we have now added further examples of high-mag images taken from eArch and CTRL animals.

      We believe it is important here to mention the fact that the virus we use, AAV5, only travels anterograde and is static (i.e. it does not travel transsynaptically).

      Variations in viral expression are to be expected even if injections happen in the exact same way. It is crucial then, that fibre positioning is constant across animals, to guarantee that its relationship with viral expression is thence consistent, and to render irrelevant whatever off-target expression of the viral construct. We have ascertained this condition post-mortem in all our animals.

      Line 124: Another point regarding the viral expressions and optical fibre implants used to inhibit the HIPP-RSC pathway is that the RSC and HIPP extend substantially along the anterior-posterior axis. The authors should demonstrate how the viral expression is distributed along this axis and indicate where the tip of the tapered optical fibre ended by marking it in the histological images. This information is crucial to confirm the authors' claim that the hippocampal projection terminals were indeed modulated by optical light. Also, the manuscript would benefit from details about the power/duration and/or modulation of the light used.

      In both Figures 1 and S1 panels we can clearly see the tracks formed by the fibres. This provides examples of such dual angle placement vis a vis the expression of the construct, demonstrating that the former is fully targeted towards the latter. We have added markers to highlight these tracks and an example of a “full” track in figure S1. We did not have animals deviating from this relative positioning to any significant extent. The methods section mentions illumination power as 240mA, and we have now added estimated illumination time as well.

      Line 141: The authors should include data on task performance during learning and baseline sessions for each animal, to demonstrate that they fully grasped the task rules and that achieving a 75% performance ratio was sufficient.

      DNMP is a standard WM task used for many decades, in which animals reach performances above 75% in 4-8 sessions. We have used it extensively, and never saw any deviations from this learning rate and curve. We ran daily sessions until animals reached 75%, and thereafter until they maintained this performance, or above, for three consecutive sessions (the data points we show). We saw no deviations from what is published, nor from what is our own extensive experience, and thence are fully confident that all animals included in this manuscript grasped task rules.

      Line 146: While the study focused on inhibiting inputs during the test run (retrieval phase), it would be beneficial to also inhibit inputs during the sample run (encoding phase) and the delay period. This would help confirm whether the silencing affects only working memory retrieval, or if it also impacts encoding and maintenance.

      We agree, it would be very interesting to determine if there are any effects of silencing HIPP RSC terminals during Sample. However, since there is a limit to the number of trials per session, and to the total number of sessions, we could not run the three manipulations within each session of our experimental design, as that would lower the number of trials per condition to an extent that would affect statistical power. Silencing HIPP RSC terminals during Sample would best be a separate experiment, asking a different question, and perhaps within an experimental design distinct from the one envisioned.

      A very important point here relates to the fact that the effects of optogenetic manipulation do not limit themselves to the illumination epoch, in fact they extend far beyond onto the 3rd trial post-illumination. The insertion of Sample-illuminated trials interleaved in the same session would fundamentally affect the interpretation of experimental results, as we could not attribute lower performances to the effects in either or both manipulated epochs.

      Line 225: Figure 5 illustrates that silencing the inputs results in an extended impairment of working memory performance. However, it's unclear if there are any behavioural changes during the sample run. The inhibition could potentially affect encoding in the subsequent sample run, considering the inter-trial interval (ITI) is only 20 seconds.

      From the observation of behaviour and the analysis of our data, we saw no overt “behavioural changes during the sample run”, as latencies and speeds were essentially unchanged.

      If what is meant by your comment is the effect of optogenetic manipulation being protracted from the Test towards the Sample epoch, we find this unlikely. Conservatively, we estimate the peak of our optogenetic manipulation to occur around the time light is delivered, the Test phase, rather than 20-30 secs later.

      In theory, any effect of optogenetic silencing of HIPP terminals in RSC can cause disturbances in encoding or Sample, the ITI itself, and the epoch in which mnemonic information retrieved from the Sample epoch is confronted with the contextual information present during Test, leading to a decision. This is regardless of the illumination epoch, and even if the effect of optogenetic manipulation is not prolonged in time. 

      Since in our experiments we specifically target the Test epoch, and there is, in all likelihood, a decaying magnitude of neurophysiological effects, manifest in the reported decaying nature of the manipulation mechanism, and in our observed decrease of behavioural proficiency from subsequent trials 1:4, we are convinced that a conservative interpretation is that our major effect is concentrated in the epoch in which we deliver light - the Test epoch, the consequences of which (possibly related to short term plasticity events taking place within the HIPP-RSC neural circuit) extending further in time.

      Line 410: The methods section on the surgical procedure could be clearer, particularly regarding the coordinates for microinjection and fibre implantation. A more precise description would aid reader comprehension.

      The now-reported injection and implantation coordinates include the numbers corresponding to the distances, in mm, from Bregma to the targets, in the three stereotaxic dimensions considered: antero-posterior, medial-lateral left and right, and dorso-ventral, as well as the angle at which the fibres were positioned. We have added labels to the figures to highlight the fibreoptic track locations. We will be happy to provide further details as deemed necessary.

      Line 461: It would be helpful to know if each animal displayed a preference for the left or right side. Including a description or figure showing that the performance ratio exceeded 75% in both left and right trials would provide a more comprehensive understanding of the animals' behaviour.

      In the DNMP, an extensively used and documented WM task, it is an absolute pre-condition that no animals are biased to either side. As such, we did not use any animal that showed such bias. We have not observed this to be the case in any of our candidate animals, nor would we use any animal exhibiting such a preference.

      Minor

      Line 25: In the INTRODUCTION section, the authors introduce ego-centric and allocentric variables in the RSC. However, if they intend to discuss this feature, there is no supporting data for ego-centric or allocentric variables in the Results section.

      We agree. The extent of the discussion of ego vs allo-centric variables in our manuscript might venture a bit out of the main subject. It was included to provide wider context to our reporting of the data, considering that spatial working memory is indeed one instance in which egocentric- and allocentric-referenced cognitive mechanisms confront each other, and one in which silencing the HIPP input to a cortical region thence involved would likely disturb ensuing computations. We have now substantially edited the manuscript’s Introduction and Discussion sections, namely by toning down this aspect.

      Line 125: In the section title, DNMT -> DNMP obviously.

      We have corrected this passage.

      Figures: The quality of the figure panels does not meet the expected standards. For example, scale bars are missing in many panels (e.g., Figure 1A bottom, 1B, 1C, S1), figure labels are misaligned (as seen in Figure 3A-B compared to 3C, same with Figure 5), and there is inconsistency in color schemes (e.g., Figure 3C versus Figure 6, where 'Error' versus 'Correct' is depicted using green versus blue, respectively).

      We have now corrected these inconsistencies and mistakes.

    2. eLife assessment

      The authors report that optogenetic inhibition of hippocampal axon terminals in retrosplenial cortex impairs the performance of a delayed non-match to place task. Elucidating the role of hippocampal projections to the retrosplenial cortex in memory and decision-making behaviors is important. However, the strength of evidence for the paper's claims is incomplete.

    3. Reviewer #2 (Public Review):

      The authors examine the impact of optogenetic inhibition of hippocampal axon terminals in the retrosplenial cortex (RSP) during the performance of a working memory T-maze task. Performance on a delayed non-match-to-place task was impaired by such inhibition. The authors also report that inhibition is associated with faster decision-making and that the effects of inhibition can be observed over several subsequent trials. The work seems reasonably well done and the role of hippocampal projections to retrosplenial cortex in memory and decision-making is very relevant to multiple fields. However, the work should be expanded in several ways before one can make firm conclusions on the role of this projection in memory and behavior.

      Comments on revised version:

      The authors have provided their comments on the concerns voiced in my first review. I remain of the opinion that the experiments do not extend beyond determining whether disruption of hippocampal to retrosplenial cortex connections impacts spatial working memory. Given the restricted level of inquiry and the very moderate effect of the manipulation on memory, the work, in my opinion, does not provide significant insight into the processes of spatial working memory nor the function of the hippocampal to retrosplenial cortex connection.

    1. eLife assessment

      The paper reports the important discovery that the mouse dorsal inferior colliculus, an auditory midbrain area, encodes sound location. The evidence supporting the claims is solid, although how the encoding of sound source position in this area relates to localization behaviors in engaged mice remains unclear. The observations described should be of interest to auditory researchers studying the neural mechanisms of sound localization.

    2. Reviewer #1 (Public Review):

      Summary: In this study, the authors address whether the dorsal nucleus of the inferior colliculus (DCIC) in mice encodes sound source location within the front horizontal plane (i.e., azimuth). They do this using volumetric two-photon Ca2+ imaging and high-density silicon probes (Neuropixels) to collect single-unit data. Such recordings are beneficial because they allow large populations of simultaneous neural data to be collected. Their main results and the claims about those results are the following:

      1) DCIC single-unit responses have high trial-to-trial variability (i.e., neural noise);

      2) approximately 32% to 40% of DCIC single units have responses that are sensitive to sound source azimuth;

      3) single-trial population responses (i.e., the joint response across all sampled single units in an animal) encode sound source azimuth "effectively" (as stated in title) in that localization decoding error matches average mouse discrimination thresholds;

      4) DCIC can encode sound source azimuth in a similar format to that in the central nucleus of the inferior colliculus (as stated in Abstract);

      5) evidence of noise correlation between pairs of neurons exists;

      and 6) noise correlations between responses of neurons help reduce population decoding error.

      While simultaneous recordings are not necessary to demonstrate results #1, #2, and #4, they are necessary to demonstrate results #3, #5, and #6.

      Strengths:
      - Important research question to all researchers interested in sensory coding in the nervous system.
      - State-of-the-art data collection: volumetric two-photon Ca2+ imaging and extracellular recording using high-density probes. Large neuronal data sets.
      - Confirmation of imaging results (lower temporal resolution) with more traditional microelectrode results (higher temporal resolution).
      - Clear and appropriate explanation of surgical and electrophysiological methods. I cannot comment on the appropriateness of the imaging methods.

      Strength of evidence for claims of the study:

      1) DCIC single-unit responses have high trial-to-trial variability -
      The authors' data clearly shows this.

      2) Approximately 32% to 40% of DCIC single units have responses that are sensitive to sound source azimuth -
      The sensitivity of each neuron's response to sound source azimuth was tested with a Kruskal-Wallis test, which is appropriate since response distributions were not normal. Using this statistical test, only 8% of neurons (median for imaging data) were found to be sensitive to azimuth, and the authors noted this was not significantly different than the false positive rate. The Kruskal-Wallis test was not performed on electrophysiological data. The authors suggested that low numbers of azimuth-sensitive units resulting from the statistical analysis may be due to the combination of high neural noise and relatively low number of trials, which would reduce statistical power of the test. This may be true, but if single-unit responses were moderately or strongly sensitive to azimuth, one would expect them to pass the test even with relatively low statistical power. At best, if their statistical test missed some azimuth-sensitive units, they were likely only weakly sensitive to azimuth. The authors went on to perform a second test of azimuth sensitivity (a chi-squared test) and found 32% (imaging) and 40% (e-phys) of single units to have statistically significant sensitivity. This feels a bit like fishing for a lower p-value. The Kruskal-Wallis test should have been left as the only analysis. Moreover, the use of a chi-squared test is questionable because it is meant to be used between two categorical variables, and neural response had to be binned before applying the test.
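
      To make the point about statistical power concrete, the following is a minimal sketch of a Kruskal-Wallis test applied to one unit's per-trial responses grouped by azimuth. It is not the authors' analysis code: the function names, the toy spike counts, and the use of a permutation p-value (in place of the usual chi-squared approximation, and without tie correction) are all assumptions made for illustration.

```python
import random


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of groups."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


def permutation_p(groups, n_perm=2000, seed=0):
    """P-value for H obtained by shuffling azimuth labels across trials."""
    rng = random.Random(seed)
    observed = kruskal_h(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for s in sizes:
            shuffled.append(pooled[start:start + s])
            start += s
        if kruskal_h(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical unit: 14 trials at each of three azimuths, clearly tuned.
tuned_unit = [
    [1, 2, 1, 0, 2, 1, 1, 2, 0, 1, 2, 1, 1, 0],
    [3, 4, 5, 3, 4, 6, 5, 4, 3, 5, 4, 4, 6, 5],
    [8, 9, 7, 10, 8, 9, 11, 8, 9, 10, 7, 9, 8, 10],
]
print(kruskal_h(tuned_unit), permutation_p(tuned_unit))
```

      With about 14 trials per azimuth, a unit tuned this strongly is rejected easily, which is the sense in which only weakly tuned units would be expected to escape the test.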

      3) Single-trial population responses encode sound source azimuth "effectively" in that localization decoding error matches average mouse discrimination thresholds -<br /> If only one neuron in a population had responses that were sensitive to azimuth, we would expect that decoding azimuth from observation of that one neuron's response would perform better than chance. By observing the responses of more than one neuron (if more than one were sensitive to azimuth), we would expect performance to increase. The authors found that decoding from the whole population response was no better than chance. They argue (reasonably) that this is because of overfitting of the decoder model-too few trials used to fit too many parameters-and provide evidence from decoding combined with principal components analysis which suggests that overfitting is occurring. What is troubling is the performance of the decoder when using only a handful of "top-ranked" neurons (in terms of azimuth sensitivity) (Fig. 4F and G). Decoder performance seems to increase when going from one to two neurons, then decreases when going from two to three neurons, and doesn't get much better for more neurons than for one neuron alone. It seems likely there is more information about azimuth in the population response, but decoder performance is not able to capture it because spike count distributions in the decoder model are not being accurately estimated due to too few stimulus trials (14, on average). In other words, it seems likely that decoder performance is underestimating the ability of the DCIC population to encode sound source azimuth.<br /> To get a sense of how effective a neural population is at coding a particular stimulus parameter, it is useful to compare population decoder performance to psychophysical performance. Unfortunately, mouse behavioral localization data do not exist. 
      Therefore, the authors compare decoder error to mouse left-right discrimination thresholds published previously by a different lab. However, this comparison is inappropriate because the decoder and the mice were performing different perceptual tasks. The decoder is classifying sound sources to 1 of 13 locations from left to right, whereas the mice were discriminating between left or right sources centered around zero degrees. The errors in these two tasks represent different things. The two data sets may potentially be more accurately compared by extracting information from the confusion matrices of population decoder performance. For example, when the stimulus was at -30 deg, how often did the decoder classify the stimulus to a left-hand azimuth? Likewise, when the stimulus was at +30 deg, how often did the decoder classify the stimulus to a right-hand azimuth?
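
      The comparison suggested above could be sketched as follows. This is an illustration only: the function name, the three-azimuth layout, and the confusion counts are hypothetical, and decodes to 0 deg are counted as neither left nor right.

```python
def hemifield_accuracy(confusion, azimuths, stimulus_deg):
    """Fraction of trials at stimulus_deg that were decoded to the same side.

    confusion[i][j] is the number of trials whose true azimuth was
    azimuths[i] and whose decoded azimuth was azimuths[j]. Negative
    azimuths are left, positive azimuths are right.
    """
    if stimulus_deg == 0:
        raise ValueError("a 0 deg stimulus has no left/right side")
    row = confusion[azimuths.index(stimulus_deg)]
    total = sum(row)
    if total == 0:
        raise ValueError("no trials recorded for this stimulus")
    same_side = sum(count for az, count in zip(azimuths, row)
                    if az * stimulus_deg > 0)
    return same_side / total

# Hypothetical toy confusion matrix (rows: true azimuth, columns: decoded).
toy_azimuths = [-30, 0, 30]
toy_confusion = [
    [6, 2, 2],   # true -30 deg
    [3, 4, 3],   # true   0 deg
    [1, 2, 7],   # true +30 deg
]
left_correct = hemifield_accuracy(toy_confusion, toy_azimuths, -30)
right_correct = hemifield_accuracy(toy_confusion, toy_azimuths, 30)
```

      Collapsing the decoder output in this way yields a left/right score that is closer in kind to a behavioral discrimination threshold than the mean absolute classification error is.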

      4) DCIC can encode sound source azimuth in a similar format to that in the central nucleus of the inferior colliculus -<br /> It is unclear what exactly the authors mean by this statement in the Abstract. There are major differences in the encoding of azimuth between the two neighboring brain areas: a large majority of neurons in the CNIC are sensitive to azimuth (and strongly so), whereas the present study shows a minority of azimuth-sensitive neurons in the DCIC. Furthermore, CNIC neurons fire reliably to sound stimuli (low neural noise), whereas the present study shows that DCIC neurons fire more erratically (high neural noise).

      5) Evidence of noise correlation between pairs of neurons exists -<br /> The authors' data and analyses seem appropriate and sufficient to justify this claim.

      6) Noise correlations between responses of neurons help reduce population decoding error -<br /> The authors present a convincing analysis showing that their decoder performed better when simultaneously measured responses were tested (which include noise correlation) than when scrambled-trial responses were tested (eliminating noise correlation). This makes it seem likely that noise correlation in the responses improved decoder performance. The authors mention that the naïve Bayesian classifier was used as their decoder for computational efficiency, presumably because it assumes no noise correlation and, therefore, assumes responses of individual neurons are independent of each other across trials to the same stimulus. The use of a decoder that assumes independence seems key here in testing the hypothesis that noise correlation contains information about sound source azimuth. The logic of using this decoder could be more clearly spelled out to the reader. For example, if the null hypothesis is that noise correlations do not carry azimuth information, then a decoder that assumes independence should perform the same whether population responses are simultaneous or scrambled. The authors' analysis showing a difference in performance between these two cases provides evidence against this null hypothesis.
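
      The scrambling step referred to above can be illustrated with a short sketch. The function name and data layout are assumptions made for this example, not the authors' implementation: for each stimulus, every unit's responses are permuted across trials independently of the other units, which preserves each unit's response distribution per azimuth while removing trial-by-trial co-variation.

```python
import random

def shuffle_trials_within_stimulus(responses, labels, seed=0):
    """Return a scrambled copy of the population responses.

    responses[t][u] is the response of unit u on trial t, and labels[t]
    is the stimulus (azimuth) presented on trial t.
    """
    rng = random.Random(seed)
    shuffled = [list(trial) for trial in responses]
    n_units = len(responses[0])
    # Group trial indices by stimulus so shuffling never mixes azimuths.
    trials_by_stimulus = {}
    for t, label in enumerate(labels):
        trials_by_stimulus.setdefault(label, []).append(t)
    for trials in trials_by_stimulus.values():
        for u in range(n_units):
            values = [responses[t][u] for t in trials]
            rng.shuffle(values)  # independent permutation for each unit
            for t, v in zip(trials, values):
                shuffled[t][u] = v
    return shuffled
```

      Under the null hypothesis stated above, a decoder tested on `responses` and on the scrambled copy should perform equally well, so any consistent difference between the two is attributable to the correlation structure that scrambling removes.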

      Minor weakness:<br /> - Most studies of neural encoding of sound source azimuth are done in a noise-free environment, but the experimental setup in the present study had substantial background noise. This complicates comparison of the azimuth tuning results in this study to those of other studies. One is left wondering if azimuth sensitivity would have been greater in the absence of background noise, particularly for the imaging data where the signal was only about 12 dB above the noise. The description of the noise level and signal + noise level in the Methods should be made clearer. Mice hear from about 2.5 - 80 kHz, so it is important to know the noise level within this band as well as specifically within the band overlapping with the signal.

    3. Reviewer #2 (Public Review):

      In the present study, Boffi et al. investigate the manner in which the dorsal cortex of the inferior colliculus (DCIC), an auditory midbrain area, encodes sound location azimuth in awake, passively listening mice. By employing volumetric calcium imaging (scanned temporal focusing or s-TeFo), complemented with high-density electrode electrophysiological recordings (Neuropixels probes), they show that sound-evoked responses are exquisitely noisy, with only a small portion of neurons (units) exhibiting spatial sensitivity. Nevertheless, a naïve Bayesian classifier was able to predict the presented azimuth based on the responses from small populations of these spatially sensitive units. A portion of the spatial information was provided by correlated trial-to-trial response variability between individual units (noise correlations). The study presents a novel characterization of spatial auditory coding in a non-canonical structure, representing a noteworthy contribution specifically to the auditory field and generally to systems neuroscience, due to its implementation of state-of-the-art techniques in an experimentally challenging brain region. However, nuances in the calcium imaging dataset and the naïve Bayesian classifier warrant caution when interpreting some of the results.

      Strengths:<br /> The primary strength of the study lies in its methodological achievements, which allowed the authors to collect a comprehensive and novel dataset. While the DCIC is a dorsal structure, it extends up to a millimetre in depth, making it optically challenging to access in its entirety. It is also more highly myelinated and vascularised compared to e.g., the cerebral cortex, compounding the problem. The authors successfully overcame these challenges and present an impressive volumetric calcium imaging dataset. Furthermore, they corroborated this dataset with electrophysiological recordings, which produced overlapping results. This methodological combination ameliorates the natural concerns that arise from inferring neuronal activity from calcium signals alone, which are in essence an indirect measurement thereof.

      Another strength of the study is its interdisciplinary relevance. For the auditory field, it represents a significant contribution to the question of how auditory space is represented in the mammalian brain. "Space" per se is not mapped onto the basilar membrane of the cochlea and must be computed entirely within the brain. For azimuth, this requires the comparison of minuscule differences in the timing and intensity of sounds arriving at each ear. It is now generally thought that azimuth is initially encoded in two opposing hemispheric channels, but the extent to which this initial arrangement is maintained throughout the auditory system remains an open question. The authors observe only a slight contralateral bias in their data, suggesting that sound source azimuth in the DCIC is encoded in a more nuanced manner compared to earlier processing stages of the auditory hindbrain. This is interesting, because the DCIC is also known to be an auditory structure that receives more descending inputs from the cortex.

      Systems neuroscience continues to strive for the perfection of imaging novel, less accessible brain regions. Volumetric calcium imaging is a promising emerging technique, allowing the simultaneous measurement of large populations of neurons in three dimensions. But this necessitates corroboration with other methods, such as electrophysiological recordings, which the authors achieve. The dataset moreover highlights the distinctive characteristics of neuronal auditory representations in the brain. Its signals can be exceptionally sparse and noisy, which provides an additional layer of complexity in the processing and analysis of such datasets. This will undoubtedly be useful for future studies of other less accessible structures with sparse responsiveness.

      Weaknesses:<br /> Although the primary finding that small populations of neurons carry enough spatial information for a naïve Bayesian classifier to reasonably decode the presented stimulus is not called into question, certain idiosyncrasies, in particular the calcium imaging dataset and model, complicate specific interpretations of the model output, and the readership is urged to interpret these aspects of the study's conclusions with caution.

      I remain in favour of volumetric calcium imaging as a suitable technique for the study, but the presently constrained spatial resolution is insufficient to unequivocally identify regions of interest as cell bodies (and are instead referred to as "units" akin to those of electrophysiological recordings). It remains possible that the imaging set is inadvertently influenced by non-somatic structures (including neuropil), which could report neuronal activity differently than cell bodies. Due to the lack of a comprehensive ground-truth comparison in this regard (which to my knowledge is impossible to achieve with current technology), it is difficult to imagine how many informative such units might have been missed because their signals were influenced by spurious, non-somatic signals, which could have subsequently misled the models. The authors reference the original Nature Methods article (Prevedel et al., 2016) throughout the manuscript, presumably in order to avoid having to repeat previously published experimental metrics. But the DCIC is neither the cortex nor hippocampus (for which the method was originally developed) and may not have the same light scattering properties (not to mention neuronal noise levels). Although the corroborative electrophysiology data largely alleviates these concerns for this particular study, the readership should be cognisant of such caveats, in particular those who are interested in implementing the technique for their own research.

      A related technical limitation of the calcium imaging dataset is the relatively low number of trials (14) given the inherently high level of noise (both neuronal and imaging). Volumetric calcium imaging, while offering a uniquely expansive field of view, requires relatively high average excitation laser power (in this case nearly 200 mW), a level of exposure the authors may have wanted to minimise by maintaining a low number of repetitions, but I yield to them to explain. Calcium imaging is also inherently slow, requiring relatively long inter-stimulus intervals (in this case 5 s). This unfortunately renders any model designed to predict a stimulus (in this case sound azimuth) from particularly noisy population neuronal data like these highly prone to overfitting, which the authors correctly acknowledge after a model trained on the entire raw dataset failed to perform significantly above chance level. This prompted them to feed the model only with data from neurons with the highest spatial sensitivity. This ultimately produced reasonable performance (and was implemented throughout the rest of the study), but it remains possible that if the model were fed with more repetitions of imaging data, its performance would have been more stable across the number of units used to train it. (All models trained with imaging data eventually failed to converge.) However, I also see these limitations as an opportunity to improve the technology further, which I reiterate will be generally important for volume imaging of other sparse or noisy calcium signals in the brain.

      Transitioning to the naïve Bayesian classifier itself, I first openly ask the authors to justify their choice of this specific model. There are countless types of classifiers for these data, each with their own pros and cons. Did they actually try other models (such as support vector machines), which ultimately failed? If so, these negative results (even if mentioned en passant) would be extremely valuable to the community, in my view. I ask this specifically because different methods assume correspondingly different statistical properties of the input data, and to my knowledge naïve Bayesian classifiers assume that predictors (neuronal responses) are independent within a class (azimuth). As the authors show that noise correlations are informative in predicting azimuth, I wonder why they chose a model that doesn't take advantage of these statistical regularities. It could be because of technical considerations (they mention computing efficiency), but I am left generally uncertain about the specific logic that was used to guide the authors through their analytical journey.

      That aside, there remain other peculiarities in model performance that warrant further investigation. For example, what spurious features (or lack of informative features) in these additional units prevented the models of imaging data from converging? In an orthogonal question, did the most spatially sensitive units share any detectable tuning features? A different model trained with electrophysiology data in contrast did not collapse in the range of top-ranked units plotted. Did this model collapse at some point after adding enough units, and how well did that correlate with the model for the imaging data? How well did the form (and diversity) of the spatial tuning functions as recorded with electrophysiology resemble their calcium imaging counterparts? These fundamental questions could be addressed with more basic, but transparent analyses of the data (e.g., the diversity of spatial tuning functions of their recorded units across the population). Even if the model extracts features that are not obvious to the human eye in traditional visualisations, I would still find this interesting.

      Finally, the readership is encouraged to interpret certain statements by the authors in the current version conservatively. How the brain ultimately extracts spatial neuronal data for perception is anyone's guess, but it is important to remember that this study only shows that a naïve Bayesian classifier could decode this information, and it remains entirely unclear whether the brain does this as well. For example, the model is able to achieve a prediction error that corresponds to the psychophysical threshold in mice performing a discrimination task (~30°). Although this is an interesting coincidental observation, it does not mean that the two metrics are necessarily related. The authors correctly do not explicitly claim this, but the manner in which the prose flows may lead a non-expert into drawing that conclusion. Moreover, the concept of redundancy (of spatial information carried by units throughout the DCIC) is difficult for me to disentangle. One interpretation of this formulation could be that there are non-overlapping populations of neurons distributed across the DCIC that each could predict azimuth independently of each other, which is unlikely what the authors meant. If the authors meant generally that multiple neurons in the DCIC carry sufficient spatial information, then a single neuron would have been able to predict sound source azimuth, which was not the case. I have the feeling that they actually mean "complementary", but I leave it to the authors to clarify my confusion, should they wish.

      In summary, the present study represents a significant body of work that contributes substantially to the field of spatial auditory coding and systems neuroscience. However, limitations of the imaging dataset and model as applied in the study muddle concrete conclusions about how the DCIC precisely encodes sound source azimuth and, even more so, about sound localisation in a behaving animal. Nevertheless, it presents a novel and unique dataset, which, regardless of secondary interpretation, corroborates the general notion that auditory space is encoded in an extraordinarily complex manner in the mammalian brain.

    4. Reviewer #3 (Public Review):

      Summary: Boffi and colleagues sought to quantify the single-trial, azimuthal information in the dorsal cortex of the inferior colliculus (DCIC), a relatively understudied subnucleus of the auditory midbrain. They used two complementary recording methods while mice passively listened to sounds at different locations: a large volume but slow sampling calcium-imaging method, and a smaller volume but temporally precise electrophysiology method. They found that neurons in the DCIC were variable in their activity, unreliably responding to sound presentation and responding during inter-sound intervals. Boffi and colleagues used a naïve Bayesian decoder to determine if the DCIC population encoded sound location on a single trial. The decoder failed to classify sound location better than chance when using the raw single-trial population response but performed significantly better than chance when using intermediate principal components of the population response. In line with this, when the most azimuth-dependent neurons were used to decode azimuthal position, the decoder performed equivalently to the azimuthal localization abilities of mice. The top azimuthal units were not clustered in the DCIC, possessed a contralateral bias in response, and were correlated in their variability (e.g., positive noise correlations). Interestingly, when these noise correlations were perturbed by inter-trial shuffling, decoding performance decreased. Although Boffi and colleagues show that azimuthal information can be extracted from DCIC responses, it remains unclear to what degree this information is used and what role noise correlations play in azimuthal encoding.

      Strengths: The authors should be commended for collection of this dataset. When done in isolation (which is typical), calcium imaging and linear array recordings have intrinsic weaknesses. However, those weaknesses are alleviated when done in conjunction with one another - especially when the data from one method largely recapitulate the findings of the other. Together with the video of the head recorded during the calcium imaging, this data set is extremely rich and will be of use to those interested in the information available in the DCIC, an understudied but likely important subnucleus in the auditory midbrain.

      The DCIC neural responses are complex; the units unreliably respond to sound onset, and at the very least respond to some unknown input or internal state (e.g., large inter-sound interval responses). The authors do a decent job in wrangling these complex responses: using interpretable decoders to extract information available from population responses.

      Weaknesses:<br /> The authors observe that neurons with the most azimuthal sensitivity within the DCIC are positively correlated, but they use a naïve Bayesian decoder, which assumes independence between units. Although this is a bit strange given their observation that some of the recorded units are correlated, it is unlikely to be a critical flaw. At one point the authors reduce the dimensionality of their data through PCA and use the loadings onto these components in their decoder. PCA incorporates the correlational structure when finding the principal components and constrains these components to be orthogonal and uncorrelated. This should alleviate some of the concern regarding the use of the naïve Bayesian decoder because the projections onto the different components are independent. Nevertheless, the decoding results are a bit strange, likely because there is not much linearly decodable azimuth information in the DCIC responses. Raw population responses failed to provide sufficient information concerning azimuth for the decoder to perform better than chance. Additionally, it only performed better than chance when certain principal components or top-ranked units contributed to the decoder but not as more components or units were added. So, although there does appear to be some azimuthal information in the recorded DCIC populations - it is somewhat difficult to extract and likely not an 'effective' encoding of sound localization as their title suggests.

      Although this is quite a worthwhile dataset, the authors present relatively little about the characteristics of the units they've recorded. This may be due to the high variance in responses seen in their population. Nevertheless, the authors note that units do not respond on every trial but do not report what percentage of trials fail to evoke a response. Is it that neurons are noisy because they do not respond on every trial or is it also that when they do respond they have variable response distributions? It would be nice to gain some insight into the heterogeneity of the responses. Additionally, is there any clustering at all in response profiles or is each neuron they recorded in the DCIC unique? They also only report the noise correlations for their top-ranked units, but it is possible that the noise correlations in the rest of the population are different. It would also be worth digging into the noise correlations more - are units positively correlated because they respond together (e.g., if unit x responds on trial 1 so does unit y) or are they also modulated around their mean rates on similar trials (e.g., unit x and y respond and both are responding more than their mean response rate)? A large portion of trials with no response can occlude noise correlations. More transparency around the response properties of these populations would be welcome.
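
      The two requested quantities could be computed along the following lines. This is a sketch under stated assumptions rather than a prescription: the function names are hypothetical, a "failure" is taken to be a trial with a response of exactly zero, and noise correlation is taken to be the Pearson correlation of residuals after subtracting each stimulus's mean response.

```python
from math import sqrt

def failure_rate(responses):
    """Fraction of trials on which a unit gave no response (value of 0)."""
    return sum(1 for r in responses if r == 0) / len(responses)

def _residuals(values, labels):
    """Subtract the mean response to each stimulus from that stimulus's trials."""
    sums, counts = {}, {}
    for v, lab in zip(values, labels):
        sums[lab] = sums.get(lab, 0.0) + v
        counts[lab] = counts.get(lab, 0) + 1
    return [v - sums[lab] / counts[lab] for v, lab in zip(values, labels)]

def noise_correlation(x, y, labels):
    """Correlation of trial-to-trial residuals of two units; NaN if undefined."""
    rx, ry = _residuals(x, labels), _residuals(y, labels)
    sxy = sum(a * b for a, b in zip(rx, ry))
    sxx = sum(a * a for a in rx)
    syy = sum(b * b for b in ry)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)

def noise_correlation_when_both_respond(x, y, labels):
    """Same measure, restricted to trials on which both units responded."""
    keep = [i for i in range(len(x)) if x[i] > 0 and y[i] > 0]
    return noise_correlation([x[i] for i in keep],
                             [y[i] for i in keep],
                             [labels[i] for i in keep])
```

      Comparing the two correlation measures for the same pair would separate the two possibilities raised above: correlation arising because units respond on the same trials, and correlation arising because responses co-vary around their means once both units are active.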

      It is largely unclear what the DCIC is encoding. Although the authors are interested in azimuth, sound location seems to be only a small part of DCIC responses. The authors report responses during inter-sound intervals and unreliable sound-evoked responses. Although they have video of the head during recording, we only see a correlation to snout and ear movements (which are peculiar since in the example shown it seems the head movements predict the sound presentation). Additional correlates could be eye movements or pupil size. Eye movements are of particular interest due to their known interaction with IC responses - especially if the DCIC encodes sound location in relation to eye position instead of head position (though much of eye-position-IC work was done in primates and not rodents). Alternatively, much of the population may only encode sound location if an animal is engaged in a localization task. Ideally, the authors could perform more substantive analyses to determine if this population is truly noisy or if the DCIC is integrating un-analyzed signals.

      Although this critique is ubiquitous among decoding papers in the absence of behavioral or causal perturbations, it is unclear what - if any - role the decoded information may play in neuronal computations. The decoder's performance shows that there is some extractable information concerning sound azimuth - but not whether it is functional. This information may just be epiphenomenal, leaking in from inputs, and not used in computation or relayed to downstream structures. This should be kept in mind when the authors suggest their findings implicate the DCIC functionally in sound localization.

      It is unclear why positive noise correlations amongst similarly tuned neurons would improve decoding. A toy model exploring how positive noise correlations in conjunction with unreliable units that inconsistently respond may anchor these findings in an interpretable way. It seems plausible that inconsistent responses would benefit from strong noise correlations, simply by units responding together. This would predict that shuffling would impair performance because you would then be sampling from trials in which some units respond, and trials in which some units do not respond - and may predict a bimodal performance distribution in which some trials decode well (when the units respond) and others decode poorly (when the units do not respond).
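
      One very reduced version of such a toy model is sketched below, purely to make the conjecture concrete. Everything in it is an assumption introduced for illustration (a single stimulus, binary responses, and one shared gate that makes all units respond or stay silent together); it is not derived from the authors' data.

```python
import random

def simulate_gated_trials(n_trials, n_units, p_respond, seed=0):
    """On each trial a shared gate makes ALL units respond (1) or none (0)."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        gate = 1 if rng.random() < p_respond else 0
        trials.append([gate] * n_units)
    return trials

def shuffle_units_across_trials(trials, seed=1):
    """Permute each unit's responses across trials independently of other units."""
    rng = random.Random(seed)
    columns = [list(col) for col in zip(*trials)]
    for col in columns:
        rng.shuffle(col)
    return [list(row) for row in zip(*columns)]

def responders_per_trial(trials):
    """Number of responding units on each trial."""
    return [sum(row) for row in trials]

simultaneous = simulate_gated_trials(n_trials=200, n_units=8, p_respond=0.5)
scrambled = shuffle_units_across_trials(simultaneous)
```

      In the simultaneous data every trial has either zero or all eight units responding, so trials split into two groups, whereas after shuffling most trials contain only a partial set of responders even though each unit's overall response rate is unchanged. Whether this mechanism accounts for the decoding results in the manuscript would need to be tested on the recorded populations.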

      Significance: Boffi and colleagues set out to parse the azimuthal information available in the DCIC on a single trial. They largely accomplish this goal and are able to extract this information when allowing the units that contain more information about sound location to contribute to their decoding (e.g., through PCA or decoding on top unit activity specifically). The dataset will be of value to those interested in the DCIC and also to anyone interested in the role of noise correlations in population coding. Although this work is a first step into parsing the information available in the DCIC, it remains difficult to interpret if/how this azimuthal information is used in localization behaviors of engaged mice.

    1. eLife assessment

      This valuable study provides convincing evidence that mutant hair cells with abnormal, reversed polarity of their hair bundles in mouse otolith organs retain wild-type localization, mechanoelectrical transduction and receptor field of their afferent innervation, leading to mild behavioral dysfunction. It thus demonstrates that the bimodal pattern of afferent nerve projections in this organ is not causally related to the bimodal distribution of hair-bundle orientations, as also confirmed in the zebrafish lateral line. The work will be of interest to scientists interested in the development and function of the vestibular system as well as in planar-cell polarity.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors aim at dissecting the relationship between hair-cell directional mechanosensation and orientation-linked synaptic selectivity, using mice and the zebrafish. They find that Gpr156 mutant animals homogenize the orientation of hair cells without affecting the selectivity of afferent neurons, suggesting that hair-cell orientation is not the feature that determines synaptic selectivity. Therefore, the process of Emx2-dependent synaptic selectivity bifurcates downstream of Gpr156.

      Strengths:

      This is an interesting and solid paper. It solves an interesting problem and establishes a framework for the following studies. That is, to ask what are the putative targets of Emx2 that affect synaptic selectivity.<br /> The quality of the data is generally excellent.

      Weaknesses:

      The feeling is that the advance derived from the results is very limited.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors inquire in particular whether the receptor Gpr156, which is necessary for hair cells to reverse their polarities in the zebrafish lateral line and mammalian otolith organs downstream of the differential expression of the transcription factor Emx2, also controls the mechanosensitive properties of hair cells and ultimately an animal's behavior. This study thoroughly addresses the issue by analyzing the morphology, electrophysiological responses, and afferent connections of hair cells found in different regions of the mammalian utricle and the Ca2+ responses of lateral line neuromasts in both wild-type animals and gpr156 mutants. Although many features of hair cell function are preserved in the mutants-such as development of the mechanosensory organs and the Emx2-dependent, polarity-specific afferent wiring and synaptic pairing-there are a few key changes. In the zebrafish neuromast, the magnitude of responses of all hair cells to water flow resembles that of the wild-type hair cells that respond to flow arriving from the tail. These responses are larger than those observed in hair cells that are sensitive to flow arriving from the head and resemble effects previously observed in Emx2 mutants. The authors note that this behavior suggests that the Emx2-GPR156 signaling axis also impinges on hair cell mechanotransduction. Although mutant mice exhibit normal posture and balance, they display defects in swimming behavior. Moreover, their vestibulo-ocular reflexes are perturbed. The authors note that the gpr156 mutant is a good model to study the role of opposing hair cell polarity in the vestibular system, for the wiring patterns follow the expression patterns of Emx2, even though hair cells are all of the same polarity. This paper excels at describing the effects of gpr156 perturbation in mouse and zebrafish models and will be of interest to those studying the vestibular system, hair cell polarity, and the role of inner-ear organs in animal behavior.

      Strengths:

      The study is exceptional in including, not only morphological and immunohistochemical indices of cellular identity but also electrophysiological properties. The mutant hair cells of murine maculæ display essentially normal mechanoelectrical transduction and adaptation-with two or even three kinetic components-as well as normal voltage-activated ionic currents.

    1. eLife assessment:

      This important study investigates the contribution of cytosolic S100A8/A9 to neutrophil migration to inflamed tissues. The authors provide convincing evidence for how the loss of cytosolic S100A8/A9 specifically affects the ability of neutrophils to crawl and subsequently adhere under shear stress. This study will be of interest in fields where inflammation is implicated, such as autoimmunity or sepsis.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript by Napoli et al., the authors study the intracellular function of cytosolic S100A8/A9, a myeloid cell soluble protein that operates extracellularly as an alarmin and whose intracellular function is not well characterized. Here, the authors utilize state-of-the-art intravital microscopy to demonstrate that adhesion defects observed in cells lacking S100A8/A9 (Mrp14-/-) are not rescued by exogenous S100A8/A9, thus highlighting an intrinsic defect. Based on this result, subsequent efforts were directed at characterizing the nature of those adhesion defects.

      Strengths:

      The authors convincingly show that Mrp14-/- neutrophils have normal rolling but defective adhesion caused by impaired CD11b activation (deficient ICAM1 binding). Analysis of cellular spreading (defective in Mrp14-/- cells) is also sound. The manuscript then focuses on selective signaling pathways and calcium measurements. Overall, this is a straightforward study of biologically important proteins and mechanisms.

      Weaknesses:

      Some suggestions are included below to improve this manuscript.

    3. Reviewer #2 (Public Review):

      Summary:

      Napoli et al. provide a compelling study showing the importance of cytosolic S100A8/9 in maintaining calcium levels at LFA-1 nanoclusters at the cell membrane, thus allowing the successful crawling and adherence of neutrophils under shear stress. The authors show that cytosolic S100A8/9 is responsible for retaining stable and high concentrations of calcium specifically at LFA-1 nanoclusters upon binding to ICAM-1, and imply that this process aids in facilitating actin polymerisation involved in cell shape and adherence. The authors show early on that S100A8/9 deficient neutrophils fail to extravasate successfully into the tissue, thus suggesting that targeting cytosolic S100A8/9 could be useful in settings of autoimmunity/acute inflammation where neutrophil-induced collateral damage is unwanted.

      Strengths:

      Using multiple complementary methods from imaging to western blotting and flow cytometry, including extracellular supplementation of S100A8/9 in vivo, the authors conclusively show that a defect in intracellular S100A8/9, rather than extracellular S100A8/9, was responsible for the loss in neutrophil adherence, and pinpoint that S100A8/9 aids in calcium stabilisation and retention at the plasma membrane.

      Weaknesses:

      (1) Extravasation is shown to be a major defect of Mrp14-/- neutrophils, but the Giemsa staining in Figure 1H seems quite unspecific to me, as neutrophils were identified by nuclear shape and granularity. It would perhaps have been clearer to use immunofluorescence staining for neutrophils instead, as seen in Supplementary Figure 1A (staining for Ly6G or other markers instead of S100A9).

      (2) The representative image for Mrp14-/- neutrophils used in Figure 4K to demonstrate Ripley's K function seems to be very different from that shown above in Figures 4C and 4F.

      (3) Although the authors have done well to draw a path linking cytosolic S100A8/9 to actin polymerisation and subsequently the arrest and adherence of neutrophils in vitro, the authors can be more explicit with the analysis - for example, is the F-actin co-localized with the LFA-1 nanoclusters? Does S100A8/9 localise to the membrane with LFA-1 upon stimulation? Lastly, I think it would have been very useful to close the loop on the extravasation observation with some in vitro evidence to show that neutrophils fail to extravasate under shear stress.

    1. eLife assessment

      This study presents an important finding on the influence of visual uncertainty and Bayesian cue combination on implicit motor adaptation in young healthy participants, hereby linking perception and action during implicit adaptation. The evidence supporting the claims of the authors is convincing. The normative approach of the proposed PEA model, which combines ideas from separate lines of research, including vision research and motor learning, opens avenues for future developments. This work will be of interest to researchers in sensory cue integration and motor learning.

    2. Reviewer #1 (Public Review):

      I appreciate the normative approach of the PEA model and am eager to examine this model in the future. However, two minor issues remain:

      (1) Clarification on the PReMo Model:

      The authors state, "The PReMo model proposes that this drift comprises two phases: initial proprioceptive recalibration and subsequent visual recalibration." This description could misinterpret the intent of PReMo. According to PReMo, the time course of the reported hand position is merely a read-out of the *perceived hand position* (x_hat in your paper). Early in adaptation, the perceived hand position is biased by the visual cursor (x_hat in the direction of the cursor); towards the end, due to implicit adaptation, x_hat reduces to zero. This is the same as PEA. I recommend that the authors clarify PReMo's intent to avoid confusion.

      Note, however, the observed overshoot of 1 degree in the reported hand position. In the PReMo paper, we hypothesized that this effect is due to the recalibration of the perceived visual target location (inspired by studies showing that vision is also recalibrated by proprioception, but in the opposite direction). If the goal of implicit adaptation is to align the perceived hand position (x_hat) with the perceived target position (t_hat), then there would be an overshoot of x_hat over the actual target position.

      PEA posits a different account for the overshoot. It currently suggests that the reported hand position combines x_hat (which takes x_p as input) with x_p itself. What is the reasoning underlying the *double occurrence* of x_p?

      There seem to be three alternatives that seem more plausible (and could lead to the same overshooting): 1) increasing x_p's contribution (assuming visual uncertainty increases when the visual cursor is absent during the hand report phase), 2) decreasing sigma_p (assuming that participants pay more attention to the hand during the report phase), 3) it could be that the perceived target position undergoes recalibration in the opposite direction to proprioceptive recalibration. All these options, at least to me, seem equally plausible and testable in the future.

      (2) Effect of Visual Uncertainty on Error Size:

      I appreciate the authors' response about methodological differences between the cursor cloud used in previous studies and the Gaussian blob used in the current study. However, it is still not clear to me how the authors reconcile previous studies showing that visual uncertainty reduced implicit adaptation for small but not large errors (Tsay et al, 2021; Makino, et al 2023) with the current findings, where visual uncertainty reduced implicit adaptation for large but not small errors.

      Could the authors connect the dots here: I could see that the cursor cloud increases potential overlap with the visual target when the visual error is small, resulting in intrinsic reward-like mechanisms (Kim et al, 2019), which could potentially explain attenuated implicit adaptation for small visual errors. However, why would implicit adaptation in response to large visual errors remain unaffected by the cursor cloud? Note that we did verify that sigma_v is increased in (Tsay et al. 2021), so it is unlikely due to the cloud simply failing as a manipulation of visual uncertainty.

      In addition, we also reasoned that testing individuals with low vision could offer a different test of visual uncertainty (Tsay et al, 2023). The advantage here is that both controls and patients with low vision are provided with the same visual input: a single cursor. Our findings suggest that uncertainty due to low vision also reduces implicit adaptation in response to small but not large errors, contrary to the findings in the current paper. Missing in the manuscript is a discussion of why the authors' current findings contradict previous results.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors present the Perceptual Error Adaptation (PEA) model, a computational approach offering a unified explanation for behavioral results that are inconsistent with standard state-space models. Beginning with the conventional state-space framework, the paper introduces two innovative concepts. Firstly, errors are calculated based on the perceived hand position, determined through Bayesian integration of visual, proprioceptive, and predictive cues. Secondly, the model accounts for the eccentricity of vision, proposing that the uncertainty of cursor position increases with distance from the fixation point. This elegantly simple model, with minimal free parameters, effectively explains the observed plateau in motor adaptation under the implicit motor adaptation paradigm using the error-clamp method. Furthermore, the authors experimentally manipulate visual cursor uncertainty, a method established in visuomotor studies, to provide causal evidence. Their results show that the adaptation rate correlates with perturbation sizes and visual noise, uniquely explained by the PEA model and not by previous models. Therefore, the study convincingly demonstrates that implicit motor adaptation is a process of Bayesian cue integration.
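
      As a rough illustration of the cue-integration idea summarized above, the sketch below computes a reliability-weighted (inverse-variance) estimate of hand position from visual, proprioceptive, and predictive cues, with visual uncertainty growing linearly with cursor eccentricity. The function names and every numeric value (slope, intercept, the sigmas) are assumptions chosen for illustration, not the parameters fitted in the manuscript.

```python
def visual_sigma(clamp_deg, slope=0.3, intercept=1.5):
    """Visual uncertainty (deg) as a linear function of cursor eccentricity.

    slope and intercept are hypothetical values, not fitted ones.
    """
    return slope * abs(clamp_deg) + intercept


def perceived_hand(x_v, x_p, x_u, s_v, s_p, s_u):
    """Inverse-variance weighted average of three cues (all in degrees).

    x_v: visual cue (cursor), x_p: proprioceptive cue, x_u: predicted hand.
    s_v, s_p, s_u: the corresponding standard deviations.
    """
    weights = [1.0 / s_v**2, 1.0 / s_p**2, 1.0 / s_u**2]
    cues = [x_v, x_p, x_u]
    return sum(w * c for w, c in zip(weights, cues)) / sum(weights)


if __name__ == "__main__":
    # Hand and prediction at the target (0 deg), cursor clamped off-target.
    for clamp in (4, 16, 64):
        x_hat = perceived_hand(clamp, 0.0, 0.0, visual_sigma(clamp), 10.0, 5.0)
        print(f"clamp {clamp:>3} deg -> perceived hand {x_hat:.2f} deg")
```

      In this sketch, a cursor far from fixation receives a larger sigma and therefore a smaller weight, which is the qualitative mechanism described in the summary.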

      Strengths:

      In the past decade, numerous perplexing results in visuomotor rotation tasks have questioned their underlying mechanisms. Prior models have individually addressed aspects like aiming strategies, motor adaptation plateaus, and sensory recalibration effects. However, a unified model encapsulating these phenomena with a simple computational principle was lacking. This paper addresses this gap with a robust Bayesian integration-based model. Its strength lies in two fundamental assumptions: motor adaptation's influence by visual eccentricity, a well-established vision science concept, and sensory estimation through Bayesian integration. By merging these well-founded principles, the authors elucidate previously incongruent and diverse results with an error-based update model. The incorporation of cursor feedback noise manipulation provides causal evidence for their model. The use of eye-tracking in their experimental design, and the analysis of adaptation studies based on estimated eccentricity, are particularly elegant. This paper makes a significant contribution to visuomotor learning research.

      The authors discussed in the revised version that the proposed model can capture the general implicit motor learning process in addition to the visuomotor rotation task. In the discussion, they emphasize two main principles: the automatic tracking of effector position and the combination of movement cues using Bayesian integration. These principles are suggested as key to understanding and modeling various motor adaptations and skill learning. The proposed model could potentially become a basis for creating new computational models for skill acquisition, especially where current models fall short.

      Weaknesses:

      The proposed model is described as elegant. In this paper, the authors test the model within a limited example condition, demonstrating its relevance to the sensorimotor adaptation mechanisms of the human brain. However, the scope of the model's applicability remains unclear. It has shown the capacity to explain prior data, thereby surpassing previous models that rely on elementary mathematics. To solidify its credibility in the field, the authors must gather more supporting evidence.

    4. Reviewer #3 (Public Review):

      (2.1) Summary

      In this paper, the authors model motor adaptation as a Bayesian process that combines visual uncertainty about the error feedback, uncertainty about proprioceptive sense of hand position, and uncertainty of predicted (=planned) hand movement with a learning and retention rate as used in state space models. The model is built with results from several experiments presented in the paper and is compared with the PReMo model (Tsay, Kim et al., 2022) as well as a cue combination model (Wei & Körding, 2009). The model and experiments demonstrate the role of visual uncertainty about error feedback in implicit adaptation.

      In the introduction, the authors notice that implicit adaptation (as measured in error-clamp based paradigms) does not saturate at larger perturbations, but decreases again (e.g. Moorehead et al., 2017 shows no adaptation at 135° and 175° perturbations). They hypothesized that visual uncertainty about cursor position increases with larger perturbations since the cursor is further from the fixated target. This could decrease importance assigned to visual feedback which could explain lower asymptotes.

      The authors characterize visual uncertainty for 3 rotation sizes in a first experiment, and while this experiment could be improved, it is probably sufficient for the current purposes. Then the authors present a second experiment where adaptation to 7 clamped errors are tested in different groups of participants. The models' visual uncertainty is set using a linear fit to the results from experiment 1, and the remaining 4 parameters are then fit to this second data set. The 4 parameters are 1) proprioceptive uncertainty, 2) uncertainty about the predicted hand position, 3) a learning rate and 4) a retention rate. The authors' Perceptual Error Adaptation model ("PEA") predicts asymptotic levels of implicit adaptation much better than both the PReMo model (Tsay, Kim et al., 2022), which predicts saturated asymptotes, or a causal inference model (Wei & Körding, 2007) which predicts no adaptation for larger rotations. In a third experiment, the authors test their model's predictions about proprioceptive recalibration, but unfortunately compare their data with an unsuitable other data set (Tsay et al. 2020, instead of Tsay et al. 2021). Finally, the authors conduct a fourth experiment where they put their model to the test. They measure implicit adaptation with increased visual uncertainty, by adding blur to the cursor, and the results are again better in line with their model (predicting overall lower adaptation), than with the PReMo model (predicting equal saturation but at larger perturbations) or a causal inference model (predicting equal peak adaptation, but shifted to larger rotations). In particular the model fits for experiment 2 and the results from experiment 4 show that the core idea of the model has merit: increased visual uncertainty about errors dampens implicit adaptation.
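
      To make the structure described above concrete, here is a minimal simulation sketch: a state-space update with a retention rate and a learning rate, driven by a perceived hand position obtained from inverse-variance cue weighting, with visual uncertainty set by a linear function of clamp size. This is an illustrative reading of the review's description, not the authors' code; the update rule as written and all parameter values (A, B, s_p, s_u, slope, intercept) are assumptions.

```python
def cue_weights(s_v, s_p, s_u):
    """Normalized inverse-variance weights for visual, proprioceptive, predictive cues."""
    inv = [1.0 / s_v**2, 1.0 / s_p**2, 1.0 / s_u**2]
    total = sum(inv)
    return [w / total for w in inv]


def simulate(clamp_deg, n_trials=200, A=0.97, B=0.2,
             s_p=10.0, s_u=5.0, slope=0.3, intercept=1.5):
    """Return the hand angle (deg) after n_trials of error-clamp feedback.

    A: retention rate, B: learning rate (both hypothetical values).
    The cursor stays at clamp_deg regardless of the hand; the target is at 0.
    """
    s_v = slope * abs(clamp_deg) + intercept  # linear visual uncertainty (assumed)
    w_v, w_p, w_u = cue_weights(s_v, s_p, s_u)
    hand = 0.0
    for _ in range(n_trials):
        # Perceived hand: cursor, actual hand, and prediction (aimed at target = 0).
        x_hat = w_v * clamp_deg + w_p * hand + w_u * 0.0
        # The perceived error relative to the target drives the next movement.
        hand = A * hand - B * x_hat
    return hand


if __name__ == "__main__":
    for clamp in (2, 4, 8, 16, 32, 64, 95):
        print(f"clamp {clamp:>3} deg -> late adaptation {simulate(clamp):.2f} deg")
```

      With these made-up values, the extent of late adaptation first grows with clamp size and then shrinks for large clamps, because the visual weight falls faster than the clamp grows. A model with fixed visual uncertainty would not show that decline.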

      (2.2) Strengths

      In this study the authors propose a Perceptual Error Adaptation model ("PEA") and the work combines various ideas from the field of cue combination, Bayesian methods and new data sets, collected in four experiments using various techniques that test very different components of the model. The central component of visual uncertainty is assessed in a first experiment. The model uses 4 other parameters to explain implicit adaptation. These parameters are: 1) a learning and 2) a retention rate, as used in popular state space models and the uncertainty (variance) of 3) predicted and 4) proprioceptive hand position. In particular, the authors observe that asymptotes for implicit learning do not saturate, as claimed before, but decrease again when rotations are very large and that this may have to do with visual uncertainty (e.g. Tsay et al., 2021, J Neurophysiol 125, 12-22). The final experiment confirms predictions of the fitted model about what happens when visual uncertainty is increased (overall decrease of adaptation). By incorporating visual uncertainty depending on retinal eccentricity, the predictions of the PEA model for very large perturbations are notably different from, and better than, the predictions of the two other models it is compared to. That is, the paper provides strong support for the idea that visual uncertainty of errors matters for implicit adaptation.

      (2.3) Weaknesses

      Although the authors don't say this, the "concave" function that shows that adaptation does not saturate for larger rotations has been shown before, including in papers cited in this manuscript.

      The first experiment, measuring visual uncertainty for several rotation sizes in error-clamped paradigms has several shortcomings, but these might not be so large as to invalidate the model or the findings in the rest of the manuscript. There are two main issues we highlight here. First, the data is not presented in units that allow comparison with vision science literature. Second, the 1 second delay between movement endpoint and disappearance of the cursor, and the presentation of the reference marker, may have led to substantial degradation of the visual memory of the cursor endpoint. That is, the experiment could be overestimating the visual uncertainty during implicit adaptation.

      The paper's third experiment relies to a large degree on reproducing patterns found in one particular paper, where the reported hand positions - as a measure of proprioceptive sense of hand position - are given and plotted relative to an ever present visual target, rather than relative to the actual hand position. That is, 1) since participants actively move to a visual target, the reported hand positions do not reflect proprioception, but mostly the remembered position of the target participants were trying to move to, and 2) if the reports are converted to a difference between the real and reported hand position (rather than the difference between the target and the report), those would be on the order of ~20° which is roughly two times larger than any previously reported proprioceptive recalibration, and an order of magnitude larger than what the authors themselves find (1-2°) and what their model predicts. Experiment 3 is perhaps not crucial to the paper, but it nicely provides support for the idea that proprioceptive recalibration can occur with error-clamped feedback.

      Perhaps the largest caveat to the study is that it assumes that people do not look at the only error feedback available to them (and can explicitly suppress learning from it). This was probably true in the experiments used in the manuscript, but unlikely to be the case in most of the cited literature. Ignoring errors and suppressing adaptation would also be a disastrous strategy to use in the real world, such that our brains may not be very good at this. So the question remains to what degree - if any - the ideas behind the model generalize to experiments without fixation control, and more importantly, to real life situations.

    5. Author response:

      The following is the authors’ response to the current reviews.

      eLife assessment

      This study presents an important finding on the influence of visual uncertainty and Bayesian cue combination on implicit motor adaptation in young healthy participants, hereby linking perception and action during implicit adaptation. The evidence supporting the claims of the authors is convincing. The normative approach of the proposed PEA model, which combines ideas from separate lines of research, including vision research and motor learning, opens avenues for future developments. This work will be of interest to researchers in sensory cue integration and motor learning.

      Thank you for the updated assessment. We are also grateful for the insightful and constructive comments from the reviewers, which have helped us improve the manuscript again. We made necessary changes following their comments (trimmed tests, new analysis results, etc) and responded to the comments in a point-by-point fashion below. We hope to publish these responses alongside the public review. Thank you again for fostering the fruitful discussion here.

      Public Reviews:

      Reviewer #1 (Public Review):

      I appreciate the normative approach of the PEA model and am eager to examine this model in the future. However, two minor issues remain:

      (1) Clarification on the PReMo Model:

      The authors state, "The PReMo model proposes that this drift comprises two phases: initial proprioceptive recalibration and subsequent visual recalibration." This description could misinterpret the intent of PReMo. According to PReMo, the time course of the reported hand position is merely a read-out of the *perceived hand position* (x_hat in your paper). Early in adaptation, the perceived hand position is biased by the visual cursor (x_hat in the direction of the cursor); towards the end, due to implicit adaptation, x_hat reduces to zero. This is the same as PEA. I recommend that the authors clarify PReMo's intent to avoid confusion.

      Note, however, the observed overshoot of 1 degree in the reported hand position. In the PReMo paper, we hypothesized that this effect is due to the recalibration of the perceived visual target location (inspired by studies showing that vision is also recalibrated by proprioception, but in the opposite direction). If the goal of implicit adaptation is to align the perceived hand position (x_hat) with the perceived target position (t_hat), then there would be an overshoot of x_hat over the actual target position.

      PEA posits a different account for the overshoot. It currently suggests that the reported hand position combines x_hat (which takes x_p as input) with x_p itself. What is the reasoning underlying the *double occurrence* of x_p?

      There seem to be three alternatives that seem more plausible (and could lead to the same overshooting): 1) increasing x_p's contribution (assuming visual uncertainty increases when the visual cursor is absent during the hand report phase), 2) decreasing sigma_p (assuming that participants pay more attention to the hand during the report phase), 3) it could be that the perceived target position undergoes recalibration in the opposite direction to proprioceptive recalibration. All these options, at least to me, seem equally plausible and testable in the future.

      To clarify the PReMo model’s account of Figure 4A, we now write:

      “The PReMo model proposes that the initial negative drift reflects a misperceived hand location, which gradually reduces to zero, and the late positive drift reflects the influence of visual calibration of the target (Tsay, Kim, Saxena, et al., 2022).”

      However, we would like to point out that the PEA model does not predict a zero (perceived hand location) even at the late phase of adaptation: it remains negative, though not as large as during initial adaptation (see Figure 4A, red line). Furthermore, we have not seen any plausible way to use a visually biased target to explain the overshoot of the judged hand location (see below when we address the three alternative hypotheses the reviewer raised).

      We don’t think the “double” use of x_p is a problem, simply because there are TWO tasks under investigation when the proprioceptive changes are measured along with adaptation. The first is the reaching adaptation task itself: moving under the influence of the clamped cursor. This task is accompanied by a covert estimation of hand location after the movement (x_hat). Given the robustness of implicit adaptation, this estimation appears mandatory and automatic. The second task is the hand localization task, during which the subject is explicitly asked to judge where the hand is. Here, the perceived hand is based on the two available cues: one is the actual hand location x_p, and the other is the influence from the just-finished reaching movement (i.e., x_hat). For Bayesian modeling from a normative perspective, sensory integration is based on the available cues to fulfill the task. For the second task of reporting the hand location, the two cues are x_p and x_hat (with a possible effect of the visual target, which is unbiased since it is defined as 0 in model simulation; thus, its presence does not induce any shift effect). x_p is used sequentially in this sense. Thus, its dual use is well justified.
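
      The two-stage use of the proprioceptive cue described above can be sketched as follows. Stage 1 forms the covert estimate from the cursor, the actual hand, and the prediction; stage 2 combines that estimate with the actual hand again to produce the report. The response does not state how the two cues are weighted in the report, so weighting the stage-1 estimate by its posterior variance is an assumption of this sketch, as are the function names, the sign convention (cursor clamped at a positive angle), and all numeric values.

```python
def stage1_estimate(x_v, x_p, s_v, s_p, s_u, x_u=0.0):
    """Covert hand estimate after a reach: inverse-variance average of three cues.

    Returns the estimate and its posterior variance.
    """
    inv = [1.0 / s_v**2, 1.0 / s_p**2, 1.0 / s_u**2]
    total = sum(inv)
    estimate = (inv[0] * x_v + inv[1] * x_p + inv[2] * x_u) / total
    return estimate, 1.0 / total


def reported_hand(x_v, x_p, s_v, s_p, s_u):
    """Explicit hand report: combine the stage-1 estimate with the hand cue again.

    The weighting rule here is assumed, not taken from the manuscript.
    """
    x_hat, var_hat = stage1_estimate(x_v, x_p, s_v, s_p, s_u)
    w_hat = (1.0 / var_hat) / (1.0 / var_hat + 1.0 / s_p**2)
    return w_hat * x_hat + (1.0 - w_hat) * x_p


if __name__ == "__main__":
    clamp, s_v, s_p, s_u = 30.0, 10.5, 10.0, 5.0  # hypothetical values
    for label, hand in (("early", 0.0), ("late", -18.0)):
        x_hat, _ = stage1_estimate(clamp, hand, s_v, s_p, s_u)
        report = reported_hand(clamp, hand, s_v, s_p, s_u)
        print(f"{label}: x_hat = {x_hat:.2f} deg, report = {report:.2f} deg")
```

      With these illustrative numbers, the stage-1 estimate stays on the cursor side of the target even late in adaptation, while the report crosses to the other side once the hand has adapted. That is the qualitative pattern the response attributes to the sequential use of x_p.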

      Our hypothesis is that the reported hand position results from a combination of x_hat from the previous movement and the current hand position x_p. However, specifically for the overshoot of the judged hand location in the late part of the adaptation (Figure 4A), the reviewer raised three alternative explanations by assuming that the PReMo model is correct. Under the PReMo model, the estimated hand location is only determined by x_hat, and x_p is not used in the hand location report phase. In addition, x_hat (with x_p used once) and a visual recalibration of the target can explain away the gradual shift from negative to positive (overshoot).

      We don’t think any of them can parsimoniously explain our findings here, and we go through these three hypotheses one by one:

      (1) increasing x_p's contribution (assuming visual uncertainty increases when the visual cursor is absent during the hand report phase)

      (2) decreasing σ_p (assuming that participants pay more attention to the hand during the report phase)

      The first two alternative explanations basically assume that x_p has a larger contribution (weighting, in Bayesian terms) in the hand location report phase than in the adaptation movement phase, whether due to an increase in visual uncertainty (alternative explanation 1) or a reduction in proprioceptive uncertainty (alternative explanation 2). Thus, we assume that the reviewer suggests that a larger weight for x_p can explain why the perceived hand location changes gradually from negative to positive. However, per the PReMo model, a larger weight for x_p will only affect x_hat, which is already assumed to change from negative to zero. More weight in x_hat in the hand report phase (compared to the adaptation movement phase) would not explain the change of the reported hand location from negative to positive. This is because no matter how much weight x_p has, the PReMo model assumes a saturation for the influence of x_p on x_hat. Thus x_hat would not exceed zero in late adaptation. Then, the PReMo model would rely on the so-called visual shift of the target to explain the overshoot. This leads us to the third alternative the reviewer raised:

      (3) it could be that the perceived target position undergoes recalibration in the opposite direction to proprioceptive recalibration.

      The PReMo model originally assumed that the perceived target location was biased in order to explain away the positive overshoot of the reported hand location. We assume that the reviewer suggests that the perceived target position, which is shifted to the positive direction, also “biases” the perceived hand position. We also assume that the reviewer suggests that the perceived hand location after a clamp trial (x_hat) is zero, and somehow the shifted perceived target position “biases” the reported hand location after a clamp trial. Unfortunately, we did not see any mathematical formulation of this biasing effect in the original paper (Tsay, Kim, Haith, et al., 2022). We are not able to come up with any formulation of this hypothesized biasing effect based on Bayesian cue integration principles. Target and hand are two separate perceived items; how one relates to another needs justification from a normative perspective when discussing Bayesian models. Note this is not a problem for our PEA model, in which both cues used are about hand localization: one is x_hat and the other is x_p.

      We believe that mathematically formulating the biasing effect (Figure 4A) is non-trivial since the reported hand location changes continuously from negative to positive. Thus, quantitative model predictions, like the ones our PEA model presents here, are needed.

      To rigorously test the possible effect of visual recalibration of the target, there are two things to do: 1) use a psychometric method to measure the biased perception of the target, and 2) re-do the Tsay et al. 2020 experiment without the target. For 2), compared to the case with the target, the PEA model would predict a larger overshoot, while the PReMo model would predict a smaller overshoot or even zero overshoot. This can be left for future studies.

      (2) Effect of Visual Uncertainty on Error Size:

      I appreciate the authors' response about methodological differences between the cursor cloud used in previous studies and the Gaussian blob used in the current study. However, it is still not clear to me how the authors reconcile previous studies showing that visual uncertainty reduced implicit adaptation for small but not large errors (Tsay et al, 2021; Makino, et al 2023) with the current findings, where visual uncertainty reduced implicit adaptation for large but not small errors.

      Could the authors connect the dots here: I could see that the cursor cloud increases potential overlap with the visual target when the visual error is small, resulting in intrinsic reward-like mechanisms (Kim et al, 2019), which could potentially explain attenuated implicit adaptation for small visual errors. However, why would implicit adaptation in response to large visual errors remain unaffected by the cursor cloud? Note that we did verify that sigma_v is increased in (Tsay et al. 2021), so it is unlikely due to the cloud simply failing as a manipulation of visual uncertainty.

      In addition, we also reasoned that testing individuals with low vision could offer a different test of visual uncertainty (Tsay et al, 2023). The advantage here is that both controls and patients with low vision are provided with the same visual input: a single cursor. Our findings suggest that uncertainty due to low vision also reduces implicit adaptation in response to small but not large errors, contrary to the findings in the current paper. Missing in the manuscript is a discussion of why the authors' current findings contradict previous results.

      To connect the dots between the two previous studies (Tsay et al., 2021, 2023; note that Makino et al., 2023 is not part of this discussion, since it investigated the weights of multiple cursors rather than the visual uncertainty associated with a cursor cloud):

      First, we want to re-emphasize that using the cursor cloud to manipulate visual uncertainty brings some confounds, making it not ideal for studying visuomotor adaptation. For example, in the error clamp paradigm, the error is defined as angular deviation. The cursor cloud consists of multiple cursors spanning a range of angles, which affects both the sensory uncertainty (the intended outcome) and the sensory estimate of angles (the error estimate, the undesired outcome). In Bayesian terms, the cursor cloud aims to modulate the sigma of a distribution (σ_v in our model), but it additionally affects the mean of the distribution (µ). This unnecessary confound is neatly avoided by using cursor blurring, which is still a cursor with its center (µ) unchanged from a single cursor. Furthermore, as correctly pointed out in the original paper by Tsay et al., 2020, the cursor cloud often overlaps with the visual target; this "target hit" would affect adaptation, possibly via a reward learning mechanism (Kim et al., 2019). This is a second confound that accompanies the cursor cloud. Yes, the cursor cloud was verified as associated with high visual uncertainty (Tsay et al., 2021), but this verification was done with a psychophysics method against a clean background, not in the context of a hand reaching toward a target, which is what is needed. Thus, despite the cursor cloud having a sizeable visual uncertainty, our criticisms of it still hold when it is used in error-clamp adaptation.

      Second, bearing these confounds of the cursor cloud in mind, we postulate one important factor that has not been considered in any models thus far and that might underlie the lack of difference between the single-cursor clamp and the cloud-cursor clamp when the clamp size is large: the cursor cloud might be harder to ignore than a single cursor. For Bayesian sensory integration, the naive model is to consider the relative reliability of cues only. Yes, the cloud is more uncertain in terms of indicating the movement direction than a single cursor. However, given its large spread, it is probably harder to ignore during error-clamp movements. Note that ignoring the clamped cursor is the task instruction, but the large scatter of the cursor cloud is more salient and thus plausibly harder to ignore. This might increase the weighting of the visual cue despite its higher visual uncertainty. This extra confound is arguably minimized by using the blurred cursor as in our Exp4, since the blurred cursor did not increase the visual angle much (Figure 5D; blurred vs single cursor: 3.4 mm vs 2.5 mm in radius, 3.90° vs 2.87° in spread). In contrast, the visual angle of the dot cloud is at least an order of magnitude larger (cursor cloud vs. single cursor: at least 25° vs. 2.15° in spread, given a 10° standard deviation of random sampling).
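
      As a quick worked check of the radius-to-spread conversion quoted above, the sketch below computes the full angle subtended by a disc of a given radius at a given distance from the start position. The 100 mm distance is an assumption, not a value stated in this response; it is used only because it reproduces the quoted spreads for the single and blurred cursors to within about 0.01°. The function name is likewise hypothetical.

```python
import math


def angular_spread_deg(radius_mm, distance_mm=100.0):
    """Full angle (deg) subtended by a disc of radius_mm centered distance_mm away.

    distance_mm = 100 is an assumed reach distance, not taken from the text.
    """
    return 2.0 * math.degrees(math.atan(radius_mm / distance_mm))


if __name__ == "__main__":
    for label, radius in (("single cursor", 2.5), ("blurred cursor", 3.4)):
        print(f"{label}: radius {radius} mm -> {angular_spread_deg(radius):.2f} deg")
```

      Under this assumption, the blur adds roughly one degree of spread, which is small compared with the spread of at least 25° quoted for the cursor cloud.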

      Third, for the low-vision study (Tsay et al., 2023), the patients indeed show reduced implicit adaptation for a 3° clamp (consistent with our PEA model) but intact adaptation for a 30° clamp (not consistent). Though this pattern appears similar to what happens for people with normal vision whose visual uncertainty is upregulated by a cursor cloud (Tsay et al., 2021), we are not completely convinced that the same underlying mechanism governs these two datasets. Low-vision patients indeed have higher visual uncertainty about color, brightness, and object location, but their visual uncertainty about visual motion is still unknown. Given the differences in impairment among people with low vision (e.g., peripheral or central vision affected) and the different roles of peripheral and central vision in movement planning and control (Sivak & Mackenzie, 1992), the overall effect of visual uncertainty in people with low vision is unclear. The direction of cursor movement that matters for visuomotor rotation here is likely related to visual motion perception. Unfortunately, the original study did not measure this uncertainty in low-vision patients. We believe our Exp1 offers a valid method for this purpose in future studies. More importantly, we should not expect low-vision patients to integrate visual cues in the same way as people with normal vision, given their long-term adaptation to their vision difficulties. Thus, we are conservative about interpreting the seemingly similar findings across the two studies (Tsay et al., 2021, 2023) as revealing the same mechanism.

      A side note: these two previous studies proposed a so-called mis-localization hypothesis, i.e., the cursor cloud was mislocalized for small clamp sizes (given its overlap with the target) but not for large clamp sizes. They suggested that the lack of an uncertainty effect at small clamp sizes is due to mislocalization, while the lack of an uncertainty effect at large clamp sizes arises because implicit adaptation is not sensitive to uncertainty at large angles. Thus, these two studies admit that the cursor cloud not only upregulates uncertainty but also generates an unwanted effect of so-called “mis-localization” (overlapping with the target). Interestingly, their hypothesis about reduced sensitivity to visual uncertainty for large clamps is not supported by a model or theory but is merely a rewording of the experimental results.

      In sum, our current study cannot offer an easy answer to "connect the dots" in the aforementioned two studies, owing to methodological issues and the special nature of the population. However, for resolving conflicting findings, our study suggests several solutions: a psychometric test to quantify visual uncertainty for cursor motion (Exp1), a better uncertainty-manipulation method that avoids a couple of confounds (Exp4, blurred cursor), and a falsifiable model. Future work can resolve the differences between studies based on the new insights from the current study.

      Reviewer #2 (Public Review):

      Summary:

      The authors present the Perceptual Error Adaptation (PEA) model, a computational approach offering a unified explanation for behavioral results that are inconsistent with standard state-space models. Beginning with the conventional state-space framework, the paper introduces two innovative concepts. Firstly, errors are calculated based on the perceived hand position, determined through Bayesian integration of visual, proprioceptive, and predictive cues. Secondly, the model accounts for the eccentricity of vision, proposing that the uncertainty of cursor position increases with distance from the fixation point. This elegantly simple model, with minimal free parameters, effectively explains the observed plateau in motor adaptation under the implicit motor adaptation paradigm using the error-clamp method. Furthermore, the authors experimentally manipulate visual cursor uncertainty, a method established in visuomotor studies, to provide causal evidence. Their results show that the adaptation rate correlates with perturbation sizes and visual noise, uniquely explained by the PEA model and not by previous models. Therefore, the study convincingly demonstrates that implicit motor adaptation is a process of Bayesian cue integration.
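
      The two concepts in this summary can be sketched in a few lines. The code below is only an illustration, not the authors' implementation: it assumes Gaussian cues combined by inverse-variance weighting, a visual standard deviation that grows linearly with the cursor's angular distance from the fixated target, and hypothetical parameter values (the function names, `slope`, and `intercept` are made up for this sketch; the two fixed SDs are loosely based on the fitted values of about 5° and 11° quoted in the authors' reply).

```python
import numpy as np

# Hypothetical values, loosely based on the fitted SDs quoted in the reply.
SIGMA_U = 5.0    # SD of the predictive (planned-direction) cue, degrees
SIGMA_P = 11.0   # SD of the proprioceptive cue, degrees


def sigma_v(eccentricity_deg, slope=0.3, intercept=1.8):
    """Assumed linear growth of visual SD with distance from fixation."""
    return slope * np.abs(eccentricity_deg) + intercept


def perceived_hand(x_u, x_p, x_v):
    """Precision-weighted average of the three cues (all in degrees).

    x_u: predicted hand direction, x_p: proprioceptive cue, x_v: cursor.
    The target (and fixation) is at 0 deg, so |x_v| is the eccentricity.
    """
    precisions = np.array(
        [1 / SIGMA_U**2, 1 / SIGMA_P**2, 1 / sigma_v(x_v) ** 2]
    )
    weights = precisions / precisions.sum()
    estimate = float(weights @ np.array([x_u, x_p, x_v]))
    return estimate, weights


for clamp in (4.0, 16.0, 64.0):
    estimate, weights = perceived_hand(x_u=0.0, x_p=0.0, x_v=clamp)
    print(f"clamp {clamp:5.1f} deg: visual weight {weights[2]:.2f}, "
          f"perceived hand {estimate:.2f} deg")
```

      In this sketch the visual weight shrinks as the clamped cursor moves away from fixation, so a far cursor pulls the perceived hand position less, in relative terms, than a near one.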

      Strengths:

      In the past decade, numerous perplexing results in visuomotor rotation tasks have questioned their underlying mechanisms. Prior models have individually addressed aspects like aiming strategies, motor adaptation plateaus, and sensory recalibration effects. However, a unified model encapsulating these phenomena with a simple computational principle was lacking. This paper addresses this gap with a robust Bayesian integration-based model. Its strength lies in two fundamental assumptions: motor adaptation's influence by visual eccentricity, a well-established vision science concept, and sensory estimation through Bayesian integration. By merging these well-founded principles, the authors elucidate previously incongruent and diverse results with an error-based update model. The incorporation of cursor feedback noise manipulation provides causal evidence for their model. The use of eye-tracking in their experimental design, and the analysis of adaptation studies based on estimated eccentricity, are particularly elegant. This paper makes a significant contribution to visuomotor learning research.

      The authors discussed in the revised version that the proposed model can capture the general implicit motor learning process in addition to the visuomotor rotation task. In the discussion, they emphasize two main principles: the automatic tracking of effector position and the combination of movement cues using Bayesian integration. These principles are suggested as key to understanding and modeling various motor adaptations and skill learning. The proposed model could potentially become a basis for creating new computational models for skill acquisition, especially where current models fall short.

      Weaknesses:

      The proposed model is described as elegant. In this paper, the authors test the model within a limited example condition, demonstrating its relevance to the sensorimotor adaptation mechanisms of the human brain. However, the scope of the model's applicability remains unclear. It has shown the capacity to explain prior data, thereby surpassing previous models that rely on elementary mathematics. To solidify its credibility in the field, the authors must gather more supporting evidence.

      Indeed, our model here is based on one particular experimental paradigm, i.e., the error-clamp adaptation. We used it simply because 1) this paradigm is one rare example that implicit motor learning can be isolated in a clean way, and 2) there are a few conflicting findings in the literature for us to explain away by using a unified model.

      For our model’s broad impact, we believe that as long as people need to locate their effectors during motor learning, the general principle laid out here will be applicable. In other words, repetitive movements with a Bayesian cue combination of movement-related cues can underlie the implicit process of various motor learning. To showcase its broad impact, in upcoming studies, we will extend this model to other motor learning paradigms, starting from motor adaptation paradigms that involve both explicit and implicit processes.

      Reviewer #3 (Public Review):

      (2.1) Summary

      In this paper, the authors model motor adaptation as a Bayesian process that combines visual uncertainty about the error feedback, uncertainty about proprioceptive sense of hand position, and uncertainty of predicted (=planned) hand movement with a learning and retention rate as used in state space models. The model is built with results from several experiments presented in the paper and is compared with the PReMo model (Tsay, Kim et al., 2022) as well as a cue combination model (Wei & Körding, 2009). The model and experiments demonstrate the role of visual uncertainty about error feedback in implicit adaptation.

      In the introduction, the authors notice that implicit adaptation (as measured in error-clamp based paradigms) does not saturate at larger perturbations, but decreases again (e.g. Morehead et al., 2017 shows no adaptation at 135° and 175° perturbations). They hypothesized that visual uncertainty about cursor position increases with larger perturbations since the cursor is further from the fixated target. This could decrease the importance assigned to visual feedback, which could explain lower asymptotes.

      The authors characterize visual uncertainty for 3 rotation sizes in a first experiment, and while this experiment could be improved, it is probably sufficient for the current purposes. Then the authors present a second experiment where adaptation to 7 clamped errors is tested in different groups of participants. The model's visual uncertainty is set using a linear fit to the results from experiment 1, and the remaining 4 parameters are then fit to this second data set. The 4 parameters are 1) proprioceptive uncertainty, 2) uncertainty about the predicted hand position, 3) a learning rate and 4) a retention rate. The authors' Perceptual Error Adaptation model ("PEA") predicts asymptotic levels of implicit adaptation much better than both the PReMo model (Tsay, Kim et al., 2022), which predicts saturated asymptotes, and a causal inference model (Wei & Körding, 2007), which predicts no adaptation for larger rotations. In a third experiment, the authors test their model's predictions about proprioceptive recalibration, but unfortunately compare their data with an unsuitable other data set (Tsay et al. 2020, instead of Tsay et al. 2021). Finally, the authors conduct a fourth experiment where they put their model to the test. They measure implicit adaptation with increased visual uncertainty, by adding blur to the cursor, and the results are again better in line with their model (predicting overall lower adaptation), than with the PReMo model (predicting equal saturation but at larger perturbations) or a causal inference model (predicting equal peak adaptation, but shifted to larger rotations). In particular the model fits for experiment 2 and the results from experiment 4 show that the core idea of the model has merit: increased visual uncertainty about errors dampens implicit adaptation.
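
      How the four fitted parameters and the linear visual-uncertainty function could combine to give non-saturating asymptotes can be sketched as follows. This is a hedged illustration, not the published model code: the update rule (a state-space step driven by the perceived hand position), the sign convention, and every numerical value are assumptions chosen for the sketch.

```python
import numpy as np

# All numbers below are illustrative assumptions, not fitted values.
SIGMA_U, SIGMA_P = 5.0, 11.0   # SDs of predictive and proprioceptive cues (deg)
A, B = 0.97, 0.2               # retention rate and learning rate
SLOPE, INTERCEPT = 0.3, 1.8    # assumed linear map from clamp size to visual SD


def cue_weights(clamp_deg):
    """Inverse-variance weights (w_u, w_p, w_v) for a given clamp size."""
    sigma_v = SLOPE * abs(clamp_deg) + INTERCEPT
    precision = np.array([SIGMA_U**-2, SIGMA_P**-2, sigma_v**-2])
    return precision / precision.sum()


def simulate(clamp_deg, n_trials=400):
    """Hand angle over trials; target at 0 deg, cursor clamped at clamp_deg."""
    _, w_p, w_v = cue_weights(clamp_deg)
    hand = np.zeros(n_trials)
    for n in range(n_trials - 1):
        # The predictive cue sits at the target (0 deg), so it adds nothing
        # to the weighted sum that gives the perceived hand position.
        perceived = w_p * hand[n] + w_v * clamp_deg
        # Assumed state-space step: retain, then correct the perceived error.
        hand[n + 1] = A * hand[n] - B * perceived
    return hand


def asymptote(clamp_deg):
    """Fixed point of the assumed update rule."""
    _, w_p, w_v = cue_weights(clamp_deg)
    return -B * w_v * clamp_deg / (1 - A + B * w_p)


for clamp in (2, 8, 16, 32, 64, 95):
    print(clamp, round(-simulate(clamp)[-1], 2), round(-asymptote(clamp), 2))
```

      With these made-up numbers the size of the asymptote first grows and then shrinks as the clamp gets larger, because the visual weight falls faster than the clamp angle rises. The same rule with a fixed visual SD gives an asymptote that keeps increasing with clamp size, which is the contrast the review describes.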

      (2.2) Strengths

      In this study the authors propose a Perceptual Error Adaptation model ("PEA") and the work combines various ideas from the field of cue combination, Bayesian methods and new data sets, collected in four experiments using various techniques that test very different components of the model. The central component of visual uncertainty is assessed in a first experiment. The model uses 4 other parameters to explain implicit adaptation. These parameters are: 1) a learning and 2) a retention rate, as used in popular state space models and the uncertainty (variance) of 3) predicted and 4) proprioceptive hand position. In particular, the authors observe that asymptotes for implicit learning do not saturate, as claimed before, but decrease again when rotations are very large and that this may have to do with visual uncertainty (e.g. Tsay et al., 2021, J Neurophysiol 125, 12-22). The final experiment confirms predictions of the fitted model about what happens when visual uncertainty is increased (overall decrease of adaptation). By incorporating visual uncertainty depending on retinal eccentricity, the predictions of the PEA model for very large perturbations are notably different from, and better than, the predictions of the two other models it is compared to. That is, the paper provides strong support for the idea that visual uncertainty of errors matters for implicit adaptation.

      (2.3) Weaknesses

      Although the authors don't say this, the "concave" function that shows that adaptation does not saturate for larger rotations has been shown before, including in papers cited in this manuscript.

      For a proper citation of the “concave” adaptation function: we assume the reviewer is referring to the study by Morehead et al. (2017), which tested large clamp sizes up to 135° and 175°. Unsurprisingly, the 135° and 175° conditions lead to nearly zero adaptation, possibly due to the trivial fact that people cannot even see the moving cursor. We have cited this seminal study from the very beginning. All other error-clamp studies with a block design emphasized an invariant or saturated implicit adaptation with large rotations (e.g., Kim et al., 2019).

      The first experiment, measuring visual uncertainty for several rotation sizes in error-clamped paradigms has several shortcomings, but these might not be so large as to invalidate the model or the findings in the rest of the manuscript. There are two main issues we highlight here. First, the data is not presented in units that allow comparison with vision science literature. Second, the 1 second delay between movement endpoint and disappearance of the cursor, and the presentation of the reference marker, may have led to substantial degradation of the visual memory of the cursor endpoint. That is, the experiment could be overestimating the visual uncertainty during implicit adaptation.

      For the issues related to visual uncertainty measurement in Exp1:

      First, our visual uncertainty concerns cursor motion direction in the display plane, and the measurement in Exp1 has never been done before. Thus, we do not think our data are directly comparable to findings in vision science on foveal versus peripheral vision. We cited the work of Klein and others (Klein & Levi, 1987; Levi et al., 1987) in vision science since their studies showed that deviation from the fixation is associated with an increase in visual uncertainty. Their studies thus inspired us to conduct Exp1 to probe how the visual uncertainty of concern to us (specifically for visual motion direction) changes with increasing deviation from the fixation. Any model and its parameters should be specifically tailored to the task or context it tries to emulate. In our case, motion direction in a center-out reaching setting is the modeled context, and all the relevant model parameters should be specified in movement angles. This is particularly important since we need to estimate parameters from one experiment to predict behaviors in another experiment.

      Second, based on previous vision studies, the 1 s delay of the reference cursor has minimal impact on the estimate of visual uncertainty. Our Exp1 used a visual paradigm similar to that of White et al. (1992), who showed that delay does not lead to an increase in visual uncertainty over a broad range of values (from 0.2 s to >1 s; see their Figures 5-6).

      These two problems have been addressed in the revised manuscript, with proper citations listed.

      The paper's third experiment relies to a large degree on reproducing patterns found in one particular paper, where the reported hand positions - as a measure of proprioceptive sense of hand position - are given and plotted relative to an ever present visual target, rather than relative to the actual hand position. That is, 1) since participants actively move to a visual target, the reported hand positions do not reflect proprioception, but mostly the remembered position of the target participants were trying to move to, and 2) if the reports are converted to a difference between the real and reported hand position (rather than the difference between the target and the report), those would be on the order of ~20° which is roughly two times larger than any previously reported proprioceptive recalibration, and an order of magnitude larger than what the authors themselves find (1-2°) and what their model predicts. Experiment 3 is perhaps not crucial to the paper, but it nicely provides support for the idea that proprioceptive recalibration can occur with error-clamped feedback.

      Reviewer 3 thinks Tsay 2020 dataset is not appropriate for our theorization, but we respectfully disagree. For the three points raised here, we would like to elaborate:

      (1) As we addressed in the previous response, the reported hand location in Figure 4A (Tsay et al., 2020) is not from a test of proprioceptive recalibration as conventionally defined. In the revision, we explicitly state that this dataset is not about proprioceptive recalibration and have also deleted text that might mislead readers into thinking so (see Results section). Instead, proprioceptive recalibration is measured by passive movement, as in our Exp3 (Figure 4E). For error-clamp adaptation here, "the remembered position of the target" is the target. Clearly, the participants did not report the target position, which is ever-present. Instead, their reported hand location shows an interestingly continuous change with ongoing adaptation.

      (2) Since the Tsay 2020 dataset is not a so-called proprioceptive recalibration, we need not take the difference between the reported location and the actual hand location. Indeed, the difference would be ~20 degrees, but comparing it to the previously reported proprioceptive recalibration is like comparing apples to oranges. In fact, throughout the paper, we refer to the results in Fig 4A as “reported hand location”, not proprioceptive recalibration. The target direction is defined as zero degrees; thus, its presence will not bias the reported hand location in the Bayesian cue combination (as this visual cue has a mean value of 0). Using the target as the reference also simplifies our modeling.

      (3) Exp3 is crucial for our study since it shows our model and its simple Bayesian cue combination principle are applicable not only to implicit adaptation but also to proprioceptive measures during adaptation. Furthermore, it reproduced the so-called proprioceptive recalibration and explained it away with the same Bayesian cue combination as the adaptation. We noticed that this field has accumulated an array of findings on proprioceptive changes induced by visuomotor adaptation. However, currently, there is a lack of a computational model to quantitatively explain them. Our study at least made an initial endeavor to model these changes.

      Perhaps the largest caveat to the study is that it assumes that people do not look at the only error feedback available to them (and can explicitly suppress learning from it). This was probably true in the experiments used in the manuscript, but unlikely to be the case in most of the cited literature. Ignoring errors and suppressing adaptation would also be a disastrous strategy to use in the real world, such that our brains may not be very good at this. So the question remains to what degree - if any - the ideas behind the model generalize to experiments without fixation control, and more importantly, to real life situations.

      The largest caveat raised by the reviewer appears to be directed at the error-clamp paradigm in general, not only at our particular study. In essence, this paradigm indeed requires participants to ignore the clamped error; thus, its induced adaptive response can be attributed to implicit adaptation. The original paper that proposed this paradigm (Morehead et al., 2017) has been cited 220 times (according to Google Scholar at the time of this writing, 06/2024), indicating that the field has viewed this paradigm favorably.

      Furthermore, we agree that this kind of instruction and feedback (invariant clamp) differs from daily life experience, but this does not prevent us from gaining theoretical insights by studying human behavior under this kind of "artificial" task setting. Consider saccadic adaptation (Deubel, 1987; Kojima et al., 2004): the target jumps while the eye moves towards it, and this somewhat artificial manipulation again makes people adapt implicitly, even though the adaptation itself would be a "disastrous" strategy in real-life situations. Nevertheless, scientists have gained an enormous understanding of motor adaptation from this adaptation, which seems counterproductive in real life. Also, consider perceptual learning of task-irrelevant stimuli (Seitz & Watanabe, 2005, 2009): when participants are required to learn to discriminate one type of visual stimulus, the background shows another type of stimulus, which people gradually learn even though they do not even notice its presence. This "implicit" learning can be detrimental in real life, too, but the paradigm itself has advanced our understanding of the inner workings of the cognitive system.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      L101: There is a typo: (Tsay et al., 2020), 2020) should be corrected to (Tsay et al., 2020).

      Thanks for pointing it out, we corrected this typo.

      L224-228: It would be beneficial to evaluate the validity of the estimated sigma_u and sigma_p based on previous reports.

      We can roughly estimate σu by evaluating the variability of reaching angles during the baseline phase when no perturbation is applied. The standard deviation of the reaching angle in Exp 2 is 5.128° ± 0.190°, which is close to the σu estimated by the model (5.048°). We also used a separate perceptual experiment to test the proprioceptive uncertainty (n = 13; see Figure S6); σp from this experiment is 9.737° ± 5.598°, also close to the σp extracted by the model (11.119°). We added these new analysis results to the final version of the paper.

      L289-298: I found it difficult to understand the update equations of the proprioceptive calibration based on the PEA model. Providing references to the equations or better explanations would be helpful.

      We expanded the process of proprioceptive calibration in Supplementary Text 1 with step-by-step equations and more explanations. 

      Reviewer #3 (Recommendations For The Authors):

      Suggestions (or clarification of previous suggestions) for revisions

      The authors persist in using the Tsay et al 2020 paper despite its many drawbacks, which the authors attempt to address in their reply. But the main drawback is that the results in the 2020 paper are NOT relative to the unseen hand but to the visual target the participants were supposed to move their hand to. If the results were converted so as to be relative to the unseen hand, the localization biases would be over 20 deg in magnitude.

      The PEA simulations are plotted relative to the unseen hand which makes sense. If the authors want to persist using the Tsay 2020 dataset despite any issues, they at least need to make sure that the simulations are mimicking the same change. That is, the data from Tsay 2020 needs to be converted to the same variable used in the current paper.

      If the main objection to using the Tsay 2021 dataset is that the design would lead to forgetting: we found that while active localization (or any intervening active movements like no-cursor reaches) does lead to some interference or forgetting (a small reduction in overall magnitude of adaptation), this is not the case for passive localization; see Ruttle et al, 2021 (data on OSF). This was also just a suggestion; there may of course be other, more suitable data sets.

      As stated above, changing the reference system is not necessary, nor does it affect our results. Tsay et al 2020 dataset is unique since it shows the gradual change of reported hand location along with error-clamp adaptation. The forgetting (or reduction in proprioceptive bias), even if it exists, would not affect the fitting quality of our model for the Tsay 2020 dataset: if we assume that forgetting is invariant over the adaptation process, the forgetting would only reduce the proprioceptive bias uniformly across trials. This can be accounted for by a smaller weight on . The critical fact is that the model can explain the gradual drift of the proprioceptive judgment of the hand location.

      By the way, Ruttle et al.'s 2021 dataset is not for error-clamp adaptation, and thus we will leave it to test our model extension in the future (after incorporating an explicit process in the model).

      References

      Deubel, H. (1987). Adaptivity of gain and direction in oblique saccades. Eye Movements from Physiology to Cognition. https://www.sciencedirect.com/science/article/pii/B9780444701138500308

      Kim, H. E., Parvin, D. E., & Ivry, R. B. (2019). The influence of task outcome on implicit motor learning. ELife, 8. https://doi.org/10.7554/eLife.39882

      Klein, S. A., & Levi, D. M. (1987). Position sense of the peripheral retina. JOSA A, 4(8), 1543–1553.

      Kojima, Y., Iwamoto, Y., & Yoshida, K. (2004). Memory of learning facilitates saccadic adaptation in the monkey. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 24(34), 7531–7539.

      Levi, D. M., Klein, S. A., & Yap, Y. L. (1987). Positional uncertainty in peripheral and amblyopic vision. Vision Research, 27(4), 581–597.

      Morehead, J. R., Taylor, J. A., Parvin, D. E., & Ivry, R. B. (2017). Characteristics of implicit sensorimotor adaptation revealed by task-irrelevant clamped feedback. Journal of Cognitive Neuroscience, 29(6), 1061–1074.

      Seitz, A., & Watanabe, T. (2005). A unified model for perceptual learning. Trends in Cognitive Sciences, 9(7), 329–334.

      Seitz, A. R., & Watanabe, T. (2009). The phenomenon of task-irrelevant perceptual learning. Vision Research, 49(21), 2604–2610.

      Sivak, B., & Mackenzie, C. L. (1992). Chapter 10 The Contributions of Peripheral Vision and Central Vision to Prehension. In L. Proteau & D. Elliott (Eds.), Advances in Psychology (Vol. 85, pp. 233–259). North-Holland.

      Tsay, J. S., Avraham, G., Kim, H. E., Parvin, D. E., Wang, Z., & Ivry, R. B. (2021). The effect of visual uncertainty on implicit motor adaptation. Journal of Neurophysiology, 125(1), 12–22.

      Tsay, J. S., Kim, H. E., Saxena, A., Parvin, D. E., Verstynen, T., & Ivry, R. B. (2022). Dissociable use-dependent processes for volitional goal-directed reaching. Proceedings. Biological Sciences / The Royal Society, 289(1973), 20220415.

      Tsay, J. S., Kim, H., Haith, A. M., & Ivry, R. B. (2022). Understanding implicit sensorimotor adaptation as a process of proprioceptive re-alignment. ELife, 11, e76639.

      Tsay, J. S., Parvin, D. E., & Ivry, R. B. (2020). Continuous reports of sensed hand position during sensorimotor adaptation. Journal of Neurophysiology, 124(4), 1122–1130.

      Tsay, J. S., Tan, S., Chu, M. A., Ivry, R. B., & Cooper, E. A. (2023). Low Vision Impairs Implicit Sensorimotor Adaptation in Response to Small Errors, But Not Large Errors. Journal of Cognitive Neuroscience, 35(4), 736–748.

      White, J. M., Levi, D. M., & Aitsebaomo, A. P. (1992). Spatial localization without visual references. Vision Research, 32(3), 513–526.

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study presents a valuable finding on the influence of visual uncertainty and Bayesian cue combination on implicit motor adaptation in young healthy participants. The evidence supporting the claims of the authors is solid, although a better discussion of the link between the model variables and the outcomes of related behavioral experiments would strengthen the conclusions. The work will be of interest to researchers in sensory cue integration and motor learning.

      Public Reviews:

      Reviewer #1 (Public Review):

      This valuable study demonstrates a novel mechanism by which implicit motor adaptation saturates for large visual errors in a principled normative Bayesian manner. Additionally, the study revealed two notable empirical findings: visual uncertainty increases for larger visual errors in the periphery, and proprioceptive shifts/implicit motor adaptation are non-monotonic, rather than ramp-like. This study is highly relevant for researchers in sensory cue integration and motor learning. However, I find some areas where statistical quantification is incomplete, and the contextualization of previous studies to be puzzling.

      Thank you for your feedback and the positive highlights of our study. We appreciate your insights and will address the concerns in our revisions.

      Issue #1: Contextualization of past studies.

      While I agree that previous studies have focused on how sensory errors drive motor adaptation (e.g., Burge et al., 2008; Wei and Kording, 2009), I don't think the PReMo model was contextualized properly. Indeed, while PReMo should have adopted clearer language - given that proprioception (sensory) and kinaesthesia (perception) have been used interchangeably, something we now make clear in our new study (Tsay, Chandy, et al. 2023) - PReMo's central contribution is that a perceptual error drives implicit adaptation (see Abstract): the mismatch between the felt (perceived) and desired hand position. The current paper overlooks this contribution. I encourage the authors to contextualize PReMo's contribution more clearly throughout. Not mentioned in the current study, for example, PReMo accounts for the continuous changes in perceived hand position in Figure 4 (Figure 7 in the PReMo study).

      There is no doubt that the current study provides important additional constraints on what determines perceived hand position: Firstly, it offers a normative Bayesian perspective in determining perceived hand position. PReMo suggests that perceived hand position is determined by integrating motor predictions with proprioception, then adding a proprioceptive shift; PEA formulates this as the optimal integration of these three inputs. Secondly, PReMo assumed visual uncertainty to remain constant for different visual errors; PEA suggests that visual uncertainty ought to increase (but see Issue #2).

      Thank you for the comments and suggestions. We have now incorporated the citation for (Tsay et al., 2024), to acknowledge their clarification on the terms of perceptual error. We also agree that our model differs in two fundamental ways. One is to ditch the concept of proprioceptive shift and its contribution to the perceived hand location; instead, we resort to a “one-shot” integration of three types of cues with Bayesian rules. This is a more elegant and probably more ecological way of processing hand location per Occam's Razor. The second essential change is to incorporate the dependency of visual uncertainty on perturbation size into the model, as opposed to resorting to a ramp function of proprioceptive changes relative to perturbation size. The ramp function is not well grounded in perception studies. Yes, we acknowledged that PReMo is the first to recognize the importance of perceptual error, but highlighted the model differences in our Discussion.

      We also think the PReMo model has the potential to explain Fig 4A. But the Tsay et al., 2022 paper assumes that “a generic shift in visual space” explains the gradual proprioceptive changes from negative to positive (see page 17 in Tsay et al., 2022). We do not think that invoking this visual mechanism is necessary to explain Fig 4A; instead, the proprioceptive change is a natural result of hand deviations during implicit adaptation. As the hand moves away from the target (in the positive direction) during adaptation, the estimated hand location goes along with it. We believe this is the correct way of explaining the Fig 4A results. When we experimented with the PReMo model, we found it hard to use a visual shift to explain this part of the data without additional assumptions (at least not with the ones published in Tsay et al., 2022). Furthermore, our PEA model also parsimoniously explains the proprioceptive shift observed in a completely different setting, i.e., the proprioceptive changes measured by the passive method as a function of perturbation size in Exp 3.

      We expanded the discussion about the comparison between the two models, especially about their different views for explaining Fig 4A.

      Issue #2: Failed replication of previous results on the effect of visual uncertainty.

      (2a) A key finding of this paper is that visual uncertainty linearly increases in the periphery; a constraint crucial for explaining the non-monotonicity in implicit adaptation. One notable methodological deviation from previous studies is the requirement to fixate on the target: Notably, in the current experiments, participants were asked to fixate on the target, a constraint not imposed in previous studies. In a free-viewing environment, visual uncertainty may not attenuate as fast, and hence, implicit adaptation does not attenuate as quickly as that revealed in the current design with larger visual errors. Seems like this current fixation design, while important, needs to be properly contextualized considering how it may not represent most implicit adaptation experiments.

      First, we don’t think there is any previous study that examined visual uncertainty as a function of perturbation size. Thus, we do not have a replication problem here. Secondly, our data indicate that even without asking people to fixate on the target, people still predominantly fixate on the target during error-clamp adaptation (when they are “free” viewing). For our Exp 1, the fixation on the straight line between the starting position and the target is 86%-95% (as shown in Figure S1 now, also see below). We also collected eye-tracking data in Exp 4, which is a typical error-clamp experiment. More than 95% of fixations fall within +/- 50 pixels around the center of the screen, even slightly higher than in Exp 1. This is understandable: the typical error-clamp adaptation requires people to ignore the cursor and move the hand towards the target. To minimize the interference of the concurrently moving cursor, people depend on the fixation on the target, the sole task-relevant visual marker in the workspace, to achieve the task goal.

      In sum, we did not require participants to fixate on the target in order to manufacture the linear dependency of visual uncertainty; we required it to mimic the gaze pattern in typical error-clamp learning, which was revealed in our pilot experiment. The visual uncertainty effect is sound, and our study is the first to clearly demonstrate it.

      Author response image 1.

      On a side note (but an important one), the high percentage of fixation on the aiming target is also true for conventional visuomotor rotation, which involves strategic re-aiming (shown in Bromberg et al., 2019; de Brouwer et al., 2018; we also have an upcoming paper showing this). This is one reason that our new theory would also be applicable to other types of motor adaptation.

      (2b) Moreover, the current results - visual uncertainty attenuates implicit adaptation in response to large, but not small, visual errors - deviates from several past studies that have shown that visual uncertainty attenuates implicit adaptation to small, but not large, visual errors (Tsay, Avraham, et al. 2021; Makino, Hayashi, and Nozaki, n.d.; Shyr and Joshi 2023). What do the authors attribute this empirical difference to? Would this free-viewing environment also result in the opposite pattern in the effect of visual uncertainty on implicit adaptation for small and large visual errors?

      We don’t think all the mentioned previous studies manipulated the visual uncertainty in a parametric way, and none of them provided quantitative measures of visual uncertainty. As we detailed in our Exp4 and in our Discussion, we don’t think Tsay et al., 2021 paper’s manipulation of visual uncertainty is appropriate (see below for 2d). Makino et al., 2023 study used multiple clamped cursors to perturb people, and its effect is not easily accountable since additional processes might be invoked given this kind of complex visual feedback. More importantly, we do not think this is a direct way of modulating visual uncertainty, nor did they provide any evidence.

      (2c) In the current study, the measure of visual uncertainty might be inflated by brief presentation times of comparison and referent visual stimuli (only 150 ms; our previous study allowed for a 500 ms viewing time to make sure participants see the comparison stimuli). Relatedly, there are some individuals whose visual uncertainty is greater than 20 degrees standard deviation. This seems very large, and less likely in a free-viewing environment.

      For our 2AFC, the reference stimulus is the actual clamped cursor, which lasts for 800 ms. The comparison stimulus is a 150-ms dot representation appearing near the reference. For measuring perception of visual motion, this duration is sufficient as previous studies used similar durations (Egly & Homa, 1984; Owsley et al., 1995). We think the 20-degree standard deviation is reasonable given that people fixate on the target, with only peripheral vision to process the fast-moving cursor. The steep linear increase in visual uncertainty about visual motion is well documented. The last author of this paper has shown that the uncertainty of visual motion speed (though not about angles) follows the same steep trend (Wei et al., 2010). It is noteworthy that without using our measured visual uncertainty in Exp1, if we fit the adaptation data in Exp2 to “estimate” the visual uncertainty, the two are in fact well aligned with each other (see Figure S7 and Supplementary Text 2). This strongly supports the validity and accuracy of our estimation. We think this high visual uncertainty is an important message to the field. Thus, we now highlight its magnitude in our Discussion.

      (2d) One important confound between clear and uncertain (blurred) visual conditions is the number of cursors on the screen. The number of cursors may have an attenuating effect on implicit adaptation simply due to task-irrelevant attentional demands (Parvin et al. 2022), rather than that of visual uncertainty. Could the authors provide a figure showing these blurred stimuli (gaussian clouds) in the context of the experimental paradigm? Note that we addressed this confound in the past by comparing participants with and without low vision, where only one visual cursor is provided for both groups (Tsay, Tan, et al. 2023).

      Thank you for raising this important point about types of visual stimuli for manipulating uncertainty. We used Gaussian blur of a single cursor (similar to Burge et al., 2008) instead of a cloud of dots. We now added a figure inset to show how this blur looks.

      Using a cursor cloud (Makino et al., 2023; Tsay et al., 2021) to modulate visual uncertainty has inherent drawbacks that make it unsuitable for visuomotor adaptation. For the error clamp paradigm, the error is defined as angular deviation. The cursor cloud consists of multiple cursors spanning over a range of angles, which affects both the sensory uncertainty (the intended outcome) and the sensory estimate of angles (the error estimate, the undesired outcome). In Bayesian terms, the cursor cloud aims to modulate the sigma of a distribution (sigma_v in our model), but it additionally affects the mean of the distribution (mu). This unnecessary confound is avoided by using cursor blurring, which is still a cursor with its center (mu) unchanged from a single cursor. Furthermore, as correctly pointed out in the original paper by Tsay et al., 2021, the cursor cloud often overlaps with the visual target; this “target hit” would affect adaptation, possibly via a reward learning mechanism (see Kim et al., 2019). This is a second confound that accompanies the cursor cloud.
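
      One way to picture the first confound is to note that the centroid of a finite cloud of dots jitters from trial to trial, whereas the center of a blurred single cursor does not. The sketch below simulates this under stated assumptions: the dot count, angular spread, and clamp angle are hypothetical values, not the settings used in the cited studies, and Gaussian sampling of dot angles is assumed for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)

clamp_angle = 16.0   # nominal clamp direction in degrees (hypothetical)
spread = 10.0        # angular SD of the dot cloud in degrees (hypothetical)
n_dots = 5           # dots per cloud (hypothetical)
n_trials = 10_000

# Cursor cloud: each trial draws n_dots angles around the nominal clamp angle.
clouds = rng.normal(clamp_angle, spread, size=(n_trials, n_dots))
cloud_centers = clouds.mean(axis=1)

# Blurred single cursor: the blur widens the stimulus but leaves its center fixed.
blur_centers = np.full(n_trials, clamp_angle)

cloud_center_sd = cloud_centers.std()   # trial-to-trial jitter of the cloud centroid
blur_center_sd = blur_centers.std()     # no jitter for the blurred cursor
```

      Under these assumptions the cloud centroid varies with a standard deviation of about spread / sqrt(n_dots), so the manipulation changes the angular error estimate as well as its uncertainty.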

      Issue #3: More methodological details are needed.

      (3a) It's unclear why, in Figure 4, PEA predicts an overshoot in terms of perceived hand position from the target. In PReMo, we specified a visual shift in the perceived target position, shifted towards the adapted hand position, which may result in overshooting of the perceived hand position with this target position. This visual shift phenomenon has been discovered in previous studies (e.g., (Simani, McGuire, and Sabes 2007)).

      Visual shift, as it is called in Simani et al., 2007, is irrelevant for our task here. The data we are modeling are motor adaptation (hand position changes) and so-called proprioceptive changes (hand localization changes), both are measured and referenced in the extrinsic coordinate, not referenced to a visual target. For instance, the proprioceptive changes are either relative to the actual hand location (Exp 3) or relative to the goal (Fig 4A). We also don’t think visual shift is necessary in explaining the perceptual judgment of an unseen hand (the target shown during the judgment indeed has an effect of reducing the biasing effect of PE, see below for responses to reviewer 3).

      In the PEA model, the reported hand angle is the result of integrating cues from the actual hand position and the estimated hand position (x_hand_hat) from previous movements. This integration process leads to the combined reported hand position potentially overshooting or undershooting, depending on the degree of adaptation. It is the changed proprioceptive cue (because the actively moved hand slowly adapted to the error clamp) that leads to the overshoot of the perceived hand position.

      In Results, we now explain these value changes with parentheses. Model details about the mechanisms of cue combination and model predictions can be found in Supplementary Text 1. We believe these detailed explanations can make this apparent.

      (3b) The extent of implicit adaptation in Experiment 2, especially with smaller errors, is unclear. The implicit adaptation function seems to be still increasing, at least by visual inspection. Can the authors comment on this trend, and relatedly, show individual data points that help the reader appreciate the variability inherent to these data?

      Indeed, the adaptation for small errors appears not completely saturated with our designated number of trials. However, this will not affect our model analysis. Our model fitting for PEA and other competing models is done on the time-series of adaptation, not on the saturated adaptation extent (see Fig 3A). Thus, even though some conditions might not produce the full range of adaptation, the data are sufficient to constrain the models. We now mention this concern in Results; we also emphasize that the model not only explains the adaptation magnitude (operationally defined as adaptation extent measured at the same time, i.e., the end of the adaptation phase) but also the full learning process.

      In response, we have included individual data points in the revised Figure 3B-D to provide a clear illustration of the extent of implicit adaptation, particularly for small perturbations.

      (3c) The same participants were asked to return for multiple days/experiments. Given that the authors acknowledge potential session effects, with attenuation upon re-exposure to the same rotation (Avraham et al. 2021), how does re-exposure affect the current results? Could the authors provide clarity, perhaps a table, to show shared participants between experiments and provide evidence showing how session order may not be impacting results?

      Thank you for raising the issue of session and re-exposure effects. First, we don’t think Exp1 has an effect on Exp4. Exp1 is a perceptual task and Exp4 is a motor adaptation task. Furthermore, Exp1 used random visual stimuli on both sides, thus it did not lead to any adaptation effect on its own. Second, Exp4 indeed had three sessions performed on three days, but the session effect does not change our main conclusion about the visual uncertainty. A 3-way repeated-measures ANOVA (3 day x 3 perturbation x 2 visual uncertainty) revealed a significant main effect of day (F(2,36) = 17.693, p<0.001), indicating changes in performance across sessions (see Figure below). Importantly, the effects of perturbation and visual uncertainty (including their interactions) remain the same. The day factor did not interact with them. The main effect of day shows that the overall adaptation effect is reduced across days. Post-hoc pairwise comparisons showed that single-trial learning (STL) performance on Day 1 was significantly higher than on Day 2 (p = 0.004) and Day 3 (p < 0.001), with no significant difference between Day 2 and Day 3 (p = 0.106). Other ANOVA details: significant main effects for perturbation (F(1,36) = 8.872, p<0.001) and visual uncertainty (F(1,18) = 49.164, p<0.001), as well as a significant interaction between perturbation size and visual uncertainty (F(2,36) = 5.160, p = 0.013). There were no significant interactions involving the day factor with any other factors (all p > 0.182). Thus, the overall adaptation decreases over the days, but the day does not affect the interaction of interest between visual uncertainty and perturbation. The fact that this interaction was preserved across sessions strengthens our conclusion about how visual uncertainty systematically affects implicit adaptation.

      Author response image 2.

      (3d) The number of trials per experiment should be detailed more clearly in the Methods section (e.g., Exp 4). Moreover, could the authors please provide relevant code on how they implemented their computational models? This would aid in future implementation of these models in future work. I, for one, am enthusiastic to build on PEA.

      We have clarified the number of trials conducted in each experiment, with detailed information now readily available in the Methods section of the main text. In addition, we have made the code for data analysis and modeling publicly accessible. These resources can be found in the updated "Data Availability" section of our paper.

      (3f) In addition to predicting a correlation between proprioceptive shift and implicit adaptation on a group level, both PReMo and PEA (but not causal inference) predict a correlation between individual differences in proprioceptive shift and proprioceptive uncertainty with the extent of implicit adaptation (Tsay, Kim, et al. 2021). Interestingly, shift and uncertainty are independent (see Figures 4F and 6C in Tsay et al, 2021). Does PEA also predict independence between shift and uncertainty? It seems like PEA does predict a correlation.

      Thank you for addressing this insightful question. Our PEA model indeed predicts a positive correlation (although not linear) between the proprioceptive uncertainty and the amplitude of the estimated hand position (x_hand_hat). This prediction is consistent with the simulations conducted, using the same parameters that were applied to generate the results depicted in Figure 4B of our manuscript (there is a sign flip as x_hand_hat is negative).

      Author response image 3.

      Regarding the absence of a correlation observed in Tsay et al., 2021, we offer several potential explanations for this discrepancy. First, the variability observed in passive hand localization during motor adaptation (as in Tsay et al., 2021) does not directly equal proprioceptive uncertainty, which typically requires psychophysical testing to accurately assess. Second, our study showed that the proprioceptive bias attenuates during repeated measurements; in our Exp3, it decreased within a block of three trials. We noticed that the Tsay et al., 2021 study used 36 measurements in a row without interleaving adaptation trials. Thus, the “averaged” proprioceptive bias in Tsay’s study might not reflect the actual bias during adaptation. We also noticed that that study showed large individual differences in both proprioceptive bias and proprioceptive variability (not uncertainty), thus getting a positive result, if it were really there, would require a large number of participants, probably larger than their n=30ish sample size. These putative explanations are not included in the revision, which already has a long discussion and has no space for discussing a null result.

      Reviewer #2 (Public Review):

      Summary:

      The authors present the Perceptual Error Adaptation (PEA) model, a computational approach offering a unified explanation for behavioral results that are inconsistent with standard state-space models. Beginning with the conventional state-space framework, the paper introduces two innovative concepts. Firstly, errors are calculated based on the perceived hand position, determined through Bayesian integration of visual, proprioceptive, and predictive cues. Secondly, the model accounts for the eccentricity of vision, proposing that the uncertainty of cursor position increases with distance from the fixation point. This elegantly simple model, with minimal free parameters, effectively explains the observed plateau in motor adaptation under the implicit motor adaptation paradigm using the error-clamp method. Furthermore, the authors experimentally manipulate visual cursor uncertainty, a method established in visuomotor studies, to provide causal evidence. Their results show that the adaptation rate correlates with perturbation sizes and visual noise, uniquely explained by the PEA model and not by previous models. Therefore, the study convincingly demonstrates that implicit motor adaptation is a process of Bayesian cue integration.

      Strengths:

      In the past decade, numerous perplexing results in visuomotor rotation tasks have questioned their underlying mechanisms. Prior models have individually addressed aspects like aiming strategies, motor adaptation plateaus, and sensory recalibration effects. However, a unified model encapsulating these phenomena with a simple computational principle was lacking. This paper addresses this gap with a robust Bayesian integration-based model. Its strength lies in two fundamental assumptions: that motor adaptation is influenced by visual eccentricity, a well-established vision science concept, and that sensory estimation proceeds through Bayesian integration. By merging these well-founded principles, the authors elucidate previously incongruent and diverse results with an error-based update model. The incorporation of cursor feedback noise manipulation provides causal evidence for their model. The use of eye-tracking in their experimental design, and the analysis of adaptation studies based on estimated eccentricity, are particularly elegant. This paper makes a significant contribution to visuomotor learning research.

      Weaknesses:

      The paper provides a comprehensive account of visuomotor rotation paradigms, addressing incongruent behavioral results with a solid Bayesian integration model. However, its focus is narrowly confined to visuomotor rotation, leaving its applicability to broader motor learning paradigms, such as force field adaptation, saccadic adaptation, and de novo learning paradigms, uncertain. The paper's impact on the broader fields of neuroscience and cognitive science may be limited due to this specificity. While the paper excellently demonstrates that specific behavioral results in visuomotor rotation can be explained by Bayesian integration, a general computational principle, its contributions to other motor learning paradigms remain to be explored. The paper would benefit from a discussion on the model's generality and its limitations, particularly in relation to the undercompensating effects in other motor learning paradigms.

      Thank you for your thoughtful review and recognition of the contributions our work makes towards understanding implicit motor adaptation through the Perceptual Error Adaptation (PEA) model. We appreciate your suggestion to broaden the discussion about the model's applicability beyond the visuomotor rotation paradigm, a point we acknowledge was not sufficiently explored in our initial discussion.

      Our model is not limited to error-clamp adaptation, where the participants were explicitly told to ignore the rotated cursor. The error-clamp paradigm is one rare example in which implicit motor learning can be isolated in a nearly ideal way. Our findings thus imply two key aspects of implicit adaptation: 1) localizing one’s effector is implicitly processed and continuously used to update the motor plan; 2) Bayesian cue combination is at the core of integrating movement feedback and motor-related cues (motor prediction cue in our model) when forming procedural knowledge for action control.

      We propose that the same two principles should be applied to various kinds of motor adaptation and motor skill learning, which together constitute motor learning in general. Most of our knowledge about motor adaptation is from visuomotor rotation, prism adaptation, force field adaptation, and saccadic adaptation. The first three types all involve localizing one’s effector under the influence of perturbed sensory feedback, and they also have implicit learning. We believe they can be modeled by variants of our model, or at least that the two principles we laid out above should be considered when thinking about their computational nature. For skill learning, especially for de novo learning, the area still lacks a fundamental computational model that accounts for the skill acquisition process on the level of relevant movement cues. Our model suggests a promising route, i.e., repetitive movements with a Bayesian cue combination of movement-related cues might underlie the implicit process of motor skills.

      We added more discussion on the possible broad implications of our model in the revision.

      Reviewer #3 (Public Review):

      Summary

      In this paper, the authors model motor adaptation as a Bayesian process that combines visual uncertainty about the error feedback, uncertainty about proprioceptive sense of hand position, and uncertainty of predicted (=planned) hand movement with a learning and retention rate as used in state space models. The model is built with results from several experiments presented in the paper and is compared with the PReMo model (Tsay, Kim, et al., 2022) as well as a cue combination model (Wei & Körding, 2009). The model and experiments demonstrate the role of visual uncertainty about error feedback in implicit adaptation.

      In the introduction, the authors notice that implicit adaptation (as measured in error-clamp-based paradigms) does not saturate at larger perturbations, but decreases again (e.g. Moorehead et al., 2017 shows no adaptation at 135° and 175° perturbations). They hypothesized that visual uncertainty about cursor position increases with larger perturbations since the cursor is further from the fixated target. This could decrease the importance assigned to visual feedback which could explain lower asymptotes.

      The authors characterize visual uncertainty for 3 rotation sizes in the first experiment, and while this experiment could be improved, it is probably sufficient for the current purposes. Then the authors present a second experiment where adaptation to 7 clamped errors is tested in different groups of participants. The models' visual uncertainty is set using a linear fit to the results from experiment 1, and the remaining 4 parameters are then fit to this second data set. The 4 parameters are 1) proprioceptive uncertainty, 2) uncertainty about the predicted hand position, 3) a learning rate, and 4) a retention rate. The authors' Perceptual Error Adaptation model ("PEA") predicts asymptotic levels of implicit adaptation much better than both the PReMo model (Tsay, Kim et al., 2022), which predicts saturated asymptotes, or a causal inference model (Wei & Körding, 2007) which predicts no adaptation for larger rotations. In a third experiment, the authors test their model's predictions about proprioceptive recalibration, but unfortunately, compare their data with an unsuitable other data set. Finally, the authors conduct a fourth experiment where they put their model to the test. They measure implicit adaptation with increased visual uncertainty, by adding blur to the cursor, and the results are again better in line with their model (predicting overall lower adaptation) than with the PReMo model (predicting equal saturation but at larger perturbations) or a causal inference model (predicting equal peak adaptation, but shifted to larger rotations). In particular, the model fits experiment 2 and the results from experiment 4 show that the core idea of the model has merit: increased visual uncertainty about errors dampens implicit adaptation.
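
      The structure summarized above (a state-space update with a retention rate and a learning rate, driven by a Bayesian-combined hand estimate whose visual uncertainty grows linearly with clamp size) can be sketched as follows. This is an illustrative reading of the description in this review, not the authors' released code; the function name `simulate_adaptation`, the update form, and every parameter value are assumptions.

```python
import numpy as np

def simulate_adaptation(clamp, n_trials=500, A=0.97, B=0.2,
                        sigma_p=10.0, sigma_u=5.0, a=0.3, b=1.5):
    """Simulate hand angle (deg) across error-clamp trials.

    A, B             : assumed retention and learning rates
    sigma_p, sigma_u : assumed SDs of the proprioceptive and predictive cues
    a, b             : assumed slope and intercept of visual SD versus clamp size
    """
    sigma_v = a * clamp + b                        # visual SD grows linearly with clamp size
    prec_v = 1.0 / sigma_v ** 2
    w_v = prec_v / (prec_v + 1.0 / sigma_p ** 2 + 1.0 / sigma_u ** 2)
    x = 0.0                                        # hand angle relative to the target
    hand = np.empty(n_trials)
    for n in range(n_trials):
        hand[n] = x
        # Perceived hand: visual cue at the clamp angle, other two cues at the actual hand angle.
        x_hat = w_v * clamp + (1.0 - w_v) * x
        # Retain part of the previous state and correct for the perceived error.
        x = A * x - B * x_hat
    return hand

clamps = [2, 8, 32, 95]
final = {c: simulate_adaptation(c)[-1] for c in clamps}
```

      With these assumed values the final hand angle is largest in magnitude for the intermediate clamp and shrinks again for the largest clamps, which is the non-monotonic pattern the review attributes to eccentricity-dependent visual uncertainty.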

      Strengths

      In this study, the authors propose a Perceptual Error Adaptation model ("PEA") and the work combines various ideas from the field of cue combination, Bayesian methods, and new data sets, collected in four experiments using various techniques that test very different components of the model. The central component of visual uncertainty is assessed in the first experiment. The model uses 4 other parameters to explain implicit adaptation. These parameters are 1) learning and 2) retention rate, as used in popular state space models, and the uncertainty (variance) of 3) predicted and 4) proprioceptive hand position. In particular, the authors observe that asymptotes for implicit learning do not saturate, as claimed before, but decrease again when rotations are very large and that this may have to do with visual uncertainty (e.g. Tsay et al., 2021, J Neurophysiol 125, 12-22). The final experiment confirms predictions of the fitted model about what happens when visual uncertainty is increased (overall decrease of adaptation). By incorporating visual uncertainty depending on retinal eccentricity, the predictions of the PEA model for very large perturbations are notably different from and better than, the predictions of the two other models it is compared to. That is, the paper provides strong support for the idea that visual uncertainty of errors matters for implicit adaptation.

      Weaknesses

      Although the authors don't say this, the "concave" function that shows that adaptation does not saturate for larger rotations has been shown before, including in papers cited in this manuscript.

      The first experiment, measuring visual uncertainty for several rotation sizes in error-clamped paradigms has several shortcomings, but these might not be so large as to invalidate the model or the findings in the rest of the manuscript. There are two main issues we highlight here. First, the data is not presented in units that allow comparison with vision science literature. Second, the 1 second delay between the movement endpoint and the disappearance of the cursor, and the presentation of the reference marker, may have led to substantial degradation of the visual memory of the cursor endpoint. That is, the experiment could be overestimating the visual uncertainty during implicit adaptation.

      The paper's third experiment relies to a large degree on reproducing patterns found in one particular paper, where the reported hand positions - as a measure of proprioceptive sense of hand position - are given and plotted relative to an ever-present visual target, rather than relative to the actual hand position. That is, 1) since participants actively move to a visual target, the reported hand positions do not reflect proprioception, but mostly the remembered position of the target participants were trying to move to, and 2) if the reports are converted to a difference between the real and reported hand position (rather than the difference between the target and the report), those would be on the order of ~20° which is roughly two times larger than any previously reported proprioceptive recalibration, and an order of magnitude larger than what the authors themselves find (1-2°) and what their model predicts. Experiment 3 is perhaps not crucial to the paper, but it nicely provides support for the idea that proprioceptive recalibration can occur with error-clamped feedback.

      Perhaps the largest caveat to the study is that it assumes that people do not look at the only error feedback available to them (and can explicitly suppress learning from it). This was probably true in the experiments used in the manuscript, but unlikely to be the case in most of the cited literature. Ignoring errors and suppressing adaptation would also be a disastrous strategy to use in the real world, such that our brains may not be very good at this. So the question remains to what degree - if any - the ideas behind the model generalize to experiments without fixation control, and more importantly, to real-life situations.

      Specific comments:

      A small part of the manuscript relies on replicating or modeling the proprioceptive recalibration in a study we think does NOT measure proprioceptive recalibration (Tsay, Parvin & Ivry, JNP, 2020). In this study, participants reached for a visual target with a clamped cursor, and at the end of the reach were asked to indicate where they thought their hand was. The responses fell very close to the visual target both before and after the perturbation was introduced. This means that the difference between the actual hand position, and the reported/felt hand position gets very large as soon as the perturbation is introduced. That is, proprioceptive recalibration would necessarily have roughly the same magnitude as the adaptation displayed by participants. That would be several times larger than those found in studies where proprioceptive recalibration is measured without a visual anchor. The data is plotted in a way that makes it seem like the proprioceptive recalibration is very small, as they plot the responses relative to the visual target, and not the discrepancy between the actual and reported hand position. It seems to us that this study mostly measures short-term visual memory (of the target location). What is astounding about this study is that the responses change over time to begin with, even if only by a tiny amount. Perhaps this indicates some malleability of the visual system, but it is hard to say for sure.

      Regardless, the results of that study do not form a solid basis for the current work and they should be removed. We would recommend making use of the dataset from the same authors, who improved their methods for measuring proprioception shifts just a year later (Tsay, Kim, Parvin, Stover, and Ivry, JNP, 2021). Although here the proprioceptive shifts during error-clamp adaptation (Exp 2) were tiny, and not quite significant (p<0.08), the reports are relative to the actual location of the passively placed unseen hand, measured in trials separate from those with reach adaptation and therefore there is no visual target to anchor their estimates to.

      Experiment 1 measures visual uncertainty with increased rotation size. The authors cite relevant work on this topic (Levi & Klein etc) which has found a linear increase in uncertainty of the position of more and more eccentrically displayed stimuli.

      First, this is a question where the reported stimuli and effects could greatly benefit from comparisons with the literature in vision science, and the results might even inform it. In order for that to happen, the units for the reported stimuli and effects should (also) be degrees of visual angle (dva).

      As far as we know, all previous work has investigated static stimuli, where with moving stimuli, position information from several parts of the visual field are likely integrated over time in a final estimate of position at the end of the trajectory (a Kalman filter type process perhaps). As far as we know, there are no studies in vision science on the uncertainty of the endpoint of moving stimuli. So we think that the experiment is necessary for this study, but there are some areas where it could be improved.

      Then, the linear fit is done in the space of the rotation size, but not in the space of eccentricity relative to fixation, and these do not necessarily map onto each other linearly. If we assume that the eye-tracker and the screen were at the closest distance at which the manufacturer reports it to work accurately (45 cm), we would get the largest distances the endpoints are away from fixation in dva. Based on that assumed distance between the participant and monitor, we converted the rotation angles to distances between fixation and the cursor endpoint in degrees of visual angle: 0.88, 3.5, and 13.25 dva (ignoring screen curvature, or the absence of it). The ratio between the perturbation angle and retinal distance to the endpoint is roughly 0.221, 0.221, and 0.207 if the minimum distance is indeed used - which is probably fine in this case. But still, it would be better to do the fit in the relevant perceptual coordinate system.
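
      The conversion described above can be sketched as follows. This is only an illustration under stated assumptions: the cursor endpoint and the fixated target are taken to lie on a circle around the start position, the on-screen reach radius of 10 cm is our assumption (it is not stated in this passage for Exp 1, but it roughly reproduces the values quoted above), the 45 cm viewing distance is the assumed minimum, and the screen is treated as flat with fixation straight ahead. The function and variable names are hypothetical.

```python
import math


def rotation_to_dva(rotation_deg, reach_radius_cm, viewing_distance_cm):
    """Visual angle (deg) between the fixated target and a rotated cursor endpoint."""
    # Chord between two points on a circle of radius r separated by the rotation angle.
    chord_cm = 2.0 * reach_radius_cm * math.sin(math.radians(rotation_deg) / 2.0)
    # Flat-screen approximation, with the fixation point straight ahead of the eye.
    return math.degrees(math.atan(chord_cm / viewing_distance_cm))


ROTATIONS_DEG = [4, 16, 64]   # perturbation sizes mentioned in the review
REACH_RADIUS_CM = 10.0        # ASSUMPTION: on-screen reach radius
VIEWING_DISTANCE_CM = 45.0    # ASSUMPTION: minimum eye-to-screen distance

dva = [rotation_to_dva(r, REACH_RADIUS_CM, VIEWING_DISTANCE_CM) for r in ROTATIONS_DEG]
ratios = [d / r for d, r in zip(dva, ROTATIONS_DEG)]

for r, d, q in zip(ROTATIONS_DEG, dva, ratios):
    print(f"{r:>2} deg rotation -> {d:.2f} dva (dva per deg of rotation: {q:.3f})")
```

      Under these assumptions the ratio of visual angle to rotation angle shrinks slightly at the largest rotation, which is the mild nonlinearity the comment refers to. A larger viewing distance would scale all three eccentricities down.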

      The first distance (4 deg rotation; 0.88 dva offset between fixation and stimulus) is so close to fixation (even at the assumed shortest distance between eye and screen) that it can be considered foveal and falls within the range of noise of eye-trackers + that of the eye for fixating. There should be no uncertainty on or that close to the fovea. The variability in the data is likely just measurement noise. This also means that a linear fit will almost always go through this point, somewhat skewing the results toward linearity. The advantage is that the estimate of the intercept (measurement noise) is going to be very good. Unfortunately, there are only 2 other points measured, which (if used without the closest point) will always support a linear fit. Therefore, the experiment does not seem suitable to test linearity, only to characterize it, which might be sufficient for the current purposes. We'd understand if testing linearity with many more rotations requires too much effort. But then it should be made much clearer that the experiment assumes linearity and only serves to characterize the assumed linearity.

      Final comment after the consultation session:

      There were a lot of discussions about the actual interpretation of the behavioral data from this paper with regards to past papers (Tsay et al. 2020 or 2021), and how it matches the different variables of the model. The data from Tsay 2020 combined both proprioceptive information (Xp) and prediction about hand position (Xu) because it involves active movements. On the other hand, Tsay et al. 2021 is based on passive movements and could provide a better measure of Xp alone. We would encourage you to clarify how each of the variables used in the model is mapped onto the outcomes of the cited behavioral experiments.

      The reviewers discussed this point extensively during the consultation process. The results reported in the Tsay 2020 study reflect both proprioception and prediction. However, having a visual target contributes more than just prediction: it is likely an anchor in the workspace that draws the response to it, such that the report is dominated by short-term visual memory of the target (which is not part of the model). However, in the current Exp 3, as in most other work investigating proprioception, this is calculated relative to the actual direction.

      The solution is fairly simple. In Experiment 3 in the current study, Xp is measured relative to the hand without any visual anchors drawing responses, and this is also consistent with the reference used in the Tsay et al 2021 study and in many studies from the lab of D. Henriques (none of which have any visual reach target when measuring proprioceptive estimates). So we suggest using a different data set that also measures Xp without any other influences, such as the data from Tsay et al 2021 instead.

      These issues with the data are not superficial and cannot be solved within the model. Data with correctly measured biases (relative to the hand) that are not dominated by irrelevant visual attractors would actually be informative about the validity of the PEA model. Dr. Tsay has so much other data that we recommend using a more to-the-point data set that could actually validate the PEA model.

      As the comments are repetitive in some places, we summarize them into three questions and address them one by one below:

      (1) Methodological concerns about visual uncertainty estimation in Experiment 1: a) the visual uncertainty is measured in movement angles (degrees), while the unit in vision science is visual angle (dva). This mismatch of units hinders direct comparison between the visual uncertainty found here and that reported in the literature; b) a 1-second delay between movement endpoint and the reference marker presentation causes an overestimate of visual uncertainty due to potential degradation of visual memory; c) the linear function of visual uncertainty is a result of having only three perturbation sizes.

      a) As noted by the reviewer, our visual uncertainty is about cursor motion direction in the display plane, which has never been measured before. We do not think our data are comparable to any findings in vision science about foveal/peripheral comparison. We cited the work of Klein and colleagues in vision science (Klein & Levi, 1987; Levi et al., 1987) since their studies showed that deviation from fixation is associated with an increase in visual uncertainty. Their studies thus inspired our Exp1 to probe how the visual uncertainty of concern here (specifically for visual motion direction) changes with increasing deviation from fixation. We believe that any model and its parameters should be specifically tailored to the task or context it tries to emulate. In our case, motion direction in a center-out reaching setting is the modeled context, and all the relevant model parameters should be specified in movement angles.

      b) The 1 s delay of the reference cursor appears to have minimal impact on the estimate of visual uncertainty, based on previous vision studies. Our Exp1 used a visual paradigm similar to that of White et al. (1992), who showed that delay does not lead to an increase in visual uncertainty over a broad range of values (from 0.2 s to >1 s, see their Figures 5-6). We will add more methodological justification in our revision.

      c) We agree that if more angles were tested we could be more confident about the linearity of visual uncertainty. However, the linear function is a good approximation of visual uncertainty (as shown in Figure 2C). More importantly, our model performance does not hinge on a strict linear function. If, say, it were a power function with an increasing slope, our model would still predict the major findings presented in the paper, as correctly pointed out by the reviewer. It is the increasing trend of visual uncertainty, which has been completely overlooked by previous studies, that leads to various seemingly puzzling findings in implicit adaptation. Lastly, without assuming a linear function, we fitted the large dataset of motor adaptation from Exp2 to numerically estimate the visual uncertainty. This estimated visual uncertainty has a strong linear relationship with perturbation size (R = 0.991, p<0.001). In fact, the model-fitted visual uncertainty is very close to the values we obtained in Exp1. We have now included this analysis in the revision. See details in Supplementary text 2 and Figure S7.

      (2) Experiment 3: the reviewer argues that the Tsay et al., 2020 data do not accurately measure proprioceptive recalibration, and thus are not suitable for showing our model’s capacity to explain proprioceptive changes during adaptation.

      Response: We agree that the data from Tsay et al., 2020 are not from passive localization, which is the widely accepted method to measure proprioceptive recalibration, a recalibration effect in the sensory domain. Active localization, as used in Tsay et al., 2020, is hypothesized to be closely related to people’s forward prediction (where people want to go, as the reviewer put it in the comments). However, we want to emphasize that we never equated Tsay’s findings with proprioceptive recalibration: throughout the paper we call them “reported hand location”. We reserved “proprioceptive recalibration” for our own Exp3, which used a passive localization method. Thus, we have not misused this term. Secondly, as far as we know, localization bias or changes, whether measured by passive or active methods, have not been formally modeled quantitatively. We believe our model can explain both, at least in the error-clamp adaptation setting here. Exp3 concerns passive localization, where the proprioceptive bias is caused by the biasing effect of the just-perceived hand location (X_hand_hat) from the adaptation trial. The Tsay et al. 2020 data concern active localization, whose bias shows a characteristic change from negative to positive. This can be explained by the just-perceived hand location (X_hand_hat again) and a gradually adapting hand (X_p). We think this is a significant advance in the realm of proprioceptive changes in adaptation. Of course, our idea can be further tested in other task conditions, e.g., conventional visuomotor rotation or even gain adaptation, which should be left for future studies.

      As for technical concerns, the Tsay et al., 2020 dataset is not ideal: when reporting hand location, the participants view the reporting wheel as well as the original target. As correctly pointed out by the reviewer, the presence of the target might provide an anchoring cue for perceptual judgment, which acts as an attractor for localization. If this were the case, our cue combination would predict that this extra attractor effect would lead to a smaller proprioceptive effect than that currently reported in their paper. The initial negative bias would be closer to the target (zero), and the later positive bias would be closer to the target too. However, the main trend would remain, i.e., the reported hand location would still show the characteristic negative-to-positive change. The attractor effect of the target can be readily modeled by giving less weight to the just-perceived hand location (X_hand_hat). Thus, we would like to keep the Tsay et al., 2020 data in our paper but add some explanation of the limitations of this dataset as well as how the model would fare with these limitations.

      That being said, our model can explain both passive and active localization during implicit adaptation elicited by error clamp. The dataset from the Tsay et al., 2021 paper is not a good substitute for that of their 2020 paper in terms of modeling, since that study interleaved some blocks of passive localization trials with adaptation trials. This kind of block design would lead to forgetting of both the adaptation (Xp in our model) and the perceived hand (X_hand_hat in our model); forgetting of the latter is not yet considered in our model. As our Exp3, which also used passive localization, shows, the influence of the perceived hand on proprioceptive bias is short-lived, lasting up to three trials without adaptation trials. Of course, it would be of great interest to design future studies to examine how the proprioceptive bias changes over time, and how its temporal changes relate to the perceptual error. Our model provides a testbed to move forward in this direction.

      (3) The reviewer raises concerns about the study's assumption that participants ignore error feedback, questioning the model's applicability to broader contexts and real-world scenarios where ignoring errors might not be viable or common.

      Reviewer 2 raised the same question above. We moved our responses here. “We appreciate your suggestion to broaden the discussion about the model's applicability beyond the visuomotor rotation paradigm, a point we acknowledge was not sufficiently explored in our initial discussion.

      Our model is not limited to error-clamp adaptation, where the participants were explicitly told to ignore the rotated cursor. The error-clamp paradigm is a rare example in which implicit motor learning can be isolated in a nearly ideal way. Our findings thus imply two key aspects of implicit adaptation: 1) localizing one’s effector is implicitly processed and continuously used to update the motor plan; 2) Bayesian cue combination is at the core of integrating movement feedback and motor-related cues (the motor prediction cue in our model) when forming procedural knowledge for action control.

      We propose that the same two principles apply to various kinds of motor adaptation and motor skill learning, which together constitute motor learning in general. Most of our knowledge about motor adaptation is from visuomotor rotation, prism adaptation, force field adaptation, and saccadic adaptation. The first three types all involve localizing one’s effector under the influence of perturbed sensory feedback, and they also have implicit learning. We believe they can be modeled by variants of our model, or at least that the two principles we laid out above should be considered when thinking about their computational nature. For skill learning, especially de novo learning, the field still lacks a fundamental computational model that accounts for the skill acquisition process at the level of relevant movement cues. Our model suggests a promising route, i.e., repetitive movements with a Bayesian cue combination of movement-related cues might underlie the implicit process of motor skills.”

      We also add one more important implication of our model: as stated above, our model also explains that the proprioceptive changes, revealed by active or passive localization methods, are brought about by (mis)perceived hand localization via Bayesian cue combination. This new insight, though only tested here using the error-clamp paradigm, can be further utilized in other domains, e.g., conventional visuomotor rotation or force field adaptation. We hope this serves as an initial endeavor in developing computational models for proprioception studies. Please see the extended discussion on this matter in the revision.

      Recommendations for the authors:

      Revisions:

      All three reviewers were positive about the work and have provided a set of concrete and well-aligned suggestions, which the authors should address in a revised version of the article. These are listed below.

      A few points of particular note:

      (1) There are a lot of discussions about the actual interpretation of behavioral data from this paper or past papers (Tsay et al. 2020 or 2021) and how it matches the different variables of the model.

      (2) There are some discussions on the results of the first experiment, both in terms of how it is reported (providing degrees of visual angle) and how it is different than previous results (importance of the point of fixation). We suggest also discussing a few papers on eye movements during motor adaptation from the last years (work of Anouk de Brouwer and Opher Donchin). Could the authors also discuss why they found opposite results to that of previous visual uncertainty studies (i.e., visual uncertainty attenuates learning with large, but not small, visual errors); rather than the other way around as in Burge et al and Tsay et al 2021 and Makino Nozaki 2023 (where visual uncertainty attenuates small, but not large, visual errors).

      (3) It is recommended by several reviewers to discuss the applicability of the model to other areas/perturbations.

      (4) Several reviewers and I believe that the impact of the paper would be much higher if the code to reproduce all the simulations of the model is made available to the readers. In addition, while I am very positive about the fact that the authors shared the data of their experiments, metadata seems to be missing while they are highly important because these data are otherwise useless.

      Thank you for the concise summary of the reviewers’ comments. We have addressed their concerns point by point.

      Reviewer #2 (Recommendations For The Authors):

      L142: The linear increase in visual uncertainty should be substantiated by previous research in vision science. Please cite relevant papers and discuss why the linear model is considered reasonable.

      We cited relevant studies in vision science. Their focus is more on how eccentricity inflates visual uncertainty, similar to our finding that deviations from the fixation direction inflate visual uncertainty about motion direction.

      We also want to add that our model performance does not hinge on a strict linear function of visual uncertainty. If, say, it were a power function with an increasing slope, our model would still predict the major findings presented in the paper. It is the increasing trend of visual uncertainty, which has been completely overlooked by previous studies, that leads to various seemingly puzzling findings in implicit adaptation. Furthermore, without assuming a linear function, we fitted the large dataset of motor adaptation from Exp2 to numerically estimate the visual uncertainty. This estimated visual uncertainty has a strong linear relationship with perturbation size (R = 0.991, p<0.001). In fact, the model-fitted visual uncertainty is very close to the values we obtained in Exp1. We have now included this new analysis in the revision. See details in Supplementary text 2 and Figure S7.

      L300: I found it challenging to understand the basis for this conclusion. Additional explanatory support is required.

      We unpacked this concluding sentence as follows:

      “The observed proprioceptive bias is formally modeled as a result of the biasing effect of the perceived hand estimate x_hand_hat. In our mini-block of passive localization, the participants neither actively moved nor received any cursor perturbations for three trials in a row. Thus, the fact that the measured proprioceptive bias is reduced to nearly zero at the third trial suggests that the effect of perceived hand estimate x_hand_hat decays rather rapidly.”

      L331: For the general reader, a visual representation of what the blurring mask looks like would be beneficial.

      Thanks for the nice suggestion. We added pictures of a clear and a blurred cursor in Figure 5D.

      L390: This speculation is intriguing. It would be helpful if the authors explained why they consider causal inference to operate at an explicit process level, as the reasoning is not clear here, although the idea seems plausible.

      Indeed, our tentative conclusion is based only on the model comparison results here. It is still possible that causal inference also works for implicit adaptation in addition to explicit adaptation. We make a more modest conclusion in the revision:

      “The causal inference model is also based on Bayesian principles, so why does it fail to account for implicit adaptation? We postulate that the failure of the causal inference model is due to its neglect of visual uncertainty as a function of perturbation size, as we revealed in Experiment 1. In fact, previous studies advocating the Bayesian principle in motor adaptation have largely focused on experimentally manipulating sensory cue uncertainty to observe its effects on adaptation (Burge et al., 2008; He et al., 2016; Körding & Wolpert, 2004; Wei & Körding, 2010), similar to our Experiment 4. Our findings suggest that causal inference of perturbation alone, without incorporating visual uncertainty, cannot fully account for the diverse findings in implicit adaptation. The increase in visual uncertainty with perturbation size is substantial: our Experiment 1 yielded an approximate seven-fold increase from a 4° perturbation to a 64° perturbation. We have attributed this to the fact that people fixate in the desired movement direction during movements. Interestingly, even for the conventional visuomotor rotation paradigm where people are required to “control” the perturbed cursor, their fixation is also on the desired direction, not on the cursor itself (de Brouwer, Albaghdadi, et al., 2018; de Brouwer, Gallivan, et al., 2018). Thus, we postulate a similar hike in visual uncertainty in other “free-viewing” perturbation paradigms. Future studies are warranted to extend our PEA model to account for implicit adaptation in other perturbation paradigms.”

      L789: The method of estimating Sigma_hand in the brain was unclear. Since Bayesian computation relies on the magnitude of noise, the cognitive system must have estimates of this noise. While vision and proprioception noise might be directly inferred from signals, the noise of the hand could be deduced from the integration of these observations or an internal model estimate. This process of estimating noise magnitude is theorized in recursive Bayesian integration models (or Kalman filtering), where the size estimate of the state noise (sigma_hand) is updated concurrently with the state estimate (x_hand hat). The equation in L789 and the subsequent explanation appear to assume a static model of noise estimation. However, in practice, the noise parameters, including Sigma_hand, are likely dynamic and updated with each new observation. A more detailed explanation of how Sigma_hand is estimated and of its role in the cognitive process is needed.

      This is a great comment. If a Kalman filter is used, the learning rate and the state noise should all be dynamically updated on each trial, under the influence of the observation (x_v). Most adaptation models, including ours, assume a constant learning rate, but a dynamic learning rate (B in our model) is worth trying. However, in our error-clamp setting, x_v is a constant, so this observation variable cannot dynamically update the Kalman filter; that is why we opted to use a “static” Bayesian model to explain our datasets. Thus, Sigma_hand can be estimated using Bayesian principles as a function of the three cues available, i.e., the proprioceptive cue, the visual cue, and the motor prediction cue. We added a detailed derivation of sigma_hand in the revision in Supplementary text 1.
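
      As a rough illustration of how the uncertainty of a combined estimate follows from three cues, the sketch below uses the standard inverse-variance (reliability-weighted) fusion of independent Gaussian cues. It is a textbook formulation and not necessarily the derivation given in Supplementary text 1. The function name and all numerical values are hypothetical, and the cue labels x_u, x_p, and x_v simply follow the notation used in this exchange.

```python
def combine_gaussian_cues(means, sigmas):
    """Reliability-weighted fusion of independent Gaussian cues.

    Returns the combined mean, the combined standard deviation,
    and the weight given to each cue.
    """
    precisions = [1.0 / s ** 2 for s in sigmas]   # reliability of each cue
    total_precision = sum(precisions)
    weights = [p / total_precision for p in precisions]
    x_hat = sum(w * m for w, m in zip(weights, means))
    sigma_hat = (1.0 / total_precision) ** 0.5
    return x_hat, sigma_hat, weights


# HYPOTHETICAL values in degrees, for illustration only (not fitted parameters).
cue_means = [0.0, 0.0, 16.0]    # x_u (motor prediction), x_p (proprioception), x_v (clamped cursor)
cue_sigmas = [5.0, 10.0, 20.0]  # sigma_u, sigma_p, sigma_v

x_hand_hat, sigma_hand, weights = combine_gaussian_cues(cue_means, cue_sigmas)
print(f"x_hand_hat = {x_hand_hat:.2f} deg, sigma_hand = {sigma_hand:.2f} deg")
print("weights (u, p, v):", [round(w, 3) for w in weights])
```

      In this formulation the combined uncertainty is always smaller than that of the most reliable single cue, and increasing sigma_v lowers the weight of the visual cue, so a larger but noisier visual error pulls the perceived hand less per degree.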

      Reviewer #3 (Recommendations For The Authors):

      We observed values in Fig 2C for the 64-degree perturbation that seem to be outliers, i.e., greater than 50 degrees. It is unclear how a psychometric curve could have a "slope" or JND of over 60, especially considering that the tested range was only 60. Since the data plotted in panel C is a collapse of the signed data in panel B, it is perplexing how such large data points were derived, particularly when the signed uncertainty values do not appear to exceed 30.

      Related to the previous point, we would also recommend connecting individual data points: if the uncertainty increases (linearly or otherwise), then people with low uncertainty at the middle distance should also have low uncertainty at the high distance, and people with high uncertainty at one point, should also have that at other distances. Or perhaps the best way to go about this is to use the uncertainty at the two smaller perturbations to predict uncertainty at the largest perturbation for each participant individually?

      Thank you for your suggestion to examine the consistency of individual levels of visual uncertainty across perturbation sizes. First, a sigma_v of 60 degrees is entirely possible and falls naturally out of the experimental data. It shows that some individuals indeed have large visual uncertainty. Given these potential outliers (which should not be readily removed, as we have no reason to do so), we estimated the linear function of sigma_v with a robust method, i.e., a GLM with a gamma distribution, which accommodates right-skewed distributions and can thus capture positive outliers. Furthermore, we added in our revision a verification test of our estimates of sigma_v: we used Exp2’s adaptation data to estimate sigma_v without assuming its linear dependency. As shown, the model-fitted sigma_v closely matched the estimates from Exp1 (see Supplementary text 2 and Figure S7).

      We re-plotted sigma_v with connected data points, and the data clearly indicate that individuals exhibit consistent levels of visual uncertainty across different perturbation sizes, i.e., those with relatively lower uncertainty at middle distances (in fact, angles) tend to exhibit relatively lower uncertainty at higher distances too, and similarly, those with higher uncertainty at one distance maintain that level of uncertainty at other distances. This is confirmed by a Spearman correlation analysis of the consistency of uncertainties across perturbation sizes among individuals. We observed significant correlations between perturbation angles, indicating good individual consistency (4 and 16 degrees, rho = 0.759, p<0.001; 16 and 64 degrees, rho = 0.527, p = 0.026).

      Author response image 4.
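
      For readers unfamiliar with the consistency check described above, the following sketch computes Spearman's rho as the Pearson correlation of rank-transformed values, using only the standard library. The per-participant numbers are made up purely for illustration and are not the study's data. The function names are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# HYPOTHETICAL per-participant uncertainties in degrees (not the study's data).
sigma_16 = [5.1, 7.4, 6.2, 9.8, 8.0]
sigma_64 = [18.0, 30.5, 22.1, 41.0, 27.3]

rho = spearman_rho(sigma_16, sigma_64)
print(f"Spearman rho between the 16 and 64 degree conditions: {rho:.2f}")
```

      A rank-based measure is a reasonable choice here because it is insensitive to the large positive outliers discussed above: only the ordering of participants matters, not the magnitude of their uncertainty.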

      The illustration in Fig 2A does not seem to show a stimulus that is actually used in the experiment (looks like about -30° perturbation). It would be good to show all possible endpoints with all other visual elements to scale - including the start-points of the PEST procedure.

      Thanks for the suggestion. We updated Fig 2A to show a stimulus of +16 degree, as well as added an additional panel to show all the possible endpoints.

      Finally (related to the previous point), in lines 589-591 it says the target is a blue cross. Then in lines 614-616, it says participants are to fixate the blue cross or the start position. The start position was supposed to have disappeared, so perhaps the blue plus moved to the start position (which could be the case, when looking at the bottom panel in Fig 2A, although in the illustration the plus did not move fully to the start position, just toward it to some degree). Perhaps the descriptions need to be clarified, or it should be explained why people had to make an eye movement before giving their judgments. And if people could have made either 1) no eye movement, but stayed at fixation, 2) moved to the blue plus as shown in the last panel in Fig 2A, or 3) fixated on the home position, we'd be curious to know if this affected participants' judgments.

      Thanks for pointing that out. The blue cross serves as the target in the movement task, then disappears with the cursor after 800 ms of frozen time. The blue cross then appears in the discrimination task at the center of the screen, i.e., the start location. Subjects were asked to fixate the blue cross during the visual discrimination task. Note that this return of fixation to the home position is exactly what we see in typical error-clamp adaptation: once the movement is over, people guide their hand back to the home position. We performed a pilot study to record the typical fixation pattern during error-clamp adaptation, and Exp1 was intentionally designed to mimic its fixation sequence. We have now updated the description of Figure 2A, emphasizing the stimulus sequence.

      In Figure 4A, the label "bias" is confusing as that is used for recalibrated proprioceptive sense of hand position as well as other kinds of biases elsewhere in the paper. What seems to be meant is the integrated hand position (x-hat_hand?) where all three signals are apparently combined. The label should be changed and/or it should be clarified in the caption.

      Thanks for pointing that out, it should be x_hand_hat, and we have corrected this in the revised version of Figure 4.

      In the introduction, it is claimed that larger perturbations have not been tested with "implicit adaptation" paradigms, but in the same sentence, a paper is cited (Morehead et al., 2017) that tests a rotation on the same order of magnitude as the largest one tested here (95°), as well as much larger rotations (135° and 175°), all with error-clamps. Interestingly, there is no adaptation in those conditions, which seems more in line with the sensory cue integration model. Can the PEA model explain these results as well? If so, this should be included in the paper, and if not, it should be discussed as a limitation.

      First, we double checked our manuscript and found that we never claimed that larger perturbations had not been tested.

      We agree that it is always good to have as many conditions as possible. However, the 135 and 175 degree conditions would lead to minimal adaptation, which would not help much in terms of model testing. We postulate that this lack of adaptation is simply due to the fact that people cannot see the moving cursor, or to some other unknown reasons. Our simple model is not designed to cover those kinds of extreme cases.

      Specify the size of the arc used for the proprioceptive tests in Exp 3 and describe the starting location of the indicator (controlled by the left hand). Ideally, the starting location should have varied across trials to avoid systematic bias.

      Thank you for the comments. The arc used in these tests, as detailed in the Methods section of our paper, has a 10 cm radius and is centered at the start position. This setup is visually represented as a red arc in Figure 7B.

      After completing each proprioceptive test trial, participants were instructed to position the indicator at approximately -180° on the arc and then relax their left arm. Although the starting location for the subsequent trial remained near -180°, it was not identical for every trial, thereby introducing slight variability.

      Please confirm that the proprioceptive biases plotted in Fig 4E are relative to the baseline.

      Thank you for bringing this to our attention. Yes, the proprioceptive biases illustrated in Figure 4E are indeed calculated relative to the baseline measurements. We have added this to the Methods section.

      Data availability: the data are available online, but there are some ways this can be improved. First, it would be better to use an open data format, instead of the closed, proprietary format currently used. Second, there is no explanation for what's in the data, other than the labels. (What are the units? What preprocessing was done?) Third, no code is made available, which would be useful for a computational model. Although rewriting the analyses in a non-proprietary language (to increase accessibility) is not a reasonable request at this point in the project, I'd encourage it for future projects. But perhaps Python, R, or Julia code that implements the model could be made available as a notebook of sorts so that other labs could look at (build on) the model starting with correct code - increasing the potential impact of this work.

      Great suggestions. We are also fully supportive of open data and open science. We now:

      (1) Updated our data and code repository to include the experimental data in an open data format (.csv) for broader accessibility.

      (2) The data are now accompanied by detailed descriptions to clarify their contents.

      (3) We have made the original MATLAB (.m) code for data analysis, model fitting and simulation available online.

      (4) We also provide the code in Jupyter Notebook (.ipynb) format.

      These updates can be found in the revised “Data Availability” section of our manuscript.

      References

      Bromberg, Z., Donchin, O., & Haar, S. (2019). Eye Movements during Visuomotor Adaptation Represent Only Part of the Explicit Learning. eNeuro, 6(6). https://doi.org/10.1523/ENEURO.0308-19.2019

      Burge, J., Ernst, M. O., & Banks, M. S. (2008). The statistical determinants of adaptation rate in human reaching. Journal of Vision, 8(4), 1–19.

      de Brouwer, A. J., Gallivan, J. P., & Flanagan, J. R. (2018). Visuomotor feedback gains are modulated by gaze position. Journal of Neurophysiology, 120(5), 2522–2531.

      Egly, R., & Homa, D. (1984). Sensitization of the visual field. Journal of Experimental Psychology. Human Perception and Performance, 10(6), 778–793.

      Kim, H. E., Parvin, D. E., & Ivry, R. B. (2019). The influence of task outcome on implicit motor learning. eLife, 8. https://doi.org/10.7554/eLife.39882

      Klein, S. A., & Levi, D. M. (1987). Position sense of the peripheral retina. JOSA A, 4(8), 1543–1553.

      Levi, D. M., Klein, S. A., & Yap, Y. L. (1987). Positional uncertainty in peripheral and amblyopic vision. Vision Research, 27(4), 581–597.

      Makino, Y., Hayashi, T., & Nozaki, D. (2023). Divisively normalized neuronal processing of uncertain visual feedback for visuomotor learning. Communications Biology, 6(1), 1286.

      Owsley, C., Ball, K., & Keeton, D. M. (1995). Relationship between visual sensitivity and target localization in older adults. Vision Research, 35(4), 579–587.

      Simani, M. C., McGuire, L. M. M., & Sabes, P. N. (2007). Visual-shift adaptation is composed of separable sensory and task-dependent effects. Journal of Neurophysiology, 98(5), 2827–2841.

      Tsay, J. S., Avraham, G., Kim, H. E., Parvin, D. E., Wang, Z., & Ivry, R. B. (2021). The effect of visual uncertainty on implicit motor adaptation. Journal of Neurophysiology, 125(1), 12–22.

      Tsay, J. S., Chandy, A. M., Chua, R., Miall, R. C., Cole, J., Farnè, A., Ivry, R. B., & Sarlegna, F. R. (2024). Minimal impact of proprioceptive loss on implicit sensorimotor adaptation and perceived movement outcome. bioRxiv : The Preprint Server for Biology. https://doi.org/10.1101/2023.01.19.524726

      Tsay, J. S., Kim, H., Haith, A. M., & Ivry, R. B. (2022). Understanding implicit sensorimotor adaptation as a process of proprioceptive re-alignment. eLife, 11, e76639.

      Wei, K., Stevenson, I. H., & Körding, K. P. (2010). The uncertainty associated with visual flow fields and their influence on postural sway: Weber’s law suffices to explain the nonlinearity of vection. Journal of Vision, 10(14), 4.

      White, J. M., Levi, D. M., & Aitsebaomo, A. P. (1992). Spatial localization without visual references. Vision Research, 32(3), 513–526.

    1. eLife assessment

      This important study provides insights into how the brain constructs categorical neural representations during a difficult auditory target detection task. Through recordings of simultaneous single-unit activity in primary and secondary auditory areas, compelling evidence is provided that categorical neural representations emerge in a secondary auditory area, i.e., PEG. The study is of interest to neuroscientists and can also potentially shed light on human psychological studies.

    2. Reviewer #1 (Public Review):

      This is a very interesting paper which addresses how auditory cortex represents sound while an animal is performing an auditory task. The study involves psychometric and neurophysiological measurements from ferrets engaged in a challenging tone in noise discrimination task, and relates these measurements using neurometric analysis. A novel neural decoding technique (decoding-based dimensionality reduction or dDR, introduced in a previous paper by two of the authors) is used to reduce bias so that stimulus parameters can be read out from neuronal responses.

      The central finding of the study is that, when an animal is engaged in a task, non-primary auditory cortex represents task-relevant sound features in a categorical way. In primary cortex, task engagement also affects representations, but in a different way - the decoding is improved (suggesting that representations have been enhanced), but is not categorical in nature. The authors argue that these results are compatible with a model where early sensory representations form an overcomplete representation of the world, and downstream neurons flexibly read out behaviourally relevant information from these representations.

      I find the concept and execution of the study very interesting and elegant. The paper is also commendably clear and readable. The differences between primary and higher cortex are compelling and I am largely convinced by the authors' claim that they have found evidence that broadly supports a mixed selectivity model of neural disentanglement along the lines of Rigotti et al (2013). I think that the increasing body of evidence for these kinds of representations is a significant development in our understanding of higher sensory representations. I also think that the dDR method is likely to be useful to researchers in a variety of fields who are looking to perform similar types of neural decoding analysis.

    3. Reviewer #2 (Public Review):

      This study compares the activity of neural populations in the primary and non-primary auditory cortex of ferrets while the animals actively behaved or passively listened to a sound discrimination task. Using a variety of methods, the authors convincingly show differential effects of task engagement on population neural activity in primary vs non-primary auditory cortex; notably that in the primary auditory cortex, task-engagement (1) improves discriminability for both task-relevant and non-task relevant dimensions, and (2) improves the alignment between covariability and sound discrimination axes; whereas in the non-primary auditory cortex, task-engagement (1) improves discriminability for only task-relevant dimensions, and (2) does not affect the alignment between covariability and sound discrimination axes. They additionally show that task-engagement changes in gain can account for the selectivity noted in the discriminability of non-primary auditory neurons. They also admirably attempt to isolate task-engagement from arousal fluctuations, by using fluctuations in pupil size as a proxy for physiological arousal. This is a well-carried out study with thoughtful analyses which in large part achieves its aims to evaluate how task-engagement changes neural activity across multiple auditory regions. As with all work, there are several caveats or areas for future study/analysis. First, the sounds used here (tones, and narrow-band noise) are relatively simple sounds; previous work suggests that exactly what activity is observed within each region (e.g., sensory only, decision-related, etc) may depend in part upon what stimuli are used. Therefore, while the current study adds importantly to the literature, future work may consider the use of more varied stimuli. Second, the animals here were engaged in a behavioral task; but apart from an initial calculation of behavioral d', the task performance (and its effect on neural activity) is largely unaddressed.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      …I find the concept and execution of the study very interesting and elegant. The paper is also commendably clear and readable. The differences between primary and higher cortex are compelling and I am largely convinced by the authors' claim that they have found evidence that broadly supports a mixed selectivity model of neural disentanglement along the lines of Rigotti et al (2013). I think that the increasing body of evidence for these kinds of representations is a significant development in our understanding of higher sensory representations. I also think that the dDR method is likely to be useful to researchers in a variety of fields who are looking to perform similar types of neural decoding analysis.

      Thanks! We agree that questions around population coding and high-level representations are critical in the field of sensory systems.

      Reviewer #2 (Public Review):

      ... This is a well-carried out study with thoughtful analyses which in large part achieves its aims to evaluate how task-engagement changes neural activity across multiple auditory regions. As with all work, there are several caveats or areas for future study/analysis. First, the sounds used here (tones, and narrow-band noise) are relatively simple sounds; previous work suggests that exactly what activity is observed within each region (e.g., sensory only, decision-related, etc) may depend in part upon what stimuli are used. Therefore, while the current study adds importantly to the literature, future work may consider the use of more varied stimuli. Second, the animals here were engaged in a behavioral task; but apart from an initial calculation of behavioral d', the task performance (and its effect on neural activity) is largely unaddressed.

      The reviewer makes several important points that we hope we addressed in the specific changes detailed below. Indeed, it is important to recognize the possibility that the specific stimuli involved in a task may interact with the effects of behavioral state and that variability in task performance should be considered as an important aspect of behavioral state.

      Reviewer #1 (Recommendations For The Authors):

      I have a few minor comments and criticisms:

      (1) Figure 1c. The choice of low-contrast grey text (e.g. "Target vs. target") is unfortunate, especially when printed, and should be replaced (e.g. with dark grey).

      We have edited the figure to use a higher contrast (dark grey). Thanks for catching this.

      (2) Figure 2 and Supplementary Figure 3. I think some indication of error or significance is required in all panels. Without this, it's hard to interpret any of these panels.

      Thank you for this feedback. Including significance here was clarifying and helps to strengthen our claim that state-dependent changes in neural activity were smaller and more diverse for single neurons than at the population level. We modified Figure 2b-c to indicate whether each neuron’s response to the target stimulus was significantly different than its response to the catch stimulus. The same test was performed in Supplementary Figure 3. Additionally, we added a statistical test in Figure 2d-e to indicate, for each pair of target/catch stimuli, whether discrimination (d-prime) changed significantly between active and passive conditions. Furthermore, we modified the text of the second paragraph under the results heading: “Diverse effects of task engagement on single neurons in primary and non-primary auditory cortex” to reference and interpret the results of these significance tests. The new text reads as follows (L. 121):

      “Sound-evoked spiking activity was compared between active and passive states to study the impact of task engagement on sound representation. In both A1 and dPEG, responses to target and catch stimuli were significantly discriminable for a subset of single neurons (about 25% in both areas, Figure 2A-C, Supplemental Figures 3-5, bootstrap test). This supports the idea that stimulus identity can be decoded in both brain regions, regardless of task performance. However, the fact that the responses of most neurons in both brain areas could not significantly discriminate target vs. catch stimuli also highlights the diversity of sound encoding observed at the level of single neurons. The accuracy of catch vs. target discrimination for each neuron was quantified using neural d-prime, the z-scored difference in target minus catch spiking response for each neuron (Methods: Single neuron PSTHs and d-prime (Niwa et al., 2012a)). Task engagement was associated with significant changes in catch vs. target d-prime for roughly 10% of neurons in both A1 (40 / 481 neurons, bootstrap test) and dPEG (33 / 377 neurons, bootstrap test). This included neurons that both increased their discriminability and decreased their discriminability (Figure 2D-E). Thus, the effects of task engagement at the level of single neurons were relatively mild and inconsistent across the population; many neurons showed no significant change and of those that did, effects were bidirectional (Figure 2D-E).”

      We also included an additional methods paragraph in the “Statistical tests” section to describe the bootstrapping procedure used for these significance tests (L. 644):

      “The one exception to this general approach is in Figure 2, where we analyzed the sound discrimination abilities of single neurons. In this case, we computed p-values for each neuron and stimulus independently. First, for each neuron and catch vs. target stimulus pair, we measured d-prime (see Methods: Single neuron evoked activity and d-prime). We generated a null distribution of d-prime values for each neuron-stimulus pair, under each experimental condition by shuffling stimulus identity across trials before computing d-prime (100 resamples). A neuron was determined to have a significant d-prime for a given target vs. catch pair if its actual measured d-prime was greater than the 95th percentile of the null d-prime distribution. Second, for each neuron and catch vs. target stimulus pair, we tested if d-prime was significantly different between active and passive conditions. To test this, we followed a similar procedure as above, however, rather than shuffle stimulus identity, we shuffled active vs. passive trial labels. This allowed us to generate a null distribution of active vs. passive d-prime difference for each neuron and stimulus pair. A neuron was determined to have a significant change in d-prime between conditions if the actual Δ d-prime lay outside the 95% confidence interval of the null Δ d-prime distribution.”
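      The shuffle procedure described in the quoted methods paragraph can be sketched as follows. This is a minimal illustration and not the authors' code: the function names (`dprime`, `shuffle_test`, `delta_dprime_test`), the pooled-standard-deviation form of d-prime, and the input layout (one array of single-trial spike counts per stimulus and condition) are all assumptions made for the example.

```python
import numpy as np

def dprime(a, b):
    """d-prime between two sets of single-trial spike counts (assumed pooled-SD form)."""
    pooled_sd = np.sqrt(0.5 * (np.var(a, ddof=1) + np.var(b, ddof=1)))
    if pooled_sd == 0:
        return 0.0
    return (np.mean(a) - np.mean(b)) / pooled_sd

def shuffle_test(target, catch, n_resamples=100, rng=None):
    """Test 1: is target vs. catch d-prime above the 95th percentile of a
    null distribution built by shuffling stimulus identity across trials?"""
    rng = np.random.default_rng(rng)
    observed = dprime(target, catch)
    pooled = np.concatenate([target, catch])
    n_t = len(target)
    null = np.empty(n_resamples)
    for i in range(n_resamples):
        perm = rng.permutation(pooled)          # shuffle stimulus labels
        null[i] = dprime(perm[:n_t], perm[n_t:])
    threshold = np.percentile(null, 95)
    return observed, threshold, bool(observed > threshold)

def delta_dprime_test(t_act, c_act, t_pas, c_pas, n_resamples=100, rng=None):
    """Test 2: does d-prime differ between active and passive trials?
    Active/passive labels are shuffled within each stimulus."""
    rng = np.random.default_rng(rng)
    observed = dprime(t_act, c_act) - dprime(t_pas, c_pas)
    t_all = np.concatenate([t_act, t_pas])
    c_all = np.concatenate([c_act, c_pas])
    n_ta, n_ca = len(t_act), len(c_act)
    null = np.empty(n_resamples)
    for i in range(n_resamples):
        tp = rng.permutation(t_all)             # shuffle state labels (target trials)
        cp = rng.permutation(c_all)             # shuffle state labels (catch trials)
        null[i] = dprime(tp[:n_ta], cp[:n_ca]) - dprime(tp[n_ta:], cp[n_ca:])
    lo, hi = np.percentile(null, [2.5, 97.5])   # 95% interval of the null
    return observed, (lo, hi), bool(observed < lo or observed > hi)
```

      With only 100 resamples, as stated in the quoted text, the percentile estimates are coarse; the sketch keeps that default for fidelity to the description, but `n_resamples` can be raised.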

      For Figure 2a, we chose not to indicate significance on the figure to avoid clutter, since the significance for all neurons in the population is shown in panels b-c anyway. Additionally, the difference plot shown in panel a is in units of z-scores, which we believe already gives a raw sense of the significance of the target vs. catch response change per neuron in this example dataset.

      (3) Figure 2 and Supplementary Figure 3. I would consider including some more examples as a Supplementary Figure (and perhaps combining Supp Fig 3 with Fig 2 as a main figure).

      We found no significant or apparent difference in single-neuron properties between A1 and dPEG. Therefore, we decided it is not helpful to plot both A1 and PEG examples in the main text. However, we agree that the ability to see more examples of the raw data could be useful. Therefore, we compiled two supplementary figures (Supplementary Figures 4 and 5) that replicate Figure 2a for all datasets, encompassing A1 and PEG.

      (4) Figure 2a and Supp Fig 3a. I was initially confused that the "delta-spk/sec (z-score)" values had themselves been z-scored, but now I think that they are simply the differences of the two left hand sub-panels. This could be made clear in the figure legend.

      The figure legends have been modified to state the procedure for computing “delta-spk/sec” more clearly. Specifically, we added the following information to the legend (L. 141):

      “Difference is computed as the z-scored response to the target minus the z-scored catch response (resulting in a difference shown in units of z-score).”

      (5) Figure 2b-e and Supp Fig 3b-e. Indicate the time window over which the responses were measured, and the number of neurons.

      Figure legends have been modified to include a sentence clearly stating the time window over which responses were measured. The number of neurons is also now included in the legend and on the figure itself. Furthermore, a brief description of the new statistical testing procedure has been added here (L. 144).

      “Responses were defined as the total number of spikes recorded during the 300 ms of sound presentation (area between dashed lines in panel A). Neurons with a significantly different response to the catch vs. target stimulus are indicated in black and quantified on the respective figure panel.”

      (6) Figure 2. "singe" should read "single"

      Typo in figure label has been fixed.

      (7) Line 144. Figure number is missing (Figure 3B-C).

      The missing figure number has been added to the text.

      (8) Figure 3. Again, the low-contrast grey should be replaced.

      The low-contrast grey has been replaced with dark grey.

      Reviewer #2 (Recommendations For The Authors):

      This study really nicely compares the activity and effects on activity in two areas of the auditory cortex in respect to task-engagement; I think it is, for the most part, very well done.

      A couple of specific recommendations:

      (1) Although I understand 'inf dB' as the SNR, including the actual dB level used in the experiments would be useful, especially in the case of the inf dB.

      Thank you for this feedback. We agree that clarification about the overall sound level used here would be helpful. We have modified the methods section “Behavioral paradigm” to include the following sentence (L. 450):

      “That is, the masking noise (and distractor stimuli) were always presented with an overall sound level of 60 dB SPL. Infinite (inf) dB trials corresponded to trials where the target tone was presented at 60 dB SPL without any masking noise present, 0 dB to trials where the target was 60 dB SPL, -5 dB to trials where the target was presented at 55 dB SPL etc.”

      In addition, we have modified the main text (L. 82):

      “Animals reported the occurrence of a target tone in a sequence of narrowband noise distractors by licking a piezo spout (Figure 1A, Methods: Behavioral paradigm, distractor stimulus sound level: 60 dB SPL). … We describe SNR as the overall SPL of the target relative to distractor noise level. Thus, an SNR of –5 dB corresponds to a target level of 55 dB SPL while an Inf dB SNR corresponds to a target tone presented without any masking noise.”

      And Figure legend 1 now explicitly states the sound level used in the experiments (L. 104):

      “Variable SNR was achieved by varying overall SPL of the target relative to the fixed (60 dB SPL) distractor noise, e.g., -5 dB SNR corresponds to a 55 dB SPL target with 60 dB SPL masking noise. Infinite (inf) dB SNR corresponds to a target tone presented in isolation (60 dB SPL).”
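      The SNR convention stated in the quoted text reduces to simple arithmetic, sketched here for concreteness. The function name `target_level` and its return layout are hypothetical; only the 60 dB SPL noise level and the example mappings come from the quoted passages.

```python
import math

NOISE_LEVEL_DB_SPL = 60.0  # fixed distractor/masking noise level from the quoted text

def target_level(snr_db, noise_db_spl=NOISE_LEVEL_DB_SPL):
    """Return (target level in dB SPL, whether masking noise is present)."""
    if math.isinf(snr_db):
        # Inf dB SNR: target tone presented in isolation at 60 dB SPL
        return noise_db_spl, False
    # Finite SNR: target level is set relative to the fixed noise level
    return noise_db_spl + snr_db, True
```

      For example, `target_level(-5)` gives a 55 dB SPL target with the masker present, and `target_level(float("inf"))` gives a 60 dB SPL target with no masker.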

      (2) I very much appreciate the attempt to disentangle task engagement from generalized arousal state, and specifically, addressing this through the use of pupillometry. However, by focusing the discussion of pupil dynamics solely on the arousal-state aspects of pupil size, the paper doesn't address the increasing evidence suggesting that pupil size may fluctuate based upon a lot of other things, including perceptual events (see Kronemer et al, 2022 for a recent human paper; for auditory: Zekveld et al 2018 (review) and Montes-Lourido et al, 2021; but many many others, too). It would be nice to see either a bit more nuanced discussion of what pupil size may be indicating (easier), or analyzing the behavior in the context of pupil dynamics (a heavier lift).

      This is a good point. We agree that it is worth mentioning these more nuanced aspects of cognition that may be reflected by pupil size. Therefore, we also analyzed pupil size in the context of behavioral performance (see Supplemental Figure 6) and added the following text to the results (L. 193).

      “In addition to reflecting overall arousal level, pupil size has also been reported to reflect more nuanced cognitive variables such as, for example, listening effort (Zekveld et al., 2014). Furthermore, rodent data suggests that optimal sensory detection is associated with intermediate pupil size (McGinley et al., 2015), consistent with the hypothesis of an inverted-U relationship between arousal and behavioral performance (Zekveld et al., 2014). To determine if this pattern was true for the animals in our task, we measured the dynamics of pupil size in the context of behavioral performance. Across animals, task stimuli evoked robust pupil dilation that varied with trial outcome (Supplemental Figure 6b-c). Notably, pre-trial pupil size was significantly different between correct (hit and correct reject), hit, and miss trials (Supplemental Figure 6b-c), recapitulating the finding of an inverted-U relationship to performance in rodents (McGinley et al., 2015). Since we focused only on correct trials in our decoding analysis, these outcome-dependent differences in pupil size are unlikely to contribute to the emergent decoding selectivity in dPEG.”

      (3) I think it would make this paper shine that much more if behavioral performance were not subsumed into the overall label of task engagement. You've already established you have performance that varies as a function of SNR; I would love to see the neural d' and covariability related to the behavioral d' (in the comparisons where this is possible). I would also love to see a more direct measure of choice for those stimuli that show variable behavior (e.g., a choice probability analysis or something of the like would seem to be easily applied to the target SNRs of -5 and 0 dB); and compare task engaged activity of hits vs misses vs passive listening to those same stimuli. You discuss previous studies looking at choice-related/decision-related activity and draw parallels to this work; given that there is the opportunity with this data set to *directly* assess choice-related activity, the absence of such an analysis seems like a missed opportunity.

      Thank you for this feedback. We agree that “task engagement” is not a unimodal state and that a more fine-grained analysis of task-engaged neural activity, according to behavioral choice, could be informative.

      First, we would like to point out that in Figure 4 we did already compare behavioral d’ to delta neural d’. We found that the two were significantly correlated in dPEG, but not in A1. This suggests that task-dependent changes in stimulus decoding in dPEG, but not A1, are predictive of behavioral performance. This is consistent with the finding that task-relevant stimulus representations were selectively enhanced in dPEG, but not in A1.

      Second, we added a choice decoding analysis to address whether auditory cortex represents the animal’s choice in our task. The results of this analysis are summarized in Supplemental Figure 8 and are discussed under the results section: “Behavioral performance is correlated with neural coding changes in non-primary auditory cortex only.” (L. 226):

      “The previous analysis suggests that the task-dependent increase in stimulus information present in dPEG population activity is predictive of overall task performance. Next, we asked whether the population activity in either brain region was directly predictive of behavioral choice on single hit vs. miss trials. To do this, we conducted a choice probability analysis (Methods). We found that in both brain regions choice could be decoded well above chance level (Supplemental Figure 8). Choice information was present throughout the entire trial and did not increase during the target stimulus presentation. This suggests that the difference in population activity primarily reflects a cognitive state associated with the probability of licking on a given trial, or “impulsivity” rather than “choice.” This interpretation is consistent with our finding that baseline pupil size on each trial is predictive of trial outcome (Supplemental Figure 6b).”

      To keep our decoding approach consistent throughout the manuscript, we followed the same approach for choice decoding as we did for stimulus decoding (perform dDR then calculate neural d-prime in the dimensionality reduced space). To make the results more interpretable, we converted choice d-prime to a choice probability (percent correctly decoded choices) using leave-one-out cross validation. (We note that d-prime and percent correct are very highly correlated statistics.) This is described in the methods as follows (L. 550):

      “We performed a choice decoding analysis on hit vs. miss trials. We followed the same procedure as described above for stimulus decoding, where instead of a pair of stimuli our two classes to be decoded were “hit trial” vs. “miss trial”. That is, for each target stimulus we computed the optimal linear discrimination axis separating hit vs. miss trials (Abbott and Dayan, 1999) in the reduced dimensionality space identified with dDR (Heller and David, 2022). For the sake of interpretability with respect to previous work we reported choice probability as the percentage of correctly decoded trial outcomes rather than d-prime. Percent correct was calculated by projecting the population activity onto the optimal discrimination axis and using leave-one-out cross validation to measure the number of correct classifications.”
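      The leave-one-out step described in the quoted methods can be sketched as below. This is an illustrative approximation, not the authors' implementation: it assumes the population activity has already been projected into a low-dimensional space (for example by dDR, which is not reproduced here), uses a pooled-covariance linear discriminant as the "optimal linear discrimination axis", and places the decision criterion midway between the projected class means. The function name `loo_choice_probability` is hypothetical.

```python
import numpy as np

def loo_choice_probability(hit, miss):
    """Percent of trials whose outcome (hit vs. miss) is decoded correctly.

    hit, miss: arrays of shape (n_trials, n_dims), single-trial population
    activity in an already dimensionality-reduced space (assumption).
    """
    X = np.vstack([hit, miss])
    y = np.r_[np.ones(len(hit), dtype=bool), np.zeros(len(miss), dtype=bool)]
    n_correct = 0
    for i in range(len(X)):
        train = np.ones(len(X), dtype=bool)
        train[i] = False                        # hold out one trial
        Xh, Xm = X[train & y], X[train & ~y]
        mu_h, mu_m = Xh.mean(axis=0), Xm.mean(axis=0)
        # Pooled covariance of the two classes, estimated on training trials only
        cov = 0.5 * (np.cov(Xh, rowvar=False) + np.cov(Xm, rowvar=False))
        # Linear discrimination axis: w = cov^-1 (mu_hit - mu_miss)
        w = np.linalg.solve(np.atleast_2d(cov), mu_h - mu_m)
        criterion = 0.5 * (mu_h + mu_m) @ w     # midpoint between class means
        predicted_hit = X[i] @ w > criterion
        n_correct += int(predicted_hit == y[i])
    return 100.0 * n_correct / len(X)
```

      Fitting the axis on the training trials only, and scoring the held-out trial, keeps the percent-correct estimate from being inflated by overfitting, which is the purpose of the cross validation mentioned in the text. Chance level in this sketch is 50% only when hit and miss trial counts are balanced.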

      (4) It would also be interesting to look at population coding across sessions (although the point is taken that within a session allows the opportunity to assess covariability). Minorly self-servingly but very much related to the above point, Christison-Lagay et al, 2017 employed a similar detect-in-noise task, analyzed single neurons and population level activity, and looked at putative choice-related activity. The current study has the opportunity to expand on that kind of analysis that much more by looking across multiple sites vs within a given recording site; and compare across regions.

      Thank you for highlighting this point; we agree that it is important. When studying population coding it is critical to consider the impact of covariability between neurons. Therefore, it is worthwhile to revisit our interpretations of prior results, e.g., Christison-Lagay et al, 2017, which studied population coding by combining neurons across different sessions, given that we now have access to simultaneously recorded population data.

      First, we would like to point out that this was the primary motivation for our simulation analyses presented in Figure 5. Using simulations, we found that task-dependent gain modulation (which can be observed across sessions) was sufficient to explain our primary finding – selective enhancement in decoding of behaviorally relevant sound stimuli in dPEG.

      Second, to address the question about how covariability affects choice-related information in auditory cortex and compare our findings with prior studies, we performed the same set of simulations for choice probability analysis. We found that, again, choice-dependent gain modulation was sufficient to explain our findings. That is, simulations with hit- vs. miss-dependent gain changes, but fixed covariability, closely mirrored the choice probability we observed in the raw data. An additional simulation where covariability between all neurons was set to zero also recapitulated our findings in the raw data. Collectively, this suggests that covariability does not play a significant role in shaping the choice information present in A1 and dPEG during this task. We have added the following text to the manuscript to summarize this finding (L. 293):

      “Finally, we used the same simulation approach to determine what aspects of population activity carry the “choice” related information we observed in A1 and dPEG (Figure 4 – figure supplement 1). Similar to our findings for stimulus decoding, we found that gain modulation alone was sufficient to recapitulate the choice information present in the raw data for this task. This helps frame prior work that pooled neurons across sessions to study population coding of choice in similar auditory discrimination tasks (Christison-Lagay et al, 2017).”

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The study introduces and validates the Cyclic Homogeneous Oscillation (CHO) detection method to precisely determine the duration, location, and fundamental frequency of non-sinusoidal neural oscillations. Traditional spectral analysis methods face challenges in distinguishing the fundamental frequency of non-sinusoidal oscillations from their harmonics, leading to potential inaccuracies. The authors implement an underexplored approach, using the auto-correlation structure to identify the characteristic frequency of an oscillation. By combining this strategy with existing time-frequency tools to identify when oscillations occur, the authors strive to solve outstanding challenges involving spurious harmonic peaks detected in time-frequency representations. Empirical tests using electrocorticographic (ECoG) and electroencephalographic (EEG) signals further support the efficacy of CHO in detecting neural oscillations.

      Response: We thank the reviewer for recognizing the strengths of our method in this encouraging review and for the opportunity to further improve and finalize our manuscript.

      Strengths:

      (1) The paper puts an important emphasis on the 'identity' question of oscillatory identification. The field primarily identifies oscillations through frequency, space (brain region), and time (length, and relative to task or rest). However, more tools that claim to further characterize oscillations by their defining/identifying traits are needed, in addition to data-driven studies about what the identifiable traits of neural oscillations are beyond frequency, location, and time. Such tools are useful for potentially distinguishing between circuit mechanistic generators underlying signals that may not otherwise be distinguished. This paper states this problem well and puts forth a new type of objective for neural signal processing methods.

      Response: We sincerely appreciate this encouraging summary of the objective of our manuscript.

      (2) The paper uses synthetic data and multimodal recordings at multiple scales to validate the tool, suggesting CHO's robustness and applicability in various real-data scenarios. The figures illustratively demonstrate how CHO works on such synthetic and real examples, depicting in both time and frequency domains. The synthetic data are well-designed, and capable of producing transient oscillatory bursts with non-sinusoidal characteristics within 1/f noise. Using both non-invasive and invasive signals exposes CHO to conditions which may differ in extent and quality of the harmonic signal structure. An interesting followup question is whether the utility demonstrated here holds for MEG signals, as well as source-reconstructed signals from non-invasive recordings.

      Response: We thank the reviewer for this excellent suggestion. Indeed, our next paper will focus on applying our CHO method to signals that were source-reconstructed from non-invasive recordings (e.g., MEG and EEG) to extract their periodic activity.

      (3) This study is accompanied by open-source code and data for use by the community.

      Response: We thank the reviewer for recognizing our effort to widely disseminate our method to the broader community.

      Weaknesses:

      (1) Due to the proliferation of neural signal processing techniques that have been designed to tackle issues such as harmonic activity, transient and event-like oscillations, and non-sinusoidal waveforms, it is naturally difficult for every introduction of a new tool to include exhaustive comparisons of all others. Here, some additional comparisons may be considered for the sake of context, a selection of which follows, biased by the previous exposure of this reviewer. One emerging approach that may be considered is known as state-space models with oscillatory and autoregressive components (Matsuda 2017, Beck 2022). State-space models such as autoregressive models have long been used to estimate the auto-correlation structure of a signal. State-space oscillators have recently been applied to transient oscillations such as sleep spindles (He 2023). Therefore, state-space oscillators extended with auto-regressive components may be able to perform the functions of the present tool through different means by circumventing the need to identify them in time-frequency. Another tool that should be mentioned is called PAPTO (Brady 2022). Although PAPTO does not address harmonics, it detects oscillatory events in the presence of 1/f background activity. Lastly, empirical mode decomposition (EMD) approaches have been studied in the context of neural harmonics and nonsinusoidal activity (Quinn 2021, Fabus 2022). EMD has an intrinsic relationship with extrema finding, in contrast with the present technique. In summary, the existence of methods such as PAPTO shows that researchers are converging on similar approaches to tackle similar problems. The existence of time-domain approaches such as state-space oscillators and EMD indicates that the field of timeseries analysis may yield even more approaches that are conceptually distinct and may theoretically circumvent the methodology of this tool.

      Response:  We thank the reviewer for this valuable insight.  In our manuscript, we acknowledge emerging approaches that employ state-space models or EMD for time-frequency analysis.  However, it's crucial to clarify that the primary focus in our study is on the detection and identification of the fundamental frequency, as well as the onset/offset of non-sinusoidal neural oscillations.  Thus, our emphasis lies specifically on these aspects.  We hope that future studies will use our methods as the basis to develop better methods for time-frequency analysis that will lead to a deeper understanding of harmonic structures.  

      Our Limitation section addresses this issue.  Specifically, we recognize that a more sophisticated time-frequency analysis could contribute to improved sensitivity and that the core claim of our study is centered around the concept of increasing specificity in the detection of non-sinusoidal oscillations.  We hope that future studies will use this as a basis for improving time-frequency analysis in general.  Notably, our open-source code will facilitate these future studies.  Specifically, in the first step of our algorithm, the time-frequency estimation can be replaced with any other preferred time-frequency analysis, such as state-space models, EMD, wavelet transform, Gabor transform, and matching pursuit.

      For our own follow-up study, we plan to conduct a thorough review and comparison of emerging approaches employing state-space models or EMD for time-frequency analysis.  In that follow-up study, we aim to identify which approach, including the six methods mentioned by the reviewer (Matsuda 2017, Beck 2022, He 2023, Brady 2022, Quinn 2021, and Fabus 2022), yields the most accurate estimate of the fundamental frequency of non-sinusoidal neural oscillations when combined with CHO.  The insights provided by the reviewer are appreciated, and we will carefully consider these aspects in our follow-up study.

      In the revision of this manuscript, we are setting the stage for these future studies.  Specifically, we added a discussion paragraph within the Limitation section about state-space model and EMD approaches:

      “However, because our CHO method is modular, the FFT-based time-frequency analysis can be replaced with more sophisticated time-frequency estimation methods to improve the sensitivity of neural oscillation detection.  Specifically, a state-space model (Matsuda 2017, Beck 2022, He 2023, Brady 2022) or empirical mode decomposition (EMD, Quinn 2021, Fabus 2022) may improve the estimation of the auto-correlation of the harmonic structure underlying nonsinusoidal oscillations.  Furthermore, a Gabor transform or matching pursuit-based approach may improve the onset/offset detection of short burst-like neural oscillations (Kus 2013 and Morales 2022).”

      (2) The criteria that the authors use for neural oscillations embody some operating assumptions underlying their characteristics, perhaps informed by immediate use cases intended by the authors (e.g., hippocampal bursts). The extent to which these assumptions hold in all circumstances should be investigated. For instance, the notion of consistent auto-correlation breaks down in scenarios where instantaneous frequency fluctuates significantly at the scale of a few cycles. Imagine an alpha-beta complex without harmonics (Jones 2009). If oscillations change phase position within a timeframe of a few cycles, it would be difficult for a single peak in the auto-correlation structure to elucidate the complex time-varying peak frequency in a dynamic fashion. Likewise, it is unclear whether bounding boxes with a pre-specified overlap can capture complexes that maneuver across peak frequencies.

      Response:  We thank the reviewer for this valuable insight into the methodological limitations in the detection of neural oscillations that exhibit significant fluctuations in their instantaneous frequency.  Indeed, our CHO method is also limited in its ability to detect oscillations with fluctuating instantaneous frequencies.  This is because CHO uses an auto-correlation-based approach to detect neural oscillations that exhibit two or more cycles.  If oscillations change phase position within a timeframe of a few cycles, CHO cannot detect the oscillation because the periodicity is not expressed within the auto-correlation.  This limitation can be partially overcome by relaxing the detection threshold (see Line 30 of Algorithm 1 in the revised manuscript) for the auto-correlation analysis.  However, relaxing the detection threshold consequently increases the probability of detecting other aperiodic activity as well.  To clarify how CHO determines the periodicity of oscillations, and to educate the reader about the tradeoff between detecting oscillations with fluctuating instantaneous frequencies and avoiding the detection of other aperiodic activity, we have added pseudo code and a new subsection in the Methods.
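      The auto-correlation-based periodicity check described in this response can be illustrated with a minimal sketch. This is not the CHO implementation: the function names, the white-noise standard error 1/sqrt(n) used to scale the threshold, and the 10 Hz test signal are all assumptions made for illustration. Only the idea of selecting positive auto-correlation peaks above an adjustable threshold comes from the text.

```python
import math


def autocorrelation(x, max_lag):
    """Biased sample autocorrelation for lags 0..max_lag (pure Python)."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    var = sum(v * v for v in centered)
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / var
        for lag in range(max_lag + 1)
    ]


def first_significant_peak(acf, num_std, n_samples):
    """Lag of the first positive local maximum above the threshold, else None.

    The threshold num_std / sqrt(n_samples) is an ASSUMED standard error
    (the usual white-noise approximation), not the authors' definition.
    """
    threshold = num_std / math.sqrt(n_samples)
    for lag in range(1, len(acf) - 1):
        is_peak = acf[lag] > acf[lag - 1] and acf[lag] >= acf[lag + 1]
        if is_peak and acf[lag] > threshold:
            return lag
    return None


fs = 200  # hypothetical sampling rate in Hz
stable = [math.sin(2 * math.pi * 10 * i / fs) for i in range(fs)]  # 10 Hz, 1 s
acf = autocorrelation(stable, max_lag=60)

# A relaxed threshold finds the period of the stable oscillation.
lag = first_significant_peak(acf, num_std=1.0, n_samples=len(stable))
fundamental_hz = fs / lag

# A very strict threshold rejects even this clean oscillation.
strict = first_significant_peak(acf, num_std=14.0, n_samples=len(stable))
```

      In this sketch, lowering `num_std` lets weaker or less regular periodicity pass, at the cost of admitting aperiodic activity, while raising it does the opposite.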

      Author response table 1.

      Algorithm 1

      A new subsection titled “Tradeoffs in adjusting the hyper-parameters that govern the detection in CHO”.

      “The ability of CHO to detect neural oscillations and determine their fundamental frequency is governed by four principal hyper-parameters.  Adjusting these parameters requires understanding their effect on the sensitivity and specificity in the detection of neural oscillations. 

      The first hyper-parameter is the number of time windows (N in Line 5 in Algorithm 1), that is used to estimate the 1/f noise.  In our performance assessment of CHO, we used four windows, resulting in estimation periods of 250 ms in duration for each 1/f spectrum.  A higher number of time windows results in smaller estimation periods and thus minimizes the likelihood of observing multiple neural oscillations within this time window, which otherwise could confound the 1/f estimation.  However, a higher number of time windows and, thus, smaller time estimation periods may lead to unstable 1/f estimates. 

      The second hyper-parameter defines the minimum number of cycles of a neural oscillation to be detected by CHO (see Line 23 in Algorithm 1).  In our study, we specified this parameter to be two cycles.  Increasing the number of cycles increases specificity, as it will reject spurious oscillations.  However, increasing the number also reduces sensitivity as it will reject short oscillations.

      The third hyper-parameter is the significance threshold that selects positive peaks within the auto-correlation of the signal.  The magnitude of the peaks in the auto-correlation indicates the periodicity of the oscillations (see Line 26 in Algorithm 1).  Referred to as "NumSTD," this parameter denotes the number of standard errors that a positive peak has to exceed to be selected to be a true oscillation.  For this study, we set the "NumSTD" value to 1.  Increasing the "NumSTD" value increases specificity in the detection as it reduces the detection of spurious peaks in the auto-correlation.  However, increasing the "NumSTD" value also decreases the sensitivity in the detection of neural oscillations with varying instantaneous oscillatory frequencies. 

      The fourth hyper-parameter is the percentage of overlap between two bounding boxes that trigger their merger (see Line 31 in Algorithm 1).  In our study, we set this parameter to 75% overlap.  Increasing this threshold yields more fragmentation in the detection of oscillations, while decreasing this threshold may reduce the accuracy in determining the onset and offset of neural oscillations.”
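      The four hyper-parameters above can be collected in a small configuration object. The following is a hypothetical sketch, not the interface of the CHO toolbox: the class and field names are invented for illustration, the 1-second segment length is inferred from the reported 250 ms windows, and the overlap is computed on time intervals relative to the shorter interval, which is an assumed definition. The default values are those reported in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChoHyperParams:
    """Hypothetical container; names are illustrative, defaults from the text."""
    n_windows: int = 4           # time windows used for the 1/f estimate
    min_cycles: float = 2.0      # minimum number of cycles for a detection
    num_std: float = 1.0         # auto-correlation peak threshold ("NumSTD")
    merge_overlap: float = 0.75  # bounding-box overlap that triggers a merger


def window_ms(segment_ms, params):
    """Duration of each 1/f estimation window for a given segment length."""
    return segment_ms / params.n_windows


def min_duration_ms(freq_hz, params):
    """Shortest oscillation (in ms) that satisfies the minimum-cycle rule."""
    return 1000.0 * params.min_cycles / freq_hz


def overlap_fraction(a, b):
    """Overlap of two (onset, offset) intervals relative to the shorter one.

    ASSUMPTION: bounding boxes are reduced to their time extent here.
    """
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    shorter = min(a[1] - a[0], b[1] - b[0])
    return inter / shorter if shorter > 0 else 0.0


def should_merge(a, b, params):
    """True if two detections overlap enough to be merged."""
    return overlap_fraction(a, b) >= params.merge_overlap


params = ChoHyperParams()
```

      With these defaults, a 1000 ms segment gives 250 ms estimation windows, and a 10 Hz oscillation must last at least 200 ms to be detected. Raising `merge_overlap` makes `should_merge` return `False` more often, which corresponds to the increased fragmentation described above.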

      (3) Related to the last item, this method appears to lack implementation of statistical inferential techniques for estimating and interpreting auto-correlation and spectral structure. In standard practice, auto-correlation functions and spectral measures can be subjected to statistical inference to establish confidence intervals, often helping to determine the significance of the estimates. Doing so would be useful for expressing the likelihood that an oscillation and its harmonic has the same autocorrelation structure and fundamental frequency, or more robustly identifying harmonic peaks in the presence of spectral noise. Here, the authors appear to use auto-correlation and time-frequency decomposition more as a deterministic tool rather than an inferential one. Overall, an inferential approach would help differentiate between true effects and those that might spuriously occur due to the nature of the data. Ultimately, a more statistically principled approach might estimate harmonic structure in the presence of noise in a unified manner transmitted throughout the methodological steps.

      Response:  We thank the reviewer for sharing this insight on further enhancing our method.  Indeed, CHO does not make use of inferential statistics to estimate and interpret the auto-correlation and underlying spectral structure of the neural oscillation.  Implementing this approach within CHO would require calculating phase-phase coupling across all cross-frequency bands and bounding boxes.  However, as mentioned in the introduction section and Figure 1G-L, phase-phase coupling analysis cannot fully ascertain whether the oscillations are phase-locked and thus are harmonics or, indeed, independent oscillations.  This ambiguity, combined with the exorbitant computational complexity of the entailed permutation test and the requirement to perform the analysis across all cross-frequency bands, channels, and trials, makes phase-phase coupling impracticable for determining the fundamental frequency of neural oscillations in real-time and, thus, for use in closed-loop neuromodulation applications.  Thus, within our study, we prioritized determining the fundamental frequency without considering the structure of harmonics.

      An inferential approach can be implemented by adjusting the significance threshold that selects positive peaks within the auto-correlation of the signal.  Currently, this threshold is set to represent the approximate confidence bounds of the periodicity of the fundamental frequency.  To clarify this issue, we added additional pseudo code and a new subsection, titled “Tradeoffs in adjusting the hyper-parameters that govern the detection in CHO,” in the Methods section.

      In future studies, we will investigate the harmonic structure of neural oscillations based on a large data set.  This exploration will help us understand how non-sinusoidal properties may influence the harmonic structure.  Your input is highly appreciated, and we will diligently incorporate these considerations into our research.

      See Author response table 1.

      A new subsection titled “Tradeoffs in adjusting the hyper-parameters that govern the detection in CHO”.

      (4) As with any signal processing method, hyperparameters and their ability to be tuned by the user need to be clearly acknowledged, as they impact the robustness and reproducibility of the method. Here, some of the hyperparameters appear to be: a) number of cycles around which to construct bounding boxes and b) overlap percentage of bounding boxes for grouping. Any others should be highlighted by the authors and clearly explained during the course of tool dissemination to the community, ideally in tutorial format through the Github repository.

      Response:  We thank the reviewer for this helpful suggestion.  In response, we added a new subsection that describes the hyper-parameters of CHO as follows:

      A new subsection named “Tradeoffs in adjusting the hyper-parameters that govern the detection in CHO”.

      (5) Most of the validation demonstrations in this paper depict the detection capabilities of CHO. For example, the authors demonstrate how to use this tool to reduce false detection of oscillations made up of harmonic activity and show in simulated examples how CHO performs compared to other methods in detection specificity, sensitivity, and accuracy. However, the detection problem is not the same as the 'identity' problem that the paper originally introduced CHO to solve. That is, detecting a non-sinusoidal oscillation well does not help define or characterize its non-sinusoidal 'fingerprint'. An example problem to set up this question is: if there are multiple oscillations at the same base frequency in a dataset, how can their differing harmonic structure be used to distinguish them from each other? To address this at a minimum, Figure 4 (or a followup to it) should simulate signals at similar levels of detectability with different 'identities' (i.e. different levels and/or manifestations of harmonic structure), and evaluate CHO's potential ability to distinguish or cluster them from each other. Then, does a real-world dataset or neuroscientific problem exist in which a similar sort of exercise can be conducted and validated in some way? If the "what" question is to be sufficiently addressed by this tool, then this type of task should be within the scope of its capabilities, and validation within this scenario should be demonstrated in the paper. This is the most fundamental limitation at the paper's current state.

      Response: Thank you for your insightful suggestion; we truly appreciate it. We recognize that the 'identity' problem requires further studies to develop appropriate methods. Our current approach does not fully address this issue, as it may detect asymmetric non-sinusoidal oscillations with multiple harmonic peaks, without accounting for different shapes of non-sinusoidal oscillations.

      The main reason we could not fully address the “identity” problem results from the general absence of a defined ground truth, i.e., data for which we know the harmonic structure. To overcome this barrier, we would need datasets from well-characterized cognitive tasks or neural disorders.  For example, Cole et al. 2017 showed that the harmonic structure of beta oscillations can explain the degree of Parkinson’s disease, and Hu et al. 2023 showed that the number of harmonic peaks can localize the seizure onset zone. Future studies could use the data from these two studies to study whether CHO can distinguish different harmonic structures of pathological neural oscillations.

      In this paper, we showed the basic identity of neural oscillations, encompassing elements such as the fundamental frequency and onset/offset. Your valuable insights contribute significantly to our ongoing efforts, and we appreciate your thoughtful consideration of these aspects. In response, we added a new paragraph in the Limitation of the discussion section as below:

      “Another limitation of this study is that it does not assess the harmonic structure of neural oscillations. Thus, CHO cannot distinguish between oscillations that have the same fundamental frequency but differ in their non-sinusoidal properties.  This limitation stems from the objective of this study, which is to identify the fundamental frequency of non-sinusoidal neural oscillations.  Overcoming this limitation requires further studies to improve CHO to distinguish between different non-sinusoidal properties of pathological neural oscillations.  The data that is necessary for these further studies could be obtained from the wide range of studies that have linked the harmonic structures in the neural oscillations to various cognitive functions (van Dijk et al., 2010; Schalk, 2015; Mazaheri and Jensen, 2008) and neural disorders (Cole et al., 2017; Jackson et al., 2019; Hu et al., 2023). For example, Cole et al. 2017 showed that a harmonic structure of beta oscillations can explain the degree of Parkinson’s disease, and Hu et al. 2023 showed the number of harmonic peaks can localize the seizure onset zone.”

      References:

      Beck AM, He M, Gutierrez R, Purdon PL. An iterative search algorithm to identify oscillatory dynamics in neurophysiological time series. bioRxiv. 2022. p. 2022.10.30.514422. doi:10.1101/2022.10.30.514422

      Brady B, Bardouille T. Periodic/Aperiodic parameterization of transient oscillations (PAPTO)-Implications for healthy ageing. Neuroimage. 2022;251: 118974.

      Fabus MS, Woolrich MW, Warnaby CW, Quinn AJ. Understanding Harmonic Structures Through Instantaneous Frequency. IEEE Open J Signal Process. 2022;3: 320-334.

      Jones SR, Pritchett DL, Sikora MA, Stufflebeam SM, Hämäläinen M, Moore CI. Quantitative analysis and biophysically realistic neural modeling of the MEG mu rhythm: rhythmogenesis and modulation of sensory-evoked responses. J Neurophysiol. 2009;102: 3554-3572.

      He M, Das P, Hotan G, Purdon PL. Switching state-space modeling of neural signal dynamics. PLoS Comput Biol. 2023;19: e1011395.

      Matsuda T, Komaki F. Time Series Decomposition into Oscillation Components and Phase Estimation. Neural Comput. 2017;29: 332-367.

      Quinn AJ, Lopes-Dos-Santos V, Huang N, Liang W-K, Juan C-H, Yeh J-R, et al. Within-cycle instantaneous frequency profiles report oscillatory waveform dynamics. J Neurophysiol. 2021;126: 1190-1208.

      Reviewer #2 (Public Review):

      Summary:

      A new toolbox is presented that builds on previous toolboxes to distinguish between real and spurious oscillatory activity, which can be induced by non-sinusoidal waveshapes. Whilst there are many toolboxes that help to distinguish between 1/f noise and oscillations, not many tools are available that help to distinguish true oscillatory activity from spurious oscillatory activity induced in harmonics of the fundamental frequency by non-sinusoidal waveshapes. The authors present a new algorithm which is based on autocorrelation to separate real from spurious oscillatory activity. The algorithm is extensively validated using synthetic (simulated) data, and various empirical datasets from EEG, intracranial EEG in various locations and domains (i.e. auditory cortex, hippocampus, etc.).

      Strengths:

      Distinguishing real from spurious oscillatory activity due to non-sinusoidal waveshapes is an issue that has plagued the field for quite a long time. The presented toolbox addresses this fundamental problem which will be of great use for the community. The paper is written in a very accessible and clear way so that readers less familiar with the intricacies of Fourier transform and signal processing will also be able to follow it. A particular strength is the broad validation of the toolbox, using synthetic, scalp EEG, ECoG, and stereotactic EEG in various locations and paradigms.

      Weaknesses:

      At many parts in the results section critical statistical comparisons are missing (e.g. FOOOF vs CHO). Another weakness concerns the methods part which only superficially describes the algorithm. Finally, a weakness is that the algorithm seems to be quite conservative in identifying oscillatory activity which may render it only useful for analysing very strong oscillatory signals (i.e. alpha), but less suitable for weaker oscillatory signals (i.e. gamma).

      Response: We thank Reviewer #2 for the assistance in improving this manuscript.  In the revised manuscript, we have added the missing statistical comparisons, detailed pseudo code, and a subsection that explains the hyper-parameters of CHO.  We also recognize the limitations of CHO in detecting gamma oscillations.  While our results demonstrate beta-band oscillations in ECoG and EEG signals (see Figures 5 and 6), we had no expectation to find gamma-band oscillations during a simple reaction time task.  This is because of the general absence of ECoG electrodes over the occipital cortex, where such gamma-band oscillations may be found. 

      Nevertheless, our CHO method should be able to detect gamma-band oscillations.  This is because if there are gamma-band oscillations, they will be reflected as a bump over the 1/f fit in the power spectrum, and CHO will detect them.  We apologize for not specifying the frequency range of the synthetic non-sinusoidal oscillations.  The gamma band was also included in our simulation. We added the frequency range (1-40 Hz) of the synthetic non-sinusoidal oscillations in the subsection, the caption of Figure 4, and the result section.

      Reviewer #1 (Recommendations For The Authors):

      (1) The example of a sinusoidal neural oscillation in Fig 1 seems to still exhibit a great deal of nonsinusoidal behavior. Although it is largely symmetrical, it has significant peak-trough symmetry as well as sharper peak structure than typical sinusoidal activity. Nevertheless, it has less harmonic structure than the example on the left. A more precisely-stated claim might be that non-sinusoidal behavior is not the distinguishing characteristic between the two, but rather the degree of harmonic structure.

      Response: We are grateful for this thoughtful observation. In response, we now recognize that the depicted example showcases pronounced peak-trough symmetry and sharpness, characteristics that might not be typically associated with sinusoidal behavior. We now better understand that the key differentiator between the examples lies not only in their nonsinusoidal behavior but also in their harmonic structure. To reflect this better understanding, we have refined our manuscript to more accurately articulate the differences in harmonic structure, in accordance with your suggestion. Specifically, we revised the caption of Fig 1 in the manuscript as follows:

      The caption of the Fig 1G-L.

      “We applied the same statistical test to a more sinusoidal neural oscillation (G). Since this neural oscillation more closely resembles a sinusoidal shape, it does not exhibit any prominent harmonic peaks in the alpha and beta bands within the power spectrum (H) and time-frequency domain (I).  Consequently, our test found that the phase of the theta-band and beta-band oscillations were not phase-locked (J-L).  Thus, this statistical test suggests the absence of a harmonic structure.”

      (2) The statement "This suggests that most of the beta oscillations detected by conventional methods are simply harmonics of the predominant asymmetric alpha oscillation." is potentially overstated. It is important to constrain this statement to the auditory cortex in which the authors conduct the validation, because true beta still exists elsewhere. The same goes for the beta-gamma claim later on. In general, use of "may be" is also more advisable than the definitive "are".

      Response: We thank the reviewer for this thoughtful feedback. To avoid potentially overstating our findings, we revised our statement on beta oscillations in the manuscript as follows:

      Discussion:

      “This suggests that most of the beta oscillations detected by conventional methods within auditory cortex may be simply harmonics of the predominant asymmetric alpha oscillation.”

      Reviewer #2 (Recommendations For The Authors):

      All my concerns are medium to minor and I list them as they appear in the manuscript. I do not suggest new experiments or a change in the results, instead I focus on writing issues only.

      a) Line 50: A reference to the seminal paper by Klimesch et al (2007) on alpha oscillations and inhibition would seem appropriate here.

      Response: We added the reference to Klimesch et al. (2007).

      b) Figure 4: It is unclear which length for the simulated oscillations was used to generate the data in panels B-G.

      Response: We generated oscillations that were 2.5 cycles in length and 1-3 seconds in duration. We added this information to the manuscript as follows.

      Figure 4:

      “We evaluated CHO by verifying its specificity, sensitivity, and accuracy in detecting the fundamental frequency of non-sinusoidal oscillatory bursts (2.5 cycles, 1–3 seconds long) convolved with 1/f noise.”

      Results (page 5, lines 163-165):

      “To determine the specificity and sensitivity of CHO in detecting neural oscillations, we applied CHO to synthetic non-sinusoidal oscillatory bursts (2.5 cycles, 1–3 seconds long) convolved with 1/f noise, also known as pink noise, which has a power spectral density that is inversely proportional to the frequency of the signal.”

      Methods (page 20, lines 623-626):

      “While empirical physiological signals are most appropriate for validating our method, they generally lack the necessary ground truth to characterize neural oscillation with sinusoidal or non-sinusoidal properties. To overcome this limitation, we first validated CHO on synthetic non-sinusoidal oscillatory bursts (2.5 cycles, 1–3 seconds long) convolved with 1/f noise to test the performance of the proposed method.”
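      A synthetic test signal with known ground truth, of the kind described in the quoted Methods text, could be generated roughly as follows. This is a sketch under stated assumptions and does not reproduce the authors' simulation. In particular, the burst is simply added to the noise (the quoted text describes it as convolved with 1/f noise), the harmonic amplitudes, sampling rate, and scaling factor are arbitrary, and all function names are hypothetical. The noise is approximated by random-phase sinusoids on a 1 Hz grid, so it repeats every second.

```python
import math
import random


def pink_noise(n, fs, rng):
    """Approximate 1/f noise: random-phase sinusoids at 1 Hz spacing with
    amplitude 1/sqrt(f), so that power falls off as 1/f."""
    freqs = list(range(1, fs // 2))
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in freqs]
    return [
        sum(math.cos(2.0 * math.pi * f * i / fs + p) / math.sqrt(f)
            for f, p in zip(freqs, phases))
        for i in range(n)
    ]


def nonsinusoidal_burst(n, fs, f0, onset_s, duration_s,
                        harmonic_amps=(1.0, 0.5, 0.25)):
    """Burst built from a fundamental plus phase-locked harmonics.

    Returns the burst and its ground-truth (onset, offset) sample indices.
    The harmonic amplitudes are ASSUMED values chosen for illustration.
    """
    start = int(round(onset_s * fs))
    stop = min(n, start + int(round(duration_s * fs)))
    burst = [0.0] * n
    for i in range(start, stop):
        t = (i - start) / fs
        burst[i] = sum(a * math.sin(2.0 * math.pi * (k + 1) * f0 * t)
                       for k, a in enumerate(harmonic_amps))
    return burst, (start, stop)


fs = 200                      # hypothetical sampling rate in Hz
n = 3 * fs                    # 3-second trace
rng = random.Random(0)        # fixed seed for reproducibility
noise = pink_noise(n, fs, rng)
burst, (onset, offset) = nonsinusoidal_burst(n, fs, f0=10.0,
                                             onset_s=1.0, duration_s=1.0)
# ASSUMPTION: additive embedding with an arbitrary burst gain of 2.0.
trace = [x + 2.0 * b for x, b in zip(noise, burst)]
```

      Because `onset`, `offset`, and `f0` are known, a detector's output on `trace` can be scored for specificity, sensitivity, and accuracy against them.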

      c) Figure 5 - supplements: Would be good to re-organize the arrangement of the plots on these figures to facilitate the comparison between FOOOF and CHO (i.e. by presenting for each participant FOOOF and CHO together).

      Response: We combined Figure 5-supplementary figures 1 and 2 into Figure 5-supplementary figure 1, Figure 6-supplementary figures 1 and 2 into Figure 6-supplementary figure 1, and Figure 8-supplementary figures 1 and 2 into Figure 8-supplementary figure 1. 

      Author response image 1.

      Figure 5-supplementary figure 1:

      Author response image 2.

      Figure 6-supplementary figure 1:

      Author response image 3.

      Figure 8-supplementary figure 1:

      d) Statistics: Almost throughout the results section where the empirical results are described statistical comparisons are missing. For instance, in lines 212-213 the statement that CHO did not detect low gamma while FOOOF did is not backed up by the appropriate statistics. This issue is also evident in all of the following sections (i.e. EEG results, On-offsets of oscillations, SEEG results, Frequency and duration of oscillations). I feel this is probably the most important point that needs to be addressed.

      Response: We added statistical comparisons to Figures 5 (ECoG) and 6 (EEG) and to the results section as follows.

      Author response image 4.

      Validation of CHO in detecting oscillations in ECoG signals. A. We applied CHO and FOOOF to determine the fundamental frequency of oscillations from ECoG signals recorded during the pre-stimulus period of an auditory reaction time task. FOOOF detected oscillations primarily in the alpha- and beta-band over STG and pre-motor area.  In contrast, CHO also detected alpha-band oscillations primarily within STG, and more focal beta-band oscillations over the pre-motor area, but not STG. B. We investigated the occurrence of each oscillation within defined cerebral regions across eight ECoG subjects. The horizontal bars and horizontal lines represent the median and median absolute deviation (MAD) of oscillations occurring across the eight subjects. An asterisk (*) indicates statistically significant differences in oscillation detection between CHO and FOOOF (Wilcoxon rank-sum test, p<0.05 after Bonferroni correction).

      Author response image 5.

      Validation of CHO in detecting oscillations in EEG signals. A. We applied CHO and FOOOF to determine the fundamental frequency of oscillations from EEG signals recorded during the pre-stimulus period of an auditory reaction time task. FOOOF primarily detected alpha-band oscillations over frontal/visual areas and beta-band oscillations across all areas (with a focus on central areas). In contrast, CHO detected alpha-band oscillations primarily within visual areas and detected more focal beta-band oscillations over the pre-motor area, similar to the ECoG results shown in Figure 5. B. We investigated the occurrence of each oscillation within the EEG signals across seven subjects. An asterisk (*) indicates statistically significant differences in oscillation detection between CHO and FOOOF (Wilcoxon rank-sum test, p<0.05 after Bonferroni correction). CHO exhibited lower entropy values of alpha and beta occurrence than FOOOF across 64 channels. C. We compared the performance of FOOOF and CHO in detecting oscillations across visual and pre-motor-related EEG channels. CHO detected more alpha and beta oscillations in visual cortex than in pre-motor cortex. FOOOF detected more alpha and beta oscillations in visual cortex than in pre-motor cortex.

      We added additional explanations of our statistical results to the “Electrocorticographic (ECoG) results” and “Electroencephalographic (EEG) results” sections.

      “We compared neural oscillation detection rates between CHO and FOOOF across eight ECoG subjects.  We used FreeSurfer to determine the associated cerebral region for each ECoG location. Each subject performed approximately 400 trials of a simple auditory reaction-time task.  We analyzed the neural oscillations during the 1.5-second-long pre-stimulus period within each trial. CHO and FOOOF demonstrated statistically comparable results in the theta and alpha bands despite CHO exhibiting smaller median occurrence rates than FOOOF across eight subjects. Notably, within the beta band, excluding specific regions such as precentral, pars opercularis, and caudal middle frontal areas, CHO's beta oscillation detection rate was significantly lower than that of FOOOF (Wilcoxon rank-sum test, p < 0.05 after Bonferroni correction). This suggests comparable detection rates between CHO and FOOOF in premotor and Broca's areas, while the detection of beta oscillations by FOOOF in other regions, such as the temporal area, may represent harmonics of theta or alpha, as illustrated in Figure 5A and B. Furthermore, FOOOF exhibited a higher sensitivity in detecting delta, theta, and low gamma oscillations overall, although both CHO and FOOOF detected only a limited number of oscillations in these frequency bands.”
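      The comparison described above (a Wilcoxon rank-sum test per region, followed by Bonferroni correction) can be illustrated with a minimal sketch. It assumes the normal approximation to the rank-sum statistic without tie or continuity correction; the function names are hypothetical, and this is not the authors' analysis code.

```python
import math


def rank_data(values):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns the z statistic and the two-sided p-value.
    """
    n1, n2 = len(x), len(y)
    ranks = rank_data(list(x) + list(y))
    w = sum(ranks[:n1])                      # rank sum of the first sample
    mean_w = n1 * (n1 + n2 + 1) / 2          # expected rank sum under H0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))     # two-sided tail probability
    return z, p


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

      In such a scheme, each cerebral region would contribute one p-value (per-subject CHO rates versus per-subject FOOOF rates), and a region would be marked with an asterisk only if its corrected p-value stays below 0.05.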

      “We assessed the difference in neural oscillation detection performance between CHO and FOOOF across seven EEG subjects.  We used EEG electrode locations according to the 10-10 electrode system and assigned each electrode to the appropriate underlying cortex (e.g., O1 and O2 for the visual cortex). Each subject performed 200 trials of a simple auditory reaction-time task.  We analyzed the neural oscillations during the 1.5-second-long pre-stimulus period. In the alpha band, CHO and FOOOF presented statistically comparable outcomes. However, CHO exhibited a greater alpha detection rate for the visual cortex than for the pre-motor cortex, as shown in Figures 6B and C. The entropy of CHO's alpha oscillation occurrences (3.82) was lower than that of FOOOF (4.15), with a maximal entropy across 64 electrodes of 4.16. Furthermore, in the beta band, CHO's entropy (4.05) was smaller than that of FOOOF (4.15). These findings suggest that CHO may offer a more region-specific oscillation detection than FOOOF.

      As illustrated in Figure 6C, CHO found fewer alpha oscillations in pre-motor cortex (FC2 and FC4) than in occipital cortex (O1 and O2), while FOOOF found more beta oscillation occurrences in pre-motor cortex (FC2 and FC4) than in occipital cortex. However, FOOOF found more alpha and beta oscillations in visual cortex than in pre-motor cortex.

      Consistent with ECoG results, FOOOF demonstrated heightened sensitivity in detecting delta, theta, and low gamma oscillations. 

      Nonetheless, both CHO and FOOOF identified only a limited number of oscillations in delta and theta frequency bands.

      Contrary to the ECoG results, FOOOF found more low gamma oscillations in EEG subjects than in ECoG subjects.”
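      The entropy values quoted above can be made concrete with a short sketch. The reported maximum of 4.16 across 64 electrodes matches ln(64) ≈ 4.159, so the sketch assumes Shannon entropy in natural-log units computed over the normalized occurrence counts per channel; that reading is an inference from the numbers, and the example counts are hypothetical.

```python
import math


def occurrence_entropy(counts):
    """Shannon entropy (in nats) of oscillation occurrences across channels.

    counts: non-negative occurrence counts, one per channel.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one occurrence is required")
    entropy = 0.0
    for count in counts:
        if count > 0:                 # channels with no occurrences add nothing
            p = count / total
            entropy -= p * math.log(p)
    return entropy


# Occurrences spread evenly over 64 channels give the maximal entropy ln(64).
uniform_counts = [1] * 64
# Hypothetical focal pattern: 8 channels carry most of the occurrences.
focal_counts = [10] * 8 + [1] * 56
```

      Under this definition, a detector whose occurrences concentrate on fewer channels yields a lower entropy, which is the sense in which a lower value indicates more region-specific detection.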

      e) Line 248: The authors find an oscillatory signal in the hippocampus with a frequency at around 8 Hz, which they refer to as alpha. However, several researchers (including myself) may label this fast theta, according to the previous work showing the presence of fast and slow theta oscillations in the human hippocampus (https://pubmed.ncbi.nlm.nih.gov/21538660/, https://pubmed.ncbi.nlm.nih.gov/32424312/).

      Response: We replaced “alpha” with “fast theta” in the figure and text. We added a citation for Lega et al. 2012.

      f) Line 332: It could also be possible that the auditory alpha rhythms don’t show up in the EEG because a referencing method was used that was not ideal for picking it up. In general, re-referencing is an important preprocessing step that can make the EEG be more susceptible to deep or superficial sources and that should be taken into account when interpreting the data.

      Response: We re-referenced our signals using a common median reference (see Methods section). After close inspection of our results, we found that the EEG topography shown in Figure 6 did not show the auditory alpha oscillation because the alpha power of visual locations greatly exceeded that of those locations that reflect oscillations in the auditory cortex. Further, while our statistical analysis shows that CHO detected auditory alpha oscillations, this analysis also shows that CHO detected significantly more visual alpha oscillations.
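      For readers unfamiliar with the re-referencing step mentioned here, a common median reference can be sketched as follows. The sketch assumes the reference is the median across all channels at each time sample; the data layout and function name are hypothetical and do not reproduce the authors' pipeline.

```python
from statistics import median


def common_median_reference(samples):
    """Subtract the across-channel median from every channel at each time point.

    samples: list of time points, each a list of channel values.
    Returns a new list with the same shape.
    """
    referenced = []
    for frame in samples:
        reference = median(frame)     # one reference value per time point
        referenced.append([value - reference for value in frame])
    return referenced
```

      Because the median is subtracted at each sample, an offset shared by all channels is removed, while the median is less affected by a few channels with very large amplitude than a common average would be.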

      g) Line 463: It seems that the major limitation of the algorithm lies in its low sensitivity which is discussed by the authors. The authors seem to downplay this a bit by saying that the algorithm works just fine at SNRs that are comparable to alpha oscillations. However, alpha is the strongest signal in human EEG which may make the algorithm less suitable for picking up less prominent oscillatory signals, i.e. gamma, theta, ripples, etc. Is CHO only seeing the ‘tip of the iceberg’?

      Response: We performed the suggested analysis. For the theta band, this analysis generated convincing statistical results for ECoG signals (Figures 5, 6, and the results section). For theta oscillation detection, we found no statistical difference between CHO and FOOOF. Since FOOOF has a high sensitivity even under low SNRs (as shown in our simulation), our analysis suggests that CHO and FOOOF should perform equally well in the detection of theta oscillations, even when the theta oscillation amplitude is small.

      To validate the ability of CHO to detect oscillations in high-frequency bands (> 40 Hz), such as gamma oscillations and ripples, our follow-up study is applying CHO in the detection of high-frequency oscillations (HFOs) in electrocorticographic signals recorded during seizures. To this end, our follow-up study analyzed 26 seizures from six patients. In this analysis, CHO showed similar sensitivity and specificity as the epileptogenicity index (EI), which is the most commonly used method to detect seizure onset times and zones. The results of this follow-up study were presented at the American Epilepsy Society Meeting in December of 2023, and we are currently preparing a manuscript for submission to a peer-reviewed journal.

      In this study, we want to investigate the performance of CHO in detecting the most prominent neural oscillations (e.g., alpha and beta). Future studies will investigate the performance of CHO in detecting more difficult-to-observe oscillations (delta in sleep stages, theta in the hippocampus during memory tasks, and high-frequency oscillations or ripples in seizure or interictal data).

      h) Methods: The methods section, especially the one describing the CHO algorithm, is lacking a lot of detail that one usually would like to see in order to rebuild the algorithm themselves. I appreciate that the code is available freely, but that does not, in my opinion, relieve the authors of their duty to describe in detail how the algorithm works. This should be fixed before publishing.

      Response: We now present pseudo code to describe the algorithms within the new subsection on the hyper-parameterization of CHO.

      See Author response table 1.

      A new subsection titled “Tradeoffs in adjusting the hyper-parameters that govern the detection in CHO.”

      “The ability of CHO to detect neural oscillations and determine their fundamental frequency is governed by four principal hyper-parameters.  Adjusting these parameters requires understanding their effect on the sensitivity and specificity in the detection of neural oscillations. 

      The first hyper-parameter is the number of time windows (N in Line 5 in Algorithm 1), that is used to estimate the 1/f noise.  In our performance assessment of CHO, we used four time windows, resulting in estimation periods of 250 ms in duration for each 1/f spectrum.  A higher number of time windows results in smaller estimation periods and thus minimizes the likelihood of observing multiple neural oscillations within this time window, which otherwise could confound the 1/f estimation.  However, a higher number of time windows and, thus, smaller time estimation periods may lead to unstable 1/f estimates. 

      The second hyper-parameter defines the minimum number of cycles of a neural oscillation to be detected by CHO (see Line 23 in Algorithm 1).  In our study, we specified this parameter to be two cycles.  Increasing the number of cycles increases specificity, as it will reject spurious oscillations.  However, increasing the number also decreases sensitivity, as it will reject short oscillations.

      The third hyper-parameter is the significance threshold that selects positive peaks within the auto-correlation of the signal.  The magnitude of the peaks in the auto-correlation indicates the periodicity of the oscillations (see Line 26 in Algorithm 1).  Referred to as "NumSTD," this parameter denotes the number of standard errors that a positive peak has to exceed to be selected to be a true oscillation.  For this study, we set the "NumSTD" value to 1 (the approximate 68% confidence bounds).  Increasing the "NumSTD" value increases specificity in the detection as it reduces the detection of spurious peaks in the auto-correlation.  However, increasing the "NumSTD" value also decreases the sensitivity in the detection of neural oscillations with varying instantaneous oscillatory frequencies. 

      The fourth hyper-parameter is the percentage of overlap between two bounding boxes that trigger their merger (see Line 31 in Algorithm 1).  In our study, we set this parameter to 75% overlap.  Increasing this threshold yields more fragmentation in the detection of oscillations, while decreasing this threshold may reduce the accuracy in determining the onset and offset of neural oscillations.”
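      Two of the hyper-parameters described above, the minimum number of cycles and the overlap threshold for merging bounding boxes, can be illustrated with a minimal sketch. The `Box` layout, the use of the box's center frequency to count cycles, and the definition of overlap as intersection area divided by the smaller box's area are all assumptions made for illustration; Algorithm 1 in the manuscript remains the authoritative description.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """A detected oscillation as a time-frequency bounding box (hypothetical layout)."""
    t_start: float  # seconds
    t_end: float    # seconds
    f_low: float    # Hz
    f_high: float   # Hz

    @property
    def area(self):
        return (self.t_end - self.t_start) * (self.f_high - self.f_low)

    @property
    def center_frequency(self):
        return (self.f_low + self.f_high) / 2


def has_enough_cycles(box, min_cycles=2.0):
    """True if the box spans at least min_cycles periods of its center frequency."""
    duration = box.t_end - box.t_start
    return duration * box.center_frequency >= min_cycles


def overlap_fraction(a, b):
    """Intersection area divided by the smaller box's area (assumed definition)."""
    dt = min(a.t_end, b.t_end) - max(a.t_start, b.t_start)
    df = min(a.f_high, b.f_high) - max(a.f_low, b.f_low)
    if dt <= 0 or df <= 0:
        return 0.0                    # boxes do not intersect
    return (dt * df) / min(a.area, b.area)


def merge_if_overlapping(a, b, threshold=0.75):
    """Return the union box if the overlap reaches the threshold, else None."""
    if overlap_fraction(a, b) < threshold:
        return None
    return Box(min(a.t_start, b.t_start), max(a.t_end, b.t_end),
               min(a.f_low, b.f_low), max(a.f_high, b.f_high))
```

      With a two-cycle minimum, a 10 Hz oscillation must last at least 200 ms to be kept, which shows why raising the minimum rejects short bursts. Raising the merge threshold leaves more boxes unmerged, which corresponds to the fragmentation described above.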

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In this study, the authors investigate the tolerance of aminoglycosides in E. coli mutants deleted in the Krebs cycle and respiratory chain enzymes. The motivation for this study is unclear. Transport of aminoglycosides is pmf-dependent, as the authors correctly note, and knocking out energy-producing components leads to tolerance of aminoglycosides, this has been well established. In S. aureus, clinically relevant "small colony" strains selected for in the course of therapy with aminoglycosides acquire null mutations in the biosynthesis of heme or ubiquinone, and have been studied in detail. In E. coli, such knockouts have not been reported in clinical isolates, probably due to severe fitness costs.

      Response: We sincerely appreciate the time and consideration the reviewer dedicated to evaluating our manuscript. It's important to highlight that while the transport of aminoglycosides is PMF-dependent, recent studies underscore the potential role of metabolic mutations in antibiotic tolerance, a facet that warrants further investigation. For instance, the study by Heinemann’s and Michiels' groups explored genomic changes in E. coli strains (including uropathogenic UTI89 strains) subjected to daily antibiotic exposure (Van den Bergh et al., 2022). Notably, mutations predominantly occurred in genes of the nuo operon, a key component of E. coli energy metabolism, suggesting a link between metabolic adaptations and antibiotic tolerance. Furthermore, the research by Collins' group revealed previously unrecognized genes related to central metabolism (e.g., icd, gltD, sucA) that contribute to antibiotic resistance in E. coli cells exposed to multiple antibiotics, including aminoglycosides (Lopatkin et al., 2021). These findings are corroborated by the presence of similar mutations in clinical E. coli pathogens, as evidenced by the analysis of a large library of 7243 E. coli genomes from NCBI Pathogen Detection (Lopatkin et al., 2021). The clinical relevance of metabolic mutations in antibiotic tolerance is increasingly recognized, yet their underlying mechanisms remain enigmatic. Therefore, elucidating the role of metabolic pathways in conferring antibiotic tolerance is highly critical. We have updated the introduction to clearly convey our motivation in this study (see page 4).

      At the same time, single-cell analysis has shown that individual cells with a decrease in the expression of Krebs cycle enzymes are tolerant of antibiotics and have lower ATP (Manuse et al., PLoS Biol 19: e3001194). The authors of the study under review report that knocking out ICD, isocitrate dehydrogenase that catalyzes the rate-limiting step in the Krebs cycle, has little effect on aminoglycoside tolerance and actually leads to an increase in the level of ATP over time. This observation does not seem to make much sense and contradicts previous reports, specifically that E. coli ICD is tolerant of antibiotics and, not surprisingly, produces less ATP (Kabir and Shimizu, Appl Microbiol Biotechnol. 2004; 65(1):84-96; Manuse et al., PLoS Biol 19: e3001194). Mutations in other Krebs cycle enzymes, unlike ICD, do lead to a dramatic increase in tolerance of aminoglycosides according to the paper under review. This is all very confusing.

      Response: Although our data cannot be directly compared to that of Kabir and Shimizu (Mohiuddin Kabir and Shimizu, 2004), due to the utilization of entirely different experimental procedures and measurement techniques, we can draw some parallels to the study conducted by Lewis’ group (Manuse et al., 2021), despite certain differences in experimental protocols. Furthermore, the reviewer has made strong assertions regarding our manuscript based on the findings of Lewis’ group. Thus, we believe it's pertinent to expand our response regarding that study.

      In the study of Lewis’ group, bacterial cells were inoculated at a ratio of 1:100 into LB medium from an overnight culture (approximately 16 hours). Subsequently, the cultures were incubated at 37°C for approximately 2 hours, and ATP levels were measured using the BacTiter Glo kit (Promega, Madison, WI, USA). ATP levels were then normalized to cell density, determined through optical density measurements, and represented on a linear diagram. As demonstrated in Supplementary Figure S1c of their paper, there was a 10-15% reduction in normalized ATP levels in the icd mutant compared to the wild type. In our experiments, cells were grown for 24 hours in overnight cultures, diluted 100-fold in fresh media, and ATP levels were measured at 3, 4, 5, and 6 hours using the same kit. ATP levels were normalized to cell counts quantified by flow cytometry. Upon analyzing our data of the icd mutant for around 3 hours (the time point closest to that of the study of Lewis’ group), we observed a reduction of approximately 15-20% (without statistical significance) in the icd mutant compared to the wild-type (see raw data, linear plot, and logarithmic plot below; Author response image 1), which aligns with the findings of Lewis’ group.
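      The normalization described above (ATP signal divided by cell count, then expressed relative to the wild type) amounts to simple arithmetic, sketched below. The function names are hypothetical, and any numbers passed to them are placeholders rather than measurements from either study.

```python
def normalize_atp(luminescence, cell_count):
    """ATP signal per cell: raw luminescence divided by the number of cells."""
    if cell_count <= 0:
        raise ValueError("cell_count must be positive")
    return luminescence / cell_count


def percent_reduction(wild_type, mutant):
    """Reduction of the mutant's normalized ATP relative to the wild type, in percent."""
    if wild_type <= 0:
        raise ValueError("wild_type must be positive")
    return 100.0 * (wild_type - mutant) / wild_type
```

      Note that the two studies normalize by different quantities (optical density versus flow-cytometry cell counts), so relative reductions computed this way are comparable only to the extent that both measures track cell number equally well.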

      We further investigated the gentamicin tolerance of both wild-type and icd mutant strains of E. coli BW25113 (Author response image 2). Our findings indicate that the increased sensitivity of the icd mutant of the MG1655 strain to gentamicin is similar to the observation in the other E. coli strain.

      Author response image 1.

      ATP levels in the icd mutant. ATP levels of both the mutant and wild-type strains were measured at t=3 hours of cell growth and normalized to cell counts. The figure presents the raw data (a), linear plot (b), and logarithmic plot (c) of the same dataset. This data corresponds to the first panel of Figure 3B in the manuscript.

      Author response image 2.

      Gentamicin tolerance of wild-type and icd mutant strains of E. coli BW25113. Both wild type and mutant strains were treated with gentamicin (50 µg/ml) for 5 hours at the mid-exponential phase. Cells were plated before and after treatment for CFU/ml counts. The dashed line represents the limit of detection. CFU: Colony forming units.

      We think that there are two primary reasons why our study cannot contradict the findings of the Lewis group:

      Firstly, our study cannot be directly compared to theirs, as they did not comprehensively explore the impact of gene deletions on cell metabolism beyond the measurement of ATP levels at a single time point (Manuse et al., 2021). Our study encompasses various metabolic parameters such as cellular ATP, redox status, proton motive force (PMF), intracellular pH, and drug uptake throughout the exponential and/or early stationary phase. Additionally, we conducted proteomic analysis for five different strains including mutants and wild type. Moreover, we performed pathway enrichment analysis grounded in the statistical background of the entire genome, encompassing various functional pathway classification frameworks such as Gene Ontology annotations, KEGG pathways, and Uniprot keywords. The results of these pathway enrichment analyses are now available in the Supplementary File (see Supplementary Tables 11-17 in the current manuscript). Thus, we believe it is unjust to deem our study contradictory compared to the Lewis group's study, which does not have a comprehensive analysis of the metabolism of the mutant strains they investigated.

      Secondly, our study cannot be compared to that specific study (Manuse et al., 2021) due to the utilization of a distinct antibiotic (ciprofloxacin). Cell tolerance is heavily reliant on the mechanism of action of the antibiotic used. Therefore, the reviewer should have focused on studies closely related to aminoglycoside tolerance. Our study is not confusing or contradictory, as Lewis’ group also demonstrated that the tolerance of the icd mutant to gentamicin was significantly reduced while the tolerance of other TCA cycle mutant strains was increased in a different study (Shan et al., 2015). However, they did not delve into the metabolism of these mutant strains, as we did. We now mention this point in our manuscript (see pages 14-15).

      Apart from the confusing data, it is not clear what useful information may be obtained from the choice of the experimental system. The authors examine exponentially growing cells of E. coli for tolerance of aminoglycosides. The population at this stage of growth is highly susceptible to aminoglycosides, and only some rare persister cells can survive. However, the authors do not study persisters. A stationary population of E. coli is tolerant of aminoglycosides, and this is clinically relevant, but this is not the subject of the study.

      Response: Respectfully, we must express our disagreement with the reviewer's comments. Our experimental system is meticulously organized and logically structured. Mutant strains such as gltA, sucA, and nuoI deletions exhibit increased tolerance to all aminoglycosides tested, with their fractions clearly increasing around the mid-exponential phase between 3-4 hours (refer to Figure 2B in our manuscript). This surge in tolerance is evident at the population level as well (as depicted in Figure 1A in our manuscript, where certain mutant strains demonstrate complete survival to streptomycin, with survival fractions nearing 1). Given the pronounced increase observed around the mid-exponential phase, we primarily characterize the metabolism of these cells during this growth phase.

      It's essential to note that any investigation into antibiotic tolerance and/or resistance holds immense significance, regardless of the growth phase under scrutiny, as antibiotic tolerance/resistance poses a substantial healthcare challenge. Additionally, metabolic mutant strains do not necessarily entail severe fitness costs, as evidenced by Figure S2A published by the Lewis group (Manuse et al., 2021), a finding consistent with our study (see Figure 2B in our manuscript). This phenomenon could confer a survival advantage to bacterial cells, as they may acquire metabolic mutations to bolster their tolerance without incurring significant fitness costs. Furthermore, numerous studies suggest that bacterial cells may opt for the evolutionary pathway leading to increased tolerance before acquiring resistance mechanisms (Levin-Reisman et al., 2017; Santi et al., 2021). The presence of metabolic mutations in clinical E. coli pathogens has also been confirmed through the analysis of a large library of 7243 E. coli genomes from NCBI Pathogen Detection by Collins’ group (Lopatkin et al., 2021). Consequently, comprehending the tolerance mechanisms of metabolic mutations holds paramount importance.

      References

      Levin-Reisman I, Ronin I, Gefen O, Braniss I, Shoresh N, Balaban NQ. 2017. Antibiotic tolerance facilitates the evolution of resistance. Science (1979) 355:826–830. doi:10.1126/science.aaj2191

      Lopatkin AJ, Bening SC, Manson AL, Stokes JM, Kohanski MA, Badran AH, Earl AM, Cheney NJ, Yang JH, Collins JJ. 2021. Clinically relevant mutations in core metabolic genes confer antibiotic resistance. Science (1979) 371. doi:10.1126/science.aba0862

      Manuse S, Shan Y, Canas-Duarte SJ, Bakshi S, Sun WS, Mori H, Paulsson J, Lewis K. 2021. Bacterial persisters are a stochastically formed subpopulation of low-energy cells. PLoS Biol 19. doi:10.1371/journal.pbio.3001194

      Mohiuddin Kabir M, Shimizu K. 2004. Metabolic regulation analysis of icd-gene knockout Escherichia coli based on 2D electrophoresis with MALDI-TOF mass spectrometry and enzyme activity measurements. Appl Microbiol Biotechnol 65:84–96. doi:10.1007/s00253-004-1627-1

      Santi I, Manfredi P, Maffei E, Egli A, Jenal U. 2021. Evolution of Antibiotic Tolerance Shapes Resistance Development in Chronic Pseudomonas aeruginosa Infections. doi:10.1128/mBio.03482-20

      Shan Y, Lazinski D, Rowe S, Camilli A, Lewis K. 2015. Genetic basis of persister tolerance to aminoglycosides in Escherichia coli. mBio 6. doi:10.1128/mBio.00078-15

      Van den Bergh B, Schramke H, Michiels JE, Kimkes TEP, Radzikowski JL, Schimpf J, Vedelaar SR, Burschel S, Dewachter L, Lončar N, Schmidt A, Meijer T, Fauvart M, Friedrich T, Michiels J, Heinemann M. 2022. Mutations in respiratory complex I promote antibiotic persistence through alterations in intracellular acidity and protein synthesis. Nat Commun 13:546. doi:10.1038/s41467-022-28141-x

      Reviewer #2 (Public Review):

      Summary:

      This interesting study challenges a dogma regarding the link between bacterial metabolism decrease and tolerance to aminoglycosides (AG). The authors demonstrate that mutants well-known for being tolerant to AG, such as those of complexes I and II, are not so due to a decrease in the proton motive force (PMF) and thus antibiotic uptake, as previously reported in the literature.

      Strengths:

      This is a complete study. These results are surprising and are based on various read-outs, such as ATP levels, pH measurement, membrane potential, and the uptake of fluorophore-labeled gentamicin. Utilizing a proteomic approach, the authors show instead that in tolerant mutants, there is a decrease in the levels of proteins associated with ribosomes (targets of AG), causing tolerance.

      Response: We sincerely appreciate the reviewer for taking the time to read our manuscript and offer valuable suggestions.

      Weaknesses:

      The use of a single high concentration of aminoglycoside: my main comment on this study concerns the use of an AG concentration well above the MIC (50 µg/ml or 25 µg/ml for uptake experiments), which is 10 times higher than previously used concentrations (Kohanski, Taber) in studies showing a link with PMF. This significant difference may explain the discrepancies in results. Indeed, a high concentration of AG can mask the effects of a metabolic disruption and lead to less specific uptake. However, this concentration highlights a second molecular level of tolerance. Adding experiments using lower concentrations (we propose 5 µg/ml to compare with the literature) would provide a more comprehensive understanding of AG tolerance mechanisms during a decrease in metabolism.

      Another suggestion would be to test iron limitation (using an iron chelator as DIP), which has been shown to induce AG tolerance. Can the authors demonstrate if this iron limitation leads to a decrease in ribosomal proteins? This experiment would validate their hypothesis in the case of a positive result. Otherwise, it would help distinguish two types of molecular mechanisms for AG tolerance during a metabolic disruption: (i) PMF and uptake at low concentrations, (ii) ribosomal proteins at high concentrations.

      Response: While we acknowledge the intriguing possibility of exploring whether iron limitation results in a reduction of ribosomal proteins, we believe that this topic falls slightly outside the scope of our current study. This area warrants independent investigation since our current research did not specifically focus on iron-limited environments (LB medium is iron-rich, as referenced (Abdul-tehrani et al., 1999; Rodríguez-Rojas et al., 2015)). However, we fully concur with the notion that experimental outcomes may be contingent upon the concentration of aminoglycosides (AG). Hence, we repeated the critical experiments using a lower concentration of gentamicin (5 µg/mL), as suggested by the reviewer. Before delving into a discussion of these results, we wish to emphasize two key points. Firstly, the majority of our metabolic measurements, including ATP levels, redox activities, intracellular pH, and metabolomics, were conducted in mutant and wild-type cells in the absence of drugs. Our objective was to elucidate the impact of genetic perturbations of the TCA cycle on cell metabolism. Secondly, it's important to emphasize that our study does not invalidate the hypothesis that AG uptake is proton motive force (PMF)-dependent. We observed similar drug uptake across the strains tested, which is reasonable considering that their energy metabolism and PMF are not significantly altered compared to the wild type (at least we did not observe a consistent trend in their metabolic levels). Consequently, our study does not necessarily contradict previous claims (Taber Harry W et al., 1987). We have now clarified this point in the manuscript (see pages 1 and 13).

      When we employed a lower gentamicin concentration, we still noted a significant elevation in tolerance among the gltA, sucA, and nuoI mutant strains compared to the wild type. Also, it remained evident that the observed tolerance in the mutant strains cannot be ascribed to differences in drug uptake or impaired PMF, as the levels of drug uptake and the disruption of PMF by gentamicin (at lower concentrations) in the mutant strains were comparable to those of the wild type. Moreover, since our metabolic measurements and proteomics analyses failed to reveal any notable alterations in energy metabolism in these strains, the consistency in drug uptake levels across both mutant and wild-type strains, even at lower concentrations, further bolsters the validity of our findings obtained at higher gentamicin concentrations. The new results have been incorporated into the Supplementary file (see Supplementary Figures S1, S5, S7, and S9) and discussed throughout the manuscript.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      Line 120: Luria-Bertani (LB), use Lysogeny Broth.

      Line 180: "RSG dye can be reduced by bacterial reductases of PMF" to be reformulated.

      Response: The suggested corrections have been incorporated into the manuscript.

      References

      Abdul-tehrani H, Hudson AJ, Chang Y, Timms AR, Hawkins C, Williams JM, Harrison PM, Guest JR, Andrews SC. 1999. Ferritin Mutants of Escherichia coli Are Iron Deficient and Growth Impaired, and fur Mutants are Iron Deficient, Journal of Bacteriology.

      Rodríguez-Rojas A, Makarova O, Müller U, Rolff J. 2015. Cationic Peptides Facilitate Iron-induced Mutagenesis in Bacteria. PLoS Genet 11. doi:10.1371/journal.pgen.1005546

      Taber Harry W, Mueller JP, Miller PF, Arrow AS. 1987. Bacterial Uptake of Aminoglycoside Antibiotics. Microbiol Rev 51:439–457. doi:10.1128/mr.51.4.439-457.1987

    1. eLife assessment

      In this study, the authors found that a species of aphid that is a known agricultural pest salivated longer and produced more honeydew when feeding at night. The authors identified aphid genes with diurnal expression patterns, including potential saliva-related genes. Silencing these genes reduced aphid performance only on real plants, suggesting a specific role in plant feeding. While this study is valuable for understanding plant-insect interactions in agriculture, it is currently incomplete, as further research is needed to elucidate the function of the identified genes.

    2. Reviewer #1 (Public Review):

      Summary:

      This study presents valuable data on diurnal patterns in aphid (Rhopalosiphum padi) feeding behavior and transcriptome profiles. The authors measured honeydew production by the aphids on plants and artificial diet during the day and night and conducted a comprehensive feeding behavior study using EPG with many biological replicates at 6 time-points in 24 hours. They also conducted transcriptome analyses of three samples of 30 aphids each at these time points. Differentially expressed transcripts were grouped into four clusters with distinct expression patterns. The expression of two genes found to be diurnally rhythmic was knocked down with RNAi and these aphids did less well, especially at night. They also analyzed the differential expression of candidate effector genes and found rhythmic ones to be enriched for more expression in aphid heads versus bodies - this pattern is expected given that effectors are most likely expressed in the salivary glands. Knockdown of a known effector (C002) that is diurnally rhythmic, and a novel effector gene, was found to alter aphid feeding dynamics and performance.

      Strengths:

      The manuscript was highly accessible, with clear writing, and the figures provided were both comprehensive and of good quality. The datasets generated from this research are valuable to the research field, especially the findings for honeydew secretion, EPG analysis, and transcriptome experiments.

      The datasets generated in this study will be useful to scientists working on aphids and aphid-plant interactions and will inform similar studies on other insect species.

      Weaknesses:

      The weaknesses mainly relate to the (depth of) analyses and interpretation of the data. Also, some methods require more explanation, as follows:

      In Figure 1, data show that aphids produce more honeydew at night than during the day. This suggests that the aphids ingest more phloem (E2 phase). However, in Figure 1d the duration of the E2 phase does not show obvious differences among the time points in the 24 hours. The authors offer the explanation that the aphids may osmoregulate more during the night, leading to more honeydew secretion at night. This may be the case, but there could be other explanations. For example, plant physiology, including the regulation of water transport, is known to change between night and day. The authors may focus this section more on the differences in the E1 phase, as this involves the delivery of aphid saliva and effectors into the plant phloem.

      Transcriptome data shown in Figure 2 (and the experimental procedure of Figure 5b) appears to be based on three biological replicates. However, these replicates appear to have been harvested at the same time in the experiment, and this makes them technical replicates, not biological replicates. The inclusion of true biological replicates that include samples from time series experiments done on different days should be considered.

      The authors conducted knockdown experiments targeting aquaporin 1 and gut sucrase 1 in aphids, resulting in reduced nymph production and decreased honeydew secretion. It is concluded that these results indicate significant roles of aquaporin 1 and gut sucrase 1 in diurnal regulation. However, it is essential to consider that these genes likely play crucial roles in aphid physiology beyond diurnal rhythms. Consequently, reduced expression would naturally impair aphid performance. The dsAQP1 and dsSUC1 aphids consistently produced less honeydew, regardless of the time of day, indicating a broader impact of gene knockdown. The observed increase of the phenotype at night may not be attributable to the specific roles of these genes in diurnal regulation but rather due to heightened aphid activity during that time (as evidenced by increased honeydew secretion) that could magnify the impact of the knockdown effect, making it easier to observe. Therefore, the knockdown of aquaporin 1 and gut sucrase 1 may exert a general negative influence on aphid fitness, independently of diurnal factors.

      To analyze the roles of genes in diurnal regulation, additional controls should be incorporated. This could involve the knockdown of genes with essential functions that are not influenced by diurnal rhythms, providing a baseline comparison. Furthermore, consider including genes known to be involved in diurnal regulation in other insects, as documented in the existing literature, in the experimental design.

      The same arguments as for aquaporin 1 and gut sucrase 1 above may be made for knockdown of effector genes (Figure 4). It has already been shown that knockdown of C002 impacts aphid performance, and the data herein may be explained by a general lower performance of aphids rather than a specific function of these effectors in diurnal regulation. It is also expected that knockdown of the effectors has less impact on aphids feeding from artificial diets. This does not necessarily indicate the role of the effectors in diurnal regulation.

      In the abstract and elsewhere, the authors assert priority by stating, "...the first evidence of...". However, it's important to note that priority claims are often challenging to verify across many fields. Instead of relying solely on claims of precedence, the evidence presented in the research could stand on its own merit.

      Conclusion:

      The study presents intriguing new findings, particularly in the realms of honeydew analysis, EPG, and transcriptome analysis. However, the interpretation of subsequent studies employing gene knockdowns needs further consideration.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors conducted a time-course of whole-body transcriptional analysis of a pest aphid, Rhopalosiphum padi, and identified four major clusters of the genes that show diurnal rhythmicity in transcription. In addition, they conducted the analysis of aphid feeding behaviour and showed that aphids salivate longer from the end of the day toward the beginning of the night while their phloem feeding time does not change throughout the day. The genes up-regulated at night time were enriched with the genes involved in metabolic activities, corroborating the results showing a higher number of honeydew excretions at night. The authors identified the list of candidate salivary genes that show diurnal rhythmicity in the transcription and silenced a salivary gene C002 and the candidate salivary gene E8696. Silencing of these genes reduced aphid fecundity and survival rate on the host plant but not on the artificial diet.

      Strengths:

      The time-course transcription study and its analysis will be of interest to researchers studying diurnal rhythms in insect biology. Also, the analysis of aphid feeding behaviour at different times of day is interesting. This study provides valuable resources for those who study insect biology.

      Weaknesses:

      It is not clear to me which data was used to define the putative salivary effectors for R. padi, but the candidate salivary gene list made by Thorpe et al consists of the aphid genes encoding secreted proteins that are up-regulated in the head samples compared to the body samples. Although some proteins were confirmed to be secreted into the aphid saliva, many genes in the list are not confirmed to be expressed in the aphid salivary glands, and their products are not confirmed to be secreted into the saliva and the plant. Is E8696 expressed in the aphid salivary glands and secreted into its host plant? Without the data confirming the expression of the gene in the salivary glands and its secretion into the saliva and into the host plant, we cannot call the protein a salivary protein. Furthermore, without the observation that E8696 has some effect on plant biology, we cannot call it an aphid effector. Therefore, I cannot agree with the parts of the manuscript that refer to E8696 as an aphid salivary effector.

      It is interesting to know that some candidate salivary gene expression showed a diurnal rhythm. However, without the knowledge of the functions of the salivary effectors, especially their targets, it is not possible to conclude that the rhythmical expression is important for the aphid performance. In addition, I wonder whether the increase in gene expression is directly correlated with the increase of protein secretion into the saliva and the plant.

      Finally, the authors examined aphid survival, fecundity, and feeding behaviour. Those are important for overall aphid performance, but they do not "shape" aphid colonization. Aphid colonisation is shaped by the mechanisms by which aphids find and select their host plant and start to feed on it. Therefore, I do not agree with the title of this manuscript and some parts of the discussion.

      I would like the authors to elaborate on how the knowledge of the diurnal rhythm of aphid feeding can contribute to optimising pest management. I see that there are some differences in aphid metabolism and feeding behaviour between day and night, but I would like to hear how such knowledge can optimise pest management strategies.

    1. eLife assessment

      This fundamental study investigates the transcriptional changes in neurons that underlie loss of learning and memory with age in C. elegans, and how cognition is maintained in insulin/IGF-1-like signaling mutants. The presented evidence is compelling, utilizing a cutting-edge method to isolate neurons from worms for genomics that is clearly conveyed with a rigorous experimental approach. Overall, this study supports that older daf-2 worms maintain cognitive function via mechanisms that are unique from younger wild type worms, which will be of great interest to neuroscientists and researchers studying ageing.

    2. Reviewer #1 (Public Review):

      The authors perform RNA-seq on FACS isolated neurons from adult worms at days 1 and 8 of adulthood to profile the gene expression changes that occur with cognitive decline. Supporting data are included indicating that by day 7 of adulthood, learning and memory are reduced, indicating that this timepoint or after represents cognitively aged worms. Neuronal identity genes are reduced in expression within the cognitively aged worms, whereas genes involved in proteostasis, transcription/chromatin, and the stress response are elevated. A number of specific examples are provided, representing markers of specific neuronal subtypes, and correlating expression changes to the erosion of particular functions (e.g. motor neurons, chemosensory neurons, aversive learning neurons, etc).

      To investigate whether upregulation of genes in neurons with age is compensatory or deleterious, the authors reduced expression of a set of three significantly upregulated genes and performed behavioral assays in young adults. In each case, reduction of expression improved memory, consistent with a model in which age-associated increases impair neuronal function.

      The authors then characterize learning and memory in wild type, daf-2, and daf-2/daf-16 worms with age and find that daf-2 worms have an extended ability to learn for approximately 10 days longer than wild types. This was daf-16 dependent. Memory was extended in daf-2 as well, and strikingly, daf-2;daf-16 had no short term memory even at day 1. Transcriptomic analysis of FACS-sorted neurons was performed on the three groups at day 8. The authors focus their analysis on daf-2 vs. daf-2;daf-16 and present evidence that daf-2 neurons express a stress-resistance gene program. They also find small differences between the N2 and daf-2;daf-16 neurons, which correlate with the observed behavioral differences, though these differences are modest.

      The authors tested eight candidate genes that were more highly expressed in daf-2 neurons vs. daf-2;daf-16 and showed that reduction of 2 and 5 of these genes impaired learning and memory, respectively, in daf-2 worms. This finding implicates specific neuronal transcriptional targets of IIS in maintaining cognitive ability in daf-2 with age, which, importantly, are distinct from those in young wild type worms.

      Overall, this is a strong study with rigorously performed experiments. The authors achieved their aim of identifying transcriptional changes in neurons that underlie loss of learning and memory in C. elegans, and how cognition is maintained in insulin/IGF-1-like signaling mutants.

    3. Reviewer #2 (Public Review):

      Weng et al. perform a comprehensive study of gene expression changes in young and old animals, in wild-type and daf-2 insulin receptor mutants, in the whole animal and specifically in the nervous system. Using this data, they identify gene families that are correlated with neuronal ageing, as well as a distinct set of genes that are upregulated in neurons of aged daf-2 mutants. This is particularly interesting as daf-2 mutants show both extended lifespan and healthier neurons in aged animals, reflected by better learning/memory in older animals compared with wild-type controls. Indeed, knockdown of several of these upregulated genes resulted in poorer learning and memory. In addition, the authors showed that several genes upregulated during ageing in wild-type neurons also contribute to learning and memory; specifically, knockdown of these genes in young animals resulted in improved memory. This indicates that (at least in this small number of cases), genes that show increased transcript levels with age in the nervous system somehow suppress memory, potentially by having damaging effects on neuronal health.

      Finally, from a resource perspective, the neuronal transcriptome provided here will be very useful for C. elegans researchers as it adds to other existing datasets by providing the transcriptome of older animals (animals at day 8 of adulthood) and demonstrating the benefits of performing tissue-specific RNAseq instead of whole-animal sequencing.

      The work presented here is of high quality and the authors present convincing evidence supporting their conclusions.

    4. Reviewer #3 (Public Review):

      Summary

      In this manuscript, Weng et al. identify the neuron specific transcriptome that impacts age dependent cognitive decline. The authors design a pipeline to profile neurons from wild type and long-lived insulin receptor/IGF-1 mutants using timepoints when memory functions are declining. They discover signatures unique to neurons which validates their approach. The authors identify that genes related to neuronal identity are lost with age in wild type worms. For example, old neurons reduce the expression of genes linked to synaptic function and neuropeptide signaling and increase the expression of chromatin regulators, insulin peptides and glycoproteins. Depletion of selected genes which are upregulated in old neurons (utx-1, ins-19 and nmgp-1) leads to improved short-term memory function. This indicates that some genes that increase with age have detrimental effects on learning and memory. The pipeline is then used to test neuronal profiles of long-lived insulin/IGF-1 daf-2 mutants. Genes related to stress response pathways are upregulated in long lived daf-2 mutants (e.g. dod-24, F08H9.4) and those genes are required for improved neuron function.

      Strengths

      The manuscript is well written, and the experiments are well described. The authors take great care to explain their reasoning for performing experiments in a specific way and guide the reader through the interpretation of the results, which makes this manuscript an enjoyable and interesting read. The authors discover novel regulators of learning and memory using neuron-specific transcriptomic analysis in aged animals, which underlines the importance of cell specific deep sequencing. The timepoints of the transcriptomic profiling are elegantly chosen, as they coincide with the loss of memory and can be used to specifically reveal gene expression profiles related to neuron function. The authors use the dod-24 example to illustrate how powerful this approach is. In daf-2 mutants whole-body dod-24 expression differs from neuron specific profiles, which underlines the importance of precise cell specific approaches. This dataset provides a very useful resource for the C. elegans and aging community as it complements existing datasets with additional time points and neuron specific deep profiling.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This fundamental study investigates the transcriptional changes in neurons that underlie loss of learning and memory with age in C. elegans, and how cognition is maintained in insulin/IGF-1-like signaling mutants. The presented evidence is compelling, utilizing a cutting-edge method to isolate neurons from worms for genomics that is clearly conveyed with a rigorous experimental approach. Overall, this study supports that older daf-2 worms maintain cognitive function via mechanisms that are unique from younger wild type worms, which will be of great interest to neuroscientists and researchers studying ageing.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors perform RNA-seq on FACS isolated neurons from adult worms at days 1 and 8 of adulthood to profile the gene expression changes that occur with cognitive decline. Supporting data are included indicating that by day 7 of adulthood, learning and memory are reduced, indicating that this timepoint or after represents cognitively aged worms. Neuronal identity genes are reduced in expression within the cognitively aged worms, whereas genes involved in proteostasis, transcription/chromatin, and the stress response are elevated. A number of specific examples are provided, representing markers of specific neuronal subtypes, and correlating expression changes to the erosion of particular functions (e.g. motor neurons, chemosensory neurons, aversive learning neurons, etc).

      To investigate whether upregulation of genes in neurons with age is compensatory or deleterious, the authors reduced expression of a set of three significantly upregulated genes and performed behavioral assays in young adults. In each case, reduction of expression improved memory, consistent with a model in which age-associated increases impair neuronal function.

      The authors then characterize learning and memory in wild type, daf-2, and daf-2/daf-16 worms with age and find that daf-2 worms have an extended ability to learn for approximately 10 days longer than wild types. This was daf-16 dependent. Memory was extended in daf-2 as well, and strikingly, daf-2;daf-16 had no short term memory even at day 1. Transcriptomic analysis of FACS-sorted neurons was performed on the three groups at day 8. The authors focus their analysis on daf-2 vs. daf-2;daf-16 and present evidence that daf-2 neurons express a stress-resistance gene program. They also find small differences between the N2 and daf-2;daf-16 neurons, which correlate with the observed behavioral differences, though these differences are modest.

      The authors tested eight candidate genes that were more highly expressed in daf-2 neurons vs. daf-2;daf-16 and showed that reduction of 2 and 5 of these genes impaired learning and memory, respectively, in daf-2 worms. This finding implicates specific neuronal transcriptional targets of IIS in maintaining cognitive ability in daf-2 with age, which, importantly, are distinct from those in young wild type worms.

      Overall, this is a strong study with rigorously performed experiments. The authors achieved their aim of identifying transcriptional changes in neurons that underlie loss of learning and memory in C. elegans, and how cognition is maintained in insulin/IGF-1-like signaling mutants. 

      We thank you for the evaluation and response.

      Reviewer #2 (Public Review):

      Weng et al. perform a comprehensive study of gene expression changes in young and old animals, in wild-type and daf-2 insulin receptor mutants, in the whole animal and specifically in the nervous system. Using this data, they identify gene families that are correlated with neuronal ageing, as well as a distinct set of genes that are upregulated in neurons of aged daf-2 mutants. This is particularly interesting as daf-2 mutants show both extended lifespan and healthier neurons in aged animals, reflected by better learning/memory in older animals compared with wild-type controls. Indeed, knockdown of several of these upregulated genes resulted in poorer learning and memory. In addition, the authors showed that several genes upregulated during ageing in wild-type neurons also contribute to learning and memory; specifically, knockdown of these genes in young animals resulted in improved memory. This indicates that (at least in this small number of cases), genes that show increased transcript levels with age in the nervous system somehow suppress memory, potentially by having damaging effects on neuronal health.

      Finally, from a resource perspective, the neuronal transcriptome provided here will be very useful for C. elegans researchers as it adds to other existing datasets by providing the transcriptome of older animals (animals at day 8 of adulthood) and demonstrating the benefits of performing tissue-specific RNAseq instead of whole-animal sequencing.

      The work presented here is of high quality and the authors present convincing evidence supporting their conclusions. I only have a few comments/suggestions:

      (1) Do the genes identified to decrease learning/memory capacity in daf-2 animals (Figure 4d/e) also impact neuronal health? daf-2 mutant worms show delayed onset of age-related changes to neuron structure (Tank et al., 2011, J Neurosci). Does knockdown of the genes shown to affect learning also affect neuron structure during ageing, potentially one mechanism through which they modulate learning/memory? 

      (2) The learning and memory assay data presented in this study uses the butanone olfactory learning paradigm, which is well established by the same group. Have the authors tried other learning assays when testing for learning/memory changes after knockdown of candidate genes? Depending on the expression pattern of these genes, they may have more or less of an effect on olfactory learning versus for e.g. gustatory or mechanosensory-based learning.

      (3) A comment on the 'compensatory vs dysregulatory' model as stated by the authors on page 7 - I understand that this model presents the two main options, but perhaps this is slightly too simplistic: gene expression that rises during ageing may be detrimental for memory (= dysregulatory), but at the same time may also be beneficial for other physiological roles in other tissues (= compensatory).

      Thank you for your original suggestions; we addressed them in the previous version of response to the reviewers.

      Comments on revised version:

      I am satisfied with how the authors have addressed all my comments/suggestions. 

      Thank you for your response!

      Reviewer #3 (Public Review):

      Summary

      In this manuscript, Weng et al. identify the neuron specific transcriptome that impacts age dependent cognitive decline. The authors design a pipeline to profile neurons from wild type and long-lived insulin receptor/IGF-1 mutants using timepoints when memory functions are declining. They discover signatures unique to neurons which validates their approach. The authors identify that genes related to neuronal identity are lost with age in wild type worms. For example, old neurons reduce the expression of genes linked to synaptic function and neuropeptide signaling and increase the expression of chromatin regulators, insulin peptides and glycoproteins. Depletion of selected genes which are upregulated in old neurons (utx-1, ins-19 and nmgp-1) leads to improved short-term memory function. This indicates that some genes that increase with age have detrimental effects on learning and memory. The pipeline is then used to test neuronal profiles of long-lived insulin/IGF-1 daf-2 mutants. Genes related to stress response pathways are upregulated in long lived daf-2 mutants (e.g. dod-24, F08H9.4) and those genes are required for improved neuron function.

      Strengths

      The manuscript is well written, and the experiments are well described. The authors take great care to explain their reasoning for performing experiments in a specific way and guide the reader through the interpretation of the results, which makes this manuscript an enjoyable and interesting read. The authors discover novel regulators of learning and memory using neuron-specific transcriptomic analysis in aged animals, which underlines the importance of cell specific deep sequencing. The timepoints of the transcriptomic profiling are elegantly chosen, as they coincide with the loss of memory and can be used to specifically reveal gene expression profiles related to neuron function. The authors use the dod-24 example to illustrate how powerful this approach is. In daf-2 mutants whole-body dod-24 expression differs from neuron specific profiles, which underlines the importance of precise cell specific approaches. This dataset will provide a very useful resource for the C. elegans and aging community as it complements existing datasets with additional time points and neuron specific deep profiling.

      Weakness

      This study nicely describes the neuron specific profiles of aged long-lived daf-2 mutants. Selected neuronal genes that were upregulated in daf-2 mutants (e.g. F08H9.4, mtl-1, dod-24, alh-2, C44B7.5) decreased learning/memory when knocked down. However, the knock down of these genes was not specific to neurons. The authors use a neuron-sensitive RNAi strain to address this concern and acknowledge this caveat in the text. While it is likely that selected candidates act only in neurons it is possible that other tissues participate as well.

      Thank you for pointing this caveat out. We have mentioned it in the figure legend.

    1. eLife assessment

      This computational modeling study builds on multiple previous lines of experimental and theoretical research to investigate how a single neuron can solve a nonlinear pattern classification task. The study presents valuable insights that the location of synapses on dendritic branches, as well as synaptic plasticity of excitatory and inhibitory synapses, influences the ability of a neuron to discriminate combinations of sensory stimuli. However, the evidence presented is incomplete - the major conclusions are only partially supported by the data presented, and there are identified gaps between the supporting evidence and the major conclusions.

    2. Reviewer #1 (Public Review):

      Summary:

      This computational modeling study builds on multiple previous lines of experimental and theoretical research to investigate how a single neuron can solve a nonlinear pattern classification task. The authors construct a detailed biophysical and morphological model of a single striatal medium spiny neuron, and endow excitatory and inhibitory synapses with dynamic synaptic plasticity mechanisms that are sensitive to (1) the presence or absence of a dopamine reward signal, and (2) spatiotemporal coincidence of synaptic activity in single dendritic branches. The latter coincidence is detected by voltage-dependent NMDA-type glutamate receptors, which can generate a type of dendritic spike referred to as a "plateau potential." The proposed mechanisms result in moderate performance on a nonlinear classification task when specific input features are segregated and clustered onto individual branches, but reduced performance when input features are randomly distributed across branches. Given the high level of complexity of all components of the model, it is not clear which features of which components are most important for its performance. There is also room for improvement in the narrative structure of the manuscript and the organization of concepts and data.
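
      The branch-clustering point in this summary can be illustrated with a deliberately minimal toy sketch. It is not taken from the manuscript: the four feature names, the two-branch layout, the plateau threshold of two co-active inputs and the "soma fires if any branch plateaus" rule are all assumptions made for illustration, and none of the biophysics, dopamine signalling or plasticity is modeled. Under those assumptions, a neuron whose rewarded feature pairs share a branch classifies all four patterns correctly, a neuron with mismatched placement does not, and a brute-force search over a small integer grid finds no single linear threshold unit that solves the same task.

```python
from itertools import product

# Hypothetical toy task: two rewarded feature pairs, two unrewarded ones.
FEATURES = ["A1", "A2", "B1", "B2"]
PATTERNS = {
    ("A1", "B1"): 1,  # rewarded combination
    ("A2", "B2"): 1,  # rewarded combination
    ("A1", "B2"): 0,  # unrewarded combination
    ("A2", "B1"): 0,  # unrewarded combination
}


def branch_output(active, branch_inputs, plateau_threshold=2):
    """Return 1 if enough co-active inputs on one branch reach the 'plateau'."""
    drive = sum(1 for feature in branch_inputs if feature in active)
    return 1 if drive >= plateau_threshold else 0


def neuron_output(active, branches):
    """Assumed somatic rule: fire if at least one branch produced a plateau."""
    return 1 if any(branch_output(active, b) for b in branches) else 0


def accuracy(branches):
    """Fraction of the four patterns classified correctly for a given layout."""
    hits = sum(
        neuron_output(set(pattern), branches) == label
        for pattern, label in PATTERNS.items()
    )
    return hits / len(PATTERNS)


def linear_unit_can_solve(weight_grid, threshold_grid):
    """Brute-force search for one linear threshold unit that solves the task."""
    for w in product(weight_grid, repeat=len(FEATURES)):
        weights = dict(zip(FEATURES, w))
        for theta in threshold_grid:
            if all(
                (sum(weights[f] for f in pattern) >= theta) == bool(label)
                for pattern, label in PATTERNS.items()
            ):
                return True
    return False


# Rewarded pairs share a branch (clustered) versus are split across branches.
clustered = [("A1", "B1"), ("A2", "B2")]
mismatched = [("A1", "B2"), ("A2", "B1")]

if __name__ == "__main__":
    print("clustered accuracy:", accuracy(clustered))
    print("mismatched accuracy:", accuracy(mismatched))
    print("linear unit solves task:",
          linear_unit_can_solve(range(-3, 4), range(-3, 4)))
```

      The mismatched layout is the worst case of a non-clustered arrangement; random placement in a larger model would be expected to fall somewhere between the two extremes shown here.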

      Strengths:

      The integrative aspect of this study is its major strength. It is challenging to relate low-level details such as electrical spine compartmentalization, extrasynaptic neurotransmitter concentrations, dendritic nonlinearities, spatial clustering of correlated inputs, and plasticity of excitatory and inhibitory synapses to high-level computations such as nonlinear feature classification. Due to high simulation costs, it is rare to see highly biophysical and morphological models used for learning studies that require repeated stimulus presentations over the course of a training procedure. The study aspires to prove the principle that experimentally-supported biological mechanisms can explain complex learning.

      Weaknesses:

      The high level of complexity of each component of the model makes it difficult to gain an intuition for which aspects of the model are essential for its performance, or responsible for its poor performance under certain conditions. Stripping down some of the biophysical detail and comparing it to a simpler model may help better understand each component in isolation. That said, the fundamental concepts behind nonlinear feature binding in neurons with compartmentalized dendrites have been explored in previous work, so it is not clear how this study represents a significant conceptual advance. Finally, the presentation of the model, the motivation and justification of each design choice, and the interpretation of each result could be restructured for clarity to be better received by a wider audience.

    3. Reviewer #2 (Public Review):

      Summary:

      The study explores how single striatal projection neurons (SPNs) utilize dendritic nonlinearities to solve complex integration tasks. It introduces a calcium-based synaptic learning rule that incorporates local calcium dynamics and dopaminergic signals, along with metaplasticity to ensure stability for synaptic weights. Results show SPNs can solve the nonlinear feature binding problem and enhance computational efficiency through inhibitory plasticity in dendrites, emphasizing the significant computational potential of individual neurons. In summary, the study provides a more biologically plausible solution to single-neuron learning and gives further mechanistic insights into complex computations at the single-neuron level.

      Strengths:

      The paper introduces a novel learning rule for training a single multicompartmental neuron model to perform nonlinear feature binding tasks (NFBP), highlighting two main strengths: the learning rule is local, calcium-based, and requires only sparse reward signals, making it highly biologically plausible, and it applies to detailed neuron models that effectively preserve dendritic nonlinearities, contrasting with many previous studies that use simplified models.

      Weaknesses:

      I am concerned that the manuscript was submitted too hastily, as evidenced by the quality and logic of the writing and the presentation of the figures. These issues may compromise the integrity of the work. I would recommend a substantial revision of the manuscript to improve the clarity of the writing, incorporate more experiments, and better define the goals of the study.

      Major Points:

      (1) Quality of Scientific Writing: The current draft does not meet the expected standards. Key issues include:

      i. Mathematical and Implementation Details: The manuscript lacks comprehensive mathematical descriptions and implementation details for the plasticity models (LTP/LTD/Meta) and the SPN model. Given the complexity of the biophysically detailed multicompartment model and the associated learning rules, the inclusion of only nine abstract equations (Eq. 1-9) in the Methods section is insufficient. I was surprised to find no supplementary material providing these crucial details. What parameters were used for the SPN model? What are the mathematical specifics for the extra-synaptic NMDA receptors utilized in this study? For instance, Eq. 3 references [Ca2+]-does this refer to calcium ions influenced by extra-synaptic NMDARs, or does it apply to other standard NMDARs? I also suggest the authors provide pseudocodes for the entire learning process to further clarify the learning rules.

      ii. Figure quality. The authors do not seem to have typeset the images carefully, resulting in overcrowding and varying font sizes in the figures. Some of the fonts are too small and hard to read. The text in many of the diagrams is confusing. For example, in Panel A of Figure 3, two flattened images are combined, leading to small, distorted font sizes. In Panels C and D of Figure 7, the inconsistent use of terminology such as "kernels" further complicates the clarity of the presentation. I recommend that the authors thoroughly review all figures and accompanying text to ensure they meet the expected standards of clarity and quality.

      iii. Writing clarity. The manuscript often includes excessive and irrelevant details, particularly in the mathematical discussions. On page 24, within the "Metaplasticity" section, the authors introduce the biological background to support the proposed metaplasticity equation (Eq. 5). However, much of this biological detail is hypothesized rather than experimentally verified. For instance, the claim that "a pause in dopamine triggers a shift towards higher calcium concentrations while a peak in dopamine pushes the LTP kernel in the opposite direction" lacks cited experimental evidence. If evidence exists, it should be clearly referenced; otherwise, these assertions should be presented as theoretical hypotheses. Generally, Eq. 5 and related discussions should be described more concisely, with only a loose connection to dopamine effects until more experimental findings are available.

      (2) Goals of the Study: The authors need to clearly define the primary objective of their research. Is it to showcase the computational advantages of the local learning rule, or to elucidate biological functions?

      i. Computational Advantage: If the intent is to demonstrate computational advantages, the current experimental results appear inadequate. The learning rule introduced in this work can only solve for four features, whereas previous research (e.g., Bicknell and Hausser, 2021) has shown capability with over 100 features. It is crucial for the authors to extend their demonstrations to prove that their learning rule can handle more than just three features. Furthermore, the requirement to fine-tune the midpoint of the synapse function indicates that the rule modifies the "activation function" of the synapses, as opposed to merely adjusting synaptic weights. In machine learning, modifying weights directly is typically more efficient than altering activation functions during learning tasks. This might account for why the current learning rule is restricted to a limited number of tasks. The authors should critically evaluate whether the proposed local learning rule, including meta-plasticity, actually offers any computational advantage. This evaluation is essential to understand the practical implications and effectiveness of the proposed learning rule.

      ii. Biological Significance: If the goal is to interpret biological functions, the authors should dig deeper into the model behaviors to uncover their biological significance. This exploration should aim to link the observed computational features of the model more directly with biological mechanisms and outcomes.

    4. Author response:

      Reviewer #1 (Public Review):

      Summary:

      This computational modeling study builds on multiple previous lines of experimental and theoretical research to investigate how a single neuron can solve a nonlinear pattern classification task. The authors construct a detailed biophysical and morphological model of a single striatal medium spiny neuron, and endow excitatory and inhibitory synapses with dynamic synaptic plasticity mechanisms that are sensitive to (1) the presence or absence of a dopamine reward signal, and (2) spatiotemporal coincidence of synaptic activity in single dendritic branches. The latter coincidence is detected by voltage-dependent NMDA-type glutamate receptors, which can generate a type of dendritic spike referred to as a "plateau potential." The proposed mechanisms result in moderate performance on a nonlinear classification task when specific input features are segregated and clustered onto individual branches, but reduced performance when input features are randomly distributed across branches. Given the high level of complexity of all components of the model, it is not clear which features of which components are most important for its performance. There is also room for improvement in the narrative structure of the manuscript and the organization of concepts and data.

      To begin with, we will better explain the goal of the study in the introduction and explain that it relies on earlier theoretical work. The goal of the study was to investigate whether and how detailed neuron models with biologically-based morphologies, membrane properties, ion channels, dendritic nonlinearities, and biologically plausible learning rules can quantitatively account for the theoretical results obtained with more abstract models.

      We will further evaluate and clarify the roles of several components in our model regarding their impact on the results. These include a) the role of sufficiently robust and supralinear plateau potentials in computing the NFBP; and b) the importance of metaplasticity for individual synapses, allowing them to start or stop responding to relevant or irrelevant stimuli, respectively, over the training period.
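
      As a purely illustrative aside, and not the authors' biophysical model, the toy sketch below shows why the nonlinear feature binding problem (NFBP) calls for a supralinear dendritic nonlinearity. The feature names, the all-or-none "plateau" threshold, and the two-branch layout are assumptions made for illustration only. A point neuron with a single weighted sum cannot respond to both relevant feature pairs while staying silent for the two cross pairs, whereas a unit with two thresholded branches can, provided the relevant features are clustered on the same branch.

```python
from itertools import product

# Hypothetical features and pairings, chosen only for illustration.
FEATURES = ["red", "yellow", "strawberry", "banana"]
RELEVANT = [("red", "strawberry"), ("yellow", "banana")]    # should fire
IRRELEVANT = [("red", "banana"), ("yellow", "strawberry")]  # should stay silent


def encode(pair):
    """Binary input vector: 1.0 for each feature present in the pair."""
    return [1.0 if f in pair else 0.0 for f in FEATURES]


def point_neuron(x, w, theta):
    """Single weighted sum followed by a threshold (no dendritic branches)."""
    return sum(wi * xi for wi, xi in zip(w, x)) > theta


def branch(drive, plateau_threshold=1.5):
    """Toy supralinear branch: all-or-none 'plateau' above an assumed threshold."""
    return 1.0 if drive > plateau_threshold else 0.0


def two_branch_neuron(x, clusters, theta=0.5):
    """Sum the plateau outputs of each branch; clusters lists feature indices per branch."""
    soma = sum(branch(sum(x[i] for i in idx)) for idx in clusters)
    return soma > theta


def solves(classifier):
    """True if the classifier fires for every relevant pair and for no irrelevant pair."""
    fires_relevant = all(classifier(encode(p)) for p in RELEVANT)
    fires_irrelevant = any(classifier(encode(p)) for p in IRRELEVANT)
    return fires_relevant and not fires_irrelevant


def linear_solution_exists(grid):
    """Brute-force search over weights and threshold for the point neuron."""
    for w in product(grid, repeat=len(FEATURES)):
        for theta in grid:
            if solves(lambda x: point_neuron(x, w, theta)):
                return True
    return False


if __name__ == "__main__":
    clustered = [[0, 2], [1, 3]]  # red+strawberry on one branch, yellow+banana on the other
    scattered = [[0, 3], [1, 2]]  # cross pairs share a branch instead
    grid = [i * 0.5 for i in range(-4, 5)]
    print("clustered branches solve NFBP:", solves(lambda x: two_branch_neuron(x, clustered)))
    print("scattered branches solve NFBP:", solves(lambda x: two_branch_neuron(x, scattered)))
    print("point neuron solution found:", linear_solution_exists(grid))
```

      The brute-force search fails for any grid: the two "fire" conditions require the sum of all four weights to exceed twice the threshold, while the two "stay silent" conditions require the same sum to be at most twice the threshold. This toy says nothing about how such a wiring could be learned, which is the question the study addresses.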

      Strengths:

      The integrative aspect of this study is its major strength. It is challenging to relate low-level details such as electrical spine compartmentalization, extrasynaptic neurotransmitter concentrations, dendritic nonlinearities, spatial clustering of correlated inputs, and plasticity of excitatory and inhibitory synapses to high-level computations such as nonlinear feature classification. Due to high simulation costs, it is rare to see highly biophysical and morphological models used for learning studies that require repeated stimulus presentations over the course of a training procedure. The study aspires to prove the principle that experimentally-supported biological mechanisms can explain complex learning.

      Weaknesses:

      The high level of complexity of each component of the model makes it difficult to gain an intuition for which aspects of the model are essential for its performance, or responsible for its poor performance under certain conditions. Stripping down some of the biophysical detail and comparing it to a simpler model may help better understand each component in isolation. That said, the fundamental concepts behind nonlinear feature binding in neurons with compartmentalized dendrites have been explored in previous work, so it is not clear how this study represents a significant conceptual advance. Finally, the presentation of the model, the motivation and justification of each design choice, and the interpretation of each result could be restructured for clarity to be better received by a wider audience.

      To achieve the goal of the study as described above, we chose to use a biophysically and morphologically detailed neuron model to see if it could quantitatively account for the theoretically-based nonlinear computations, for instance, those discussed in Tran-Van-Minh, A. et al. (2015).

      We will explain the role of each component of the learning rule, as well as the dendritic nonlinearities, for the performance on the NFBP.

      Reviewer #2 (Public Review):

      Summary:

      The study explores how single striatal projection neurons (SPNs) utilize dendritic nonlinearities to solve complex integration tasks. It introduces a calcium-based synaptic learning rule that incorporates local calcium dynamics and dopaminergic signals, along with metaplasticity to ensure stability for synaptic weights. Results show SPNs can solve the nonlinear feature binding problem and enhance computational efficiency through inhibitory plasticity in dendrites, emphasizing the significant computational potential of individual neurons. In summary, the study provides a more biologically plausible solution to single-neuron learning and gives further mechanistic insights into complex computations at the single-neuron level.

      Strengths:

      The paper introduces a novel learning rule for training a single multicompartmental neuron model to perform nonlinear feature binding tasks (NFBP), highlighting two main strengths: the learning rule is local, calcium-based, and requires only sparse reward signals, making it highly biologically plausible, and it applies to detailed neuron models that effectively preserve dendritic nonlinearities, contrasting with many previous studies that use simplified models.

      Indeed, the learning rule is local and reward-based, and we will better highlight in the paper that it is “always on”, i.e., there are no separate training and testing phases.

      Weaknesses:

      I am concerned that the manuscript was submitted too hastily, as evidenced by the quality and logic of the writing and the presentation of the figures. These issues may compromise the integrity of the work. I would recommend a substantial revision of the manuscript to improve the clarity of the writing, incorporate more experiments, and better define the goals of the study.

      We will revise the manuscript thoroughly to improve the figures and the writing (detailed further below). We will also add supplementary figures showcasing the role of the different components of the learning rule.

      Major Points:

      (1) Quality of Scientific Writing: The current draft does not meet the expected standards. Key issues include:

      i. Mathematical and Implementation Details: The manuscript lacks comprehensive mathematical descriptions and implementation details for the plasticity models (LTP/LTD/Meta) and the SPN model. Given the complexity of the biophysically detailed multicompartment model and the associated learning rules, the inclusion of only nine abstract equations (Eq. 1-9) in the Methods section is insufficient. I was surprised to find no supplementary material providing these crucial details. What parameters were used for the SPN model? What are the mathematical specifics for the extra-synaptic NMDA receptors utilized in this study? For instance, Eq. 3 references [Ca2+]; does this refer to calcium ions influenced by extra-synaptic NMDARs, or does it apply to other standard NMDARs? I also suggest the authors provide pseudocode for the entire learning process to further clarify the learning rules.

      The detailed setup of the model, including equations and parameter values, is described in the referenced papers. The model is downloadable from GitHub. For this reason we did not repeat the information here. That said, we will go through the manuscript and clarify all details, and provide supplemental figures and a GitHub link where necessary for reproducing the results.

      ii. Figure quality. The authors do not seem to have typeset the images carefully, resulting in overcrowding and varying font sizes in the figures. Some of the fonts are too small and hard to read. The text in many of the diagrams is confusing. For example, in Panel A of Figure 3, two flattened images are combined, leading to small, distorted font sizes. In Panels C and D of Figure 7, the inconsistent use of terminology such as "kernels" further complicates the clarity of the presentation. I recommend that the authors thoroughly review all figures and accompanying text to ensure they meet the expected standards of clarity and quality.

      We will revise the figures for consistency and clarity.

      iii. Writing clarity. The manuscript often includes excessive and irrelevant details, particularly in the mathematical discussions. On page 24, within the "Metaplasticity" section, the authors introduce the biological background to support the proposed metaplasticity equation (Eq. 5). However, much of this biological detail is hypothesized rather than experimentally verified. For instance, the claim that "a pause in dopamine triggers a shift towards higher calcium concentrations while a peak in dopamine pushes the LTP kernel in the opposite direction" lacks cited experimental evidence. If evidence exists, it should be clearly referenced; otherwise, these assertions should be presented as theoretical hypotheses. Generally, Eq. 5 and related discussions should be described more concisely, with only a loose connection to dopamine effects until more experimental findings are available.

      The reviewer is correct; the cited text does not present experimental facts but rather illustrates how the learning rule operates. We will revise the section on the construction of learning rules to clarify which aspects are explicit assumptions and which are experimentally verified. In particular, we will provide a more detailed description of, and motivation for, metaplasticity.

      (2) Goals of the Study: The authors need to clearly define the primary objective of their research. Is it to showcase the computational advantages of the local learning rule, or to elucidate biological functions?

      Briefly, the goal of the study was to investigate whether earlier theoretical results with more abstract models can be quantitatively recapitulated in morphologically and biophysically detailed neuron models with dendritic nonlinearities and with biologically based learning rules (see also our responses to the Summary and Weaknesses of Reviewer #1). We will update the introduction with this information.

      i. Computational Advantage: If the intent is to demonstrate computational advantages, the current experimental results appear inadequate. The learning rule introduced in this work can only solve for four features, whereas previous research (e.g., Bicknell and Hausser, 2021) has shown capability with over 100 features. It is crucial for the authors to extend their demonstrations to prove that their learning rule can handle more than just three features. Furthermore, the requirement to fine-tune the midpoint of the synapse function indicates that the rule modifies the "activation function" of the synapses, as opposed to merely adjusting synaptic weights. In machine learning, modifying weights directly is typically more efficient than altering activation functions during learning tasks. This might account for why the current learning rule is restricted to a limited number of tasks. The authors should critically evaluate whether the proposed local learning rule, including meta-plasticity, actually offers any computational advantage. This evaluation is essential to understand the practical implications and effectiveness of the proposed learning rule.

      As mentioned above, our intent is not to demonstrate the computational advantages of the proposed learning rule but to investigate and illustrate how biophysically detailed neuron models that also display dendritic plateau potential mechanisms, together with biologically-based learning rules, can support the theoretically predicted computational requirements for complex neuronal processing (e.g., Tran-Van-Minh, A. et al., 2015), as well as the results obtained with more abstract neuron models and plateau potential mechanisms (e.g., Schiess et al., 2016; Legenstein and Maass, 2011).

      In the revised manuscript, we will also discuss the differences between the supervised learning rule in Bicknell and Hausser (2021) and our local and reward-based learning rule. We will also show a critical evaluation of how our local learning rule and metaplasticity affect the synaptic weights and why the different components of the rule are needed.

      ii. Biological Significance: If the goal is to interpret biological functions, the authors should dig deeper into the model behaviors to uncover their biological significance. This exploration should aim to link the observed computational features of the model more directly with biological mechanisms and outcomes.

      We will attempt to better link the learning rule to the dendritic supralinearities and to interpret their biological function.

    1. eLife assessment

      The authors study how cells with lower levels of the conserved steroid hormone signaling component Taiman (tai) are out-competed by neighboring wild-type cells with higher fitness in Drosophila imaginal discs. The findings are useful since they uncover an unexpected link between tai and Wingless signaling in cell competition. The evidence however is incomplete, since the tai loss-of-clone phenotype is based on one allele and the mechanism involved in cell competition through Dlp and Wg lacks adequate supporting data.

    2. Reviewer #1 (Public Review):

      Summary:

      Schweibenz et al. are investigating how cells with lower levels of Tai are out-competed by neighboring wild-type (WT) cells. They show that clones homozygous for a tai hypomorphic mutation are disadvantaged and are killed by apoptosis. But tai-low clones are partially rescued when generated in a background that is heterozygous for mutations in apoptotic genes, in the Hippo pathway component warts, or for the Wg/Wnt pathway negative regulator Apc. They then follow up on the link between tai LOF and Wg. The story then shifts away from clones and into experiments that have Tai RNAi depletion or Tai over-expression in the posterior compartment of the wing disc, using the anterior compartment as a control. These non-clonal experiments show that depletion of Tai in the posterior compartment of wing discs results in less Wg in this compartment. This is shown to be due to a reduction in the glypican Dally-like protein (Dlp). The fact that long-range Wg is reduced in tai-depleted discs that also show a reduction in Dlp suggests that Tai somehow positively promotes Wg distribution. There is some data in the supplementary materials suggesting that Tai promotes dlp mRNA expression but this was not compelling. In fact, the compelling data was that Dlp protein in tai mutant clones is not abundantly on the cell surface, but instead somehow retained in the mutant cell. The authors don't further examine Dlp protein in tai clones. The final figure (Figure 8) shows that there is less Wg at the DV margin in wing discs when tai is depleted from wg-producing cells. In sum, the authors have uncovered some interesting results, but the story has some unresolved issues that, if addressed, could boost its impact. Additionally, the preprint seems to have 2 stories, one about tai and cell competition and the other about tai and Wg distribution. It would be helpful to reorder the figures and improve the narrative so that these are better integrated with each other.

      Strengths:

      The authors are studying competition between tai-low clones and their fitter WT neighbors, and have uncovered an interesting connection to Wg.

      Weaknesses:

      (1) It would be good to know whether the authors can rescue tai-low clones by over-expressing UAS-Dlp.

      (2) The data about tai-promoting dlp (Figure S4) is not compelling as there are no biological replicates and no statistical analyses.

      (3) The data on Wg distribution seems disjointed from the data about cell competition. The authors could refocus the paper to emphasize the cell competition story. The role of Dlp in Wg distribution is well established, so the authors could remove or condense these results. The story really could be Figures 1, 2, 3 and 7 and keep the paper focused on cell competition. The authors could then discuss Dlp as needed for Wg signaling transduction, which is already established in the literature.

      (4) The model of tai controlling dlp mRNA and Dlp protein distribution is confusing. In fact, the data for the former is weak, while the data for the latter is strong. I suggest that the authors focus on the altered Dlp protein distribution in tai-low clones. It would also be helpful to prove that Wg signaling is impeded in tai clones (see #5 below).

      (5) I don't know if the Fz3-RFP reporter for Wg signaling works in imaginal discs, but if it does then the authors could make clones in this background to prove that cell-autonomous Wg signaling is reduced in tai-low clones.

    3. Reviewer #2 (Public Review):

      The authors investigate the properties of the transcriptional co-activator Taiman in regulating tissue growth. In previously published work they had shown that cells that overexpress Taiman in the pupal wing can cause thoracic cells adjacent to the wing tip to die and thus allow the wing to invade the thorax. This was mediated by the secretion of Spz ligands. Here, they investigate the properties of cells that are homozygous for a hypomorphic allele of taiman (tai). They show that homozygous mutant clones are much smaller than their wild-type twin spots and that cells in the clones are dying by apoptosis, which is inferred from elevated levels of anti-Dcp1 staining (Figure 1).

      By generating clones during eye development, the authors screen for dominant modifiers that increase the representation of homozygous tai tissue in the adult eye (Figure 2). They find that reducing the levels of hid, the entire rpr/hid/grim locus and Apc (and/or Apc2) each increase the representation of tai clones. They then show that the survival of tissue to the adult stage correlates with the size of clones in the third-instar larval wing disc (Figure 3). The rest of the study derives from the modification of the phenotype by Apc and investigates the interaction between Wnt signaling and tai clone survival.

      The authors then investigate interactions between tai and the wingless (wg) pathway. First, they show that increasing tai expression increases the expression of a wg reporter (nkd-lacZ) while reducing tai levels decreases its expression (Figure 4) indicating that wg signaling is likely reduced when tai levels are decreased. This finding is strengthened by examining wg-lacZ expression since the expression of this reporter is normally restricted to the D/V boundary in the wing disc by feedback inhibition via Wg signaling. Expression of the reporter is increased when tai expression is reduced and decreased when tai expression is increased (Figure 5).

      The authors then look at Wg protein away from the DV boundary. They find increased levels when tai expression is increased and decreased levels when tai is decreased. They conclude that tai activity increased Wg protein in cells (Figure 6). They suggest that this could be the result of the regulation of expression of Dally-like protein (Dlp). Consistent with this idea, increasing tai expression increases Dlp levels, and decreasing tai decreases Dlp levels (Figure 7). They then show that increasing Dlp levels when tai is reduced increases Wg levels which presumably means that Dlp is epistatic to tai. Puzzlingly, increasing both tai and Dlp decreases Wg.

      The authors then examine the effect of reducing Dlp in the cells that secrete Wg. They find that increasing tai results in the diffusion of Wg further from its source while reducing tai reduces its spread (Figure 8). They then show that in clones with reduced tai, there is increased cytoplasmic Dlp (Figure 9). They therefore propose that tai clones fail to survive because they do not secrete enough Dlp which results in reduced capture of the Wg for those cells and hence decreased Wg signaling.

      Evaluation

      While the authors present good evidence in support of most of their conclusions, there are alternative explanations in many cases that have not been excluded.

      From the results in Figure 1 (and Figure 3), the authors conclude that "The data indicate the existence of an extracellular competition mechanism that allows normal tai[wt] cells to kill tai[k15101] neighbors" (line 127). However, the experiments have been done with a single allele, and these experiments do not exclude the possibility that there is another mutation on the same chromosome arm that is responsible for the observed phenotype. Since the authors have a UAS-tai stock, they could strengthen their results using a MARCM experiment where they could test whether the expression of UAS-tai rescues the elimination of tai mutant clones. Alternatively, they could use a second (independent) allele to demonstrate that the phenotype can be attributed to a reduction in tai activity.

      By screening for dominant modifiers of a phenotype one would not expect to identify all interacting genes - only those that are haploinsufficient in this situation. The authors have screened a total of 21 chromosomes for modification and have not really explained which alleles are nulls and which are hypomorphs. The nature of each of the alleles screened needs to be explained better. Also, the absence of a dominant modification does not necessarily exclude a function of that gene or pathway in the process. This is especially relevant for the Spz/Toll pathway which the authors have previously implicated in the ability of tai-overexpressing cells to kill wild-type cells. The most important discovery from this screen is the modification by the Apc alleles. This part of the paper would be strengthened by testing for modification by other components of the Wingless pathway. The authors show modification by Apc[MI01007] and the double mutant Apc[Q8] Apc2[N175A]. Without showing the Apc[Q8] and Apc2[N175A] alleles separately, it is hard to know if the effect of the double mutant is due to Apc, Apc2, or the combination.

      RNAi of tai seems to block the formation of the Wg gradient. If so, one might expect a reduction in wing size. Indeed, this could explain why the wings of tai/Df flies are smaller. The authors mention briefly that the posterior compartment size is reduced when tai-RNAi is expressed in that compartment. However, this observation merits more emphasis since it could explain why tai/Df flies are smaller (Are their wings smaller?).

      In Figure 7, the authors show the effect of manipulating Tai levels alone or in combination with increasing Dlp levels. However, they do not include images of Wg protein distribution upon increasing Dlp levels alone.

      In Figure 8, there is more Wg protein both at the DV boundary and spreading when tai is overexpressed in the source cells using bbg-Gal4. However, in an earlier experiment (Figure 5C) they show that the wg-lacZ reporter is downregulated at the DV boundary when tai is overexpressed using en-Gal4. They therefore conclude that wg is not transcriptionally upregulated but is instead secreted at higher levels when tai is expressed in the source cells. Wg protein is reduced in the DV stripe when tai is overexpressed using the en-Gal4 driver (Figure 6B') and is increased at the same location when tai is overexpressed with the bbg-Gal4 driver (Figure 8). I don't know how to reconcile these observations.

      In Figure 9, the tai-low clones have elevated levels of Dlp. How can this be reconciled with the tai-RNAi knockdown shown in Figure 7C' where reducing tai levels causes a strong reduction in Dlp levels?

    4. Reviewer #3 (Public Review):

      Summary:

      In this study, Schweibenz et al., identify the transcriptional coactivator, Taiman (Tai), as a factor that determines the fitness level of epithelial cells by regulating Wingless (Wg), which is an important determinant of cellular fitness. Taiman determines cellular fitness level by regulating levels of cell-surface glypican Dally-like protein (Dlp), which regulates extracellular Wingless (Wg) distribution. Thus, by affecting levels of Wg via glypican regulation, Tai participates in determining cellular fitness, and cells with low Tai levels are eliminated as they are deprived of adequate Wg levels.

      Strengths:

      (1) The authors make a strong case for the effect of tai on Dlp and Wg levels in experiments where a relatively large group of cells have reduced tai levels.

      (2) The claim that tai-low clones are competitively eliminated is supported by experiments that show cell death in them, and their elimination at different time points.

      (3) The manuscript is well written.

      Weaknesses:

      (1) The study has relatively weak evidence for the mechanism of cell competition mediated by Dlp and Wg.

      (2) More evidence is required to support the claim that dlp transcription or endocytosis is affected in tai clones.

      Other comments:

      (1) The authors put the study in the context of cell competition, and the first figure indeed is convincing in this regard. However, most of the rest of the study is not in the clonal context, and mainly relies on RNAi KD of tai in the posterior compartment, which is a relatively large group of cells. I understand why the authors chose a different approach to investigate the role of tai in cell competition. However, because ubiquitous loss of tai results in smaller organs, it is important to determine to what extent reducing levels of tai in the entire posterior compartment compares with clonal elimination, i.e. cell competition. This is important in order to determine to what extent the paradigm of Tai-mediated regulation of Dlp levels and, by extension, Wg availability can be extended as a general mechanism underlying competitive elimination of tai-low clones. If the authors want to make a case for mechanisms involved in the competitive elimination of tai clones, then they need to show that the KD of tai in the posterior compartment shows hallmarks of cell competition. Is there cell death along the A/P boundary? Or is the compartment smaller because those cells are growing slower? Are the levels of Myc/DIAP1, proteins required for fitness, affected in en>tai RNAi cells?

      (2) The authors do not have direct/strong evidence of changes in dlp mRNA levels or intracellular trafficking. To back these claims, the authors should look for dlp mRNA levels and provide more evidence for Dlp endocytosis like an antibody uptake assay or at the very least, a higher resolution image analysis showing a change in the number of intracellular Dlp-positive puncta. Also, do the authors think that loss of tai increases Dlp endocytosis, making it less available on the cell surface for maintaining adequate extracellular Wg levels?

      (3) The data shown in the last figure is at odds with the model (I think) the authors are trying to establish: When cells have lower Tai levels, this reduces Dlp levels (S2) presumably either by reducing dlp transcription and/or increasing (?) Dlp endocytosis. This in turn reduces Wg (availability) in cells away from source cells (Figure 6). The reduced Wg availability makes them less fit, targeting them for competitive elimination. But in tai clones, I do not see any change in cell-surface Dlp (9B) (I would have expected them to be down based on the proposed model). The authors also see more total Dlp (9A) (which is at odds with S2, assuming data in S2 were done under permeabilizing conditions).

      As a side note, because Dlp is GPI-anchored, the authors should consider the possibility that the 'total' Dlp staining observed in 9A may not actually be total Dlp (and may be mostly intracellular Dlp), since permeabilizing membranes with detergent will cause some (most?) Dlp molecules to be lost, and how this might affect the interpretation of the data. I think one way to address this would be to process the permeabilized and non-permeabilized samples simultaneously and then image them at the same settings and compare what membrane staining in these two conditions looks like. If membrane staining in the permeabilized condition is decreased compared to non-permeabilized conditions, and the signal intensity of Dlp in permeabilized conditions remains high, then the authors will have evidence to support increased endocytosis in tai clones. Of course, these data will still need to be reconciled with what is shown in S2.

    5. Author response:

      eLife assessment

      “…The evidence however is incomplete, since the tai loss-of-clone phenotype is based on one allele and the mechanism involved in cell competition through Dlp and Wg lacks adequate supporting data.”

      We agree with the need for a second allele and are adding supporting data from a new tai lof allele we have generated by CRISPR.

      We also agree that additional functional data would help demonstrate that differences in Dlp levels are required for the mechanism of Tai cell competition. Experiments are ongoing to test whether normalizing Dlp levels across clonal boundaries rescues elimination of Tai-low clones.

      Reviewer #1:

      Overall Statements:

      “There is some data in the supplementary materials suggesting that Tai promotes dlp mRNA expression, but this was not compelling.”

      We are currently testing effects of Tai on dlp and dally transcription using qPCR and reporter transgenes. As noted below, the effects of Tai on Dlp trafficking are ‘strong’, so resolving effects on Dlp transcription will complement this localization data.

      “The authors don't further examine Dlp protein in tai clones.”

      As noted by the Reviewer, we do examine Dlp levels and localization in tai-low clones (see Figure 9), but these experiments are challenging due to the very small size of the clones and the hypomorphic nature of the tai allele (tai[k15101]) that was used. Experiments are in progress to examine the effect of our CRISPR null allele of tai on Dlp levels and localization in wing clones.

      “In sum, the authors have uncovered some interesting results, but the story has some unresolved issues that, if addressed, could boost its impact. Additionally, the preprint seems to have 2 stories, one about tai and cell competition and the other about tai and Wg distribution. It would be helpful to reorder the figures and improve the narrative so that these are better integrated with each other.”

      We agree. The results of our modifier screen required that we first understand how Tai regulates the Wg pathway before we could apply this to understanding the competitive mechanism. Thus, the paper is composed of three sections: 1. the screen, 2. the Tai-Dlp-Wg connection in the absence of competition, and 3. the contribution of Dlp-Wg to the tai[low] ‘loser’ phenotype. These sections use different techniques (e.g., clonal mosaics with genomic alleles, Gal4/UAS and RNAi to define the effect of Tai loss on Wg and Dlp). Ongoing experiments return to clonal mosaics to test whether elevating Dlp can rescue tai lof clones in the same manner as Apc/Apc2 alleles (see Figs. 2-3), which elevate Wg pathway activity.

      Specifics:

      “It would be good to know whether the authors can rescue tai-low clones by over-expression UAS-Dlp.”

      As noted above, experiments are ongoing to test whether normalizing Dlp levels across clonal boundaries rescues elimination of Tai-low clones.

      “The data on Wg distribution seems disjointed from the data about cell competition. The authors could refocus the paper to emphasize the cell competition story. The role of Dlp in Wg distribution is well established, so the authors could remove or condense these results. The story really could be Figs 1, 2, 3 and 7 and keep the paper focused on cell competition. The authors could then discuss Dlp as needed for Wg signaling transduction, which is already established in the literature.”

      We appreciate the suggestion to reorganize the figures to focus the first part of the story on competition, and then follow with the role of Tai in controlling Dlp. We will consider this approach pending the results of ongoing experiments.  

      “The model of tai controlling dlp mRNA and Dlp protein distribution is confusing. In fact, the data for the former is weak, while the data for the latter is strong. I suggest that the authors focus on the altered Dlp protein distribution on tai-low clones. It would also be helpful to prove the Wg signaling is impeded in tai clones (see #5 below).”

      We agree but are currently testing how dlp reporters and mRNA respond to Tai in order to rigorously test a Dlp transcriptional mechanism. To complement the ‘strong’ evidence that Tai regulates Dlp distribution, we are testing Dlp in clones of our Tai CRISPR null. Since submission, we have also assessed the effect of blocking the endocytic factor shibire/dynamin on Dlp distribution in Tai-deficient cells to complement the data on Pentagone that is already in the paper (see Fig. S3).

      “I don't know if the Fz3-RFP reported for Wg signaling works in imaginal discs, but if it does then the authors could make clones in this background to prove that cell-autonomous Wg signaling is reduced in tai-low clones.”

      We thank the reviewer for this suggestion, which we are now testing.

      Reviewer #2

      Overall Comments:

      “While the authors present good evidence in support of most of their conclusions, there are alternative explanations in many cases that have not been excluded.”

      We appreciate this point and are conducting experiments for a revised submission that will help test alternative mechanisms and clarify our conclusions.

      Specifics:

      “However, the experiments have been done with a single allele, and these experiments do not exclude the possibility that there is another mutation on the same chromosome arm that is responsible for the observed phenotype. Since the authors have a UAS-tai stock, they could strengthen their results using a MARCM experiment where they could test whether the expression of UAS-tai rescues the elimination of tai mutant clones. Alternatively, they could use a second (independent) allele to demonstrate that the phenotype can be attributed to a reduction in tai activity.”

      As noted above, we agree with the need for a second allele and are adding supporting data from a new tai lof allele we have generated by CRISPR.

      The tai[k15101] allele acts as a tai hypomorph and has been shown to produce weaker phenotypes than the strong lof allele 61G1 in a number of papers (Bai et al., 2000; König et al., 2011; Luo et al., 2019; Zhang et al., 2015). We agree that rescue of tai[k15101] with a UAS-Tai transgene would help rule out effects of second-site mutations. We are currently pursuing the reviewer’s second suggestion of phenocopy with a different allele, our new tai CRISPR lof.

      “The authors have screened a total of 21 chromosomes for modification and have not really explained which alleles are nulls and which are hypomorphs. The nature of each of the alleles screened needs to be explained better.”

      We will update the text to better reflect what type of alleles were chosen. In most cases we preferred amorphs or null alleles over hypomorphs, however when the amorph option was not available, we used hypomorphs.

      “Also, the absence of a dominant modification does not necessarily exclude a function of that gene or pathway in the process. This is especially relevant for the Spz/Toll pathway which the authors have previously implicated in the ability of tai-overexpressing cells to kill wild-type cells.”

      We thank the reviewer for this completely accurate point. The dominant screen does not rule out effects of other pathways such as Spz/Toll. Indeed, we were surprised by the lack of dominant effects by Spz/Toll alleles on tai[low] competition given our prior work. The reciprocally clear dominant effect of Apc/Apc2 led us to consider that Wg signaling plays a role in this phenomenon, which then became the starting point of this study.

      “The most important discovery from this screen is the modification by the Apc alleles. This part of the paper would be strengthened by testing for modification by other components of the Wingless pathway. The authors show modification by Apc[MI01007] and the double mutant Apc[Q8] Apc2[N175A]. Without showing the Apc[Q8] and Apc2[N175A] alleles separately, it is hard to know if the effect of the double mutant is due to Apc, Apc2, or the combination.”

      We agree that testing for modification with other components of the Wg pathway would be helpful to strengthen the connection between Tai low clonal elimination and Wg pathway biology. We also agree that separating Apc[Q8] and Apc2[N175A] would be a good idea to check if both Apc proteins are equally important for rescuing Tai low cell death, and future experiments for the lab could investigate this distinction.

      “RNAi of tai seems to block the formation of the Wg gradient. If so, one might expect a reduction in wing size. Indeed, this could explain why the wings of tai/Df flies are smaller. The authors mention briefly that the posterior compartment size is reduced when tai-RNAi is expressed in that compartment. However, this observation merits more emphasis since it could explain why tai/Df flies are smaller (Are their wings smaller?).”

      We agree that this is an exciting possibility. Growth effects of Tai linked to interactions with Yorkie and EcR could be due to a distinct role in promoting Wg activity. Alternatively, Tai may cooperate with Yorkie or EcR to control the Wg pathway. These are exciting possibilities that we are pursuing in future work.

      With regard to the “small size” effect of reducing Tai, we have previously shown that RNAi of Tai using engrailed-Gal4 causes the posterior compartment to shrink (Zhang et al. 2015, Figure 1C-F, H). In that paper, we also showed that tai[k15101]/Df animals are proportionally smaller than wildtype animals and quantified this by measuring 2D wing size (Zhang et al. 2015, Figure 1A and 1B).

      “In Figure 7, the authors show the effect of manipulating Tai levels alone or in combination with increasing Dlp levels. However, they do not include images of Wg protein distribution upon increasing Dlp levels alone.”

      We thank the reviewer for this reminder and have already generated these control images to include in a revised submission.

      “In Figure 8, there is more Wg protein both at the DV boundary and spreading when tai is overexpressed in the source cells using bbg-Gal4. However, in an earlier experiment (Figure 5C) they show that the wg-lacZ reporter is downregulated at the DV boundary when tai is overexpressed using en-Gal4. They therefore conclude that wg is not transcriptionally upregulated but is instead secreted at higher levels when tai is expressed in the source cells. Wg protein is reduced in the DV stripe when tai is overexpressed using the en-Gal4 driver (Figure 6B') and is increased at the same location when tai is overexpressed with the bbg-Gal4 driver (Figure 8). I don't know how to reconcile these observations.”

      We thank the reviewer for pressing us to develop an overall model explaining our results and how we envision Tai regulating Dlp and Wg. We are preparing a graphical abstract illustrating this model, which will be included in our revision.

      Briefly, we favor a model in which Tai controls the rate of Wg spread via Dlp, without a significant effect on wg transcription. For example, the induction of Dlp across the ‘engrailed’ domain of en>Tai discs (Fig 7B-B”) allows Wg to spread rapidly across the flanks and moderately depletes it from the DV margin (Fig 6B-B”), as noted by the reviewer. Adding a UAS-Dlp transgene in the en>Tai background dramatically accelerates Wg spread and causes it to be depleted from the DV margin and build up at the far end of the gradient adjacent to the dorsal and ventral hinge. Significantly, blocking endocytosis of Wg in en>Tai discs with a dominant negative shibire transgene also causes Wg to build up in the same location (new data to be added in a revision), consistent with enhanced spreading. The difference in the bbg-Gal4 experiment is that Tai is only overexpressed in DV margin cells, which constrains and concentrates Wg within this restricted domain; we are in the process of testing whether this effect on Wg is blocked by RNAi of Dlp in bbg>Tai discs.

      “In Figure 9, the tai-low clones have elevated levels of Dlp. How can this be reconciled with the tai-RNAi knockdown shown in Figure 7C' where reducing tai levels causes a strong reduction in Dlp levels?”

      We apologize for not explaining this data well enough. First, the tai[k15101] allele is a weak, viable hypomorph (as shown in our Zhang et al, 2015 paper) whereas the Tai RNAi line is lethal with most drivers (including en-Gal4) and thus a stronger lof. Second, Tai RNAi lowers Dlp levels (Fig 7C) while tai[k15101] causes Dlp to accumulate intracellularly (see Fig. 9A-C). These data indicate that reduced Tai leads to a defect in Dlp intracellular trafficking while its loss reduces overall Dlp levels; these data can be explained by a single role for Tai in Dlp traffic to or from the cell membrane, or by two roles, one in trafficking and one in Dlp expression. As noted, we are investigating both possibilities using dlp reporter lines and our new tai null CRISPR allele.

      Reviewer #3:

      Overall Weaknesses:

      “The study has relatively weak evidence for the mechanism of cell competition mediated by Dlp and Wg.”

      The screen and middle section of the paper provide genetic evidence that elevating Wg pathway activity rescues Tai[low] loser cells and that Tai controls levels/localization of Dlp and distribution of Wg in the developing wing disc. Our current work is focused on linking these two findings together in Tai “loser” clones.

      “More evidence is required to support the claim that dlp transcription or endocytosis is affected in tai clones.”

      As noted above, we are testing whether normalizing Dlp levels across clonal boundaries rescues tai[low] loser clones and assessing effects of Tai on dlp transcription and Dlp trafficking.

      Specifics:

      “Most of the rest of the study is not in the clonal context, and mainly relies on RNAi KD of tai in the posterior compartment, which is a relatively large group of cells. I understand why the authors chose a different approach to investigate the role of tai in cell competition. However because ubiquitous loss of tai results in smaller organs, it is important to determine to what extent reducing levels of tai in the entire posterior compartment compares with clonal elimination i.e. cell competition. This is important in order to determine to what extent the paradigm of Tai-mediated regulation of Dlp levels and by extension, Wg availability, can be extended as a general mechanism underlying competitive elimination of tai-low clones. If the authors want to make a case for mechanisms involved in the competitive elimination of tai clones, then they need to show that the KD of tai in the posterior compartment shows hallmarks of cell competition. Is there cell death along the A/P boundary? Or is the compartment smaller because those cells are growing slower?”

      Based on data that cell competition does not occur over compartment boundaries (e.g., see review by L.A. Johnston, Science, 2009), we chose not to use UAS-Gal4 to assess competition, but rather to investigate underlying biology occurring between Tai, Wg, and Dlp.

      “Are the levels of Myc/DIAP1, proteins required for fitness, affected in en>tai RNAi cells?”

      This is, of course, an interesting question given that Myc is a well-studied competition factor and is proposed to be downstream of the Tai-interacting protein Yki. We are not currently focused on Myc, but plan to test its role in the Tai-Dlp-Wg pathway in future work.

      “The authors do not have direct/strong evidence of changes in dlp mRNA levels or intracellular trafficking. To back these claims, the authors should look for dlp mRNA levels and provide more evidence for Dlp endocytosis like an antibody uptake assay or at the very least, a higher resolution image analysis showing a change in the number of intracellular Dlp positive punctae. Also, do the authors think that loss of tai increases Dlp endocytosis, making it less available on the cell surface for maintaining adequate extracellular Wg levels?”

      As noted above, we have added experiments using a dominant-negative shibire/dynamin allele to test whether Tai controls Dlp endocytosis. These data will be added to a revised manuscript. We have also gathered reagents to test effects of Tai gain/loss on Dlp secretion.

      “The data shown in the last figure is at odds with the model (I think) the authors are trying to establish: When cells have lower Tai levels, this reduces Dlp levels (S2) presumably either by reducing dlp transcription and/or increasing (?) Dlp endocytosis. This in turn reduces Wg (availability) in cells away from source cells (Figure 6). The reduced Wg availability makes them less fit, targeting them for competitive elimination. But in tai clones, I do not see any change in cell-surface Dlp (9B) (I would have expected them to be down based on the proposed model). The authors also see more total Dlp (9A) (which is at odds with S2 assuming data in S2 were done under permeabilizing conditions.).”

      As noted above (under Rev #2 comments), we apologize for not explaining this data well enough. First, the tai[k15101] allele is a weak, viable hypomorph (as shown in our Zhang et al, 2015 paper) whereas the Tai RNAi line is lethal with most drivers (including en-Gal4) and thus a stronger lof. Second, Tai RNAi lowers Dlp levels (Fig 7C) while tai[k15101] causes Dlp to accumulate intracellularly (see Fig. 9A-C). These data indicate that reduced Tai leads to a defect in Dlp intracellular trafficking while its loss reduces overall Dlp levels; these data can be explained by a single role for Tai in Dlp traffic to or from the cell membrane, or by two roles, one in trafficking and one in Dlp expression. We are investigating both possibilities using dlp reporter lines and our new tai null CRISPR allele.

      “As a side note, because Dlp is GPI-anchored, the authors should consider the possibility that the 'total' Dlp staining observed in 9A may not be actually total Dlp (and possibly mostly intracellular Dlp, since the permeabilizing membranes with detergent will cause some (most?) Dlp molecules to be lost, and how this might be affecting the interpretation of the data. I think one way to address this would be to process the permeabilized and non-permeabilized samples simultaneously and then image them at the same settings and compare what membrane staining in these two conditions looks like. If membrane staining in the permeabilized condition is decreased compared to non-permeabilized conditions, and the signal intensity of Dlp in permeabilized conditions remains high, then the authors will have evidence to support increased endocytosis in tai clones. Of course, these data will still need to be reconciled with what is shown in S2.”

      We thank the reviewer for this excellent suggestion and are generating mosaic discs to test the proposed approach of synchronous analysis of total vs. intracellular Dlp.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      (1) A problem with in vitro work is that homogeneous cell lines/cultures are, by nature, absent from the rest of the microenvironment. The authors need to discuss this. 

      We have added two sentences to the second paragraph of the Discussion section in which we now acknowledge this concern, but also point out that in vitro models of this sort also provide an experimental advantage in that they facilitate a deconvolution of the extensive complexity resident within the intact animal. Nevertheless, we acknowledge that this deconvolution requires ultimate validation of findings obtained within an in vitro model system to ensure they accurately recapitulate functions that occur in the intact animal in vivo.

      (2) What are n's/replicates for each study? Were the same or different samples used to generate the data for RNA sequencing, methylation beadchip analysis, and EM-seq? This clarification is important because if the same cultures were used, this would allow comparisons and correlations within samples.  

      Additional text has been added in the Methods section to indicate that all samples involving cell culture models, which include iPSCs and PGCLCs, came from a single XY iPS cell line aliquoted into replicates, and that all primary cultures, which included Sertoli and granulosa cells, were generated from pooled tissue preps from mice and then aliquoted into replicates. Finally, all experiments in the study were performed on three replicates. Because this experimental design did indeed allow for comparisons among samples, we have added a new Supplemental Figure 9, which displays PCA plots showing clustering among control and treatment datasets, respectively, as well as distinctions between each cluster representing each experimental condition.

      (3) In Figure 1, it is interesting that the 50 uM BPS dose mainly resulted in hypermethylation whereas 100 uM appears to be mainly hypomethylation. (This is based on the subjective appearance of graphs). The authors should discuss and/or present these data more quantitatively. For example, what percentage of changes were hypo/hypermethylation for each treatment? How many DMRs did each dose induce? For the RNA-seq results, again, what were the number of up/down-regulated genes for each dose?  

      The experiment shown in Figure 1 was designed to 1) serve as proof of principle that cells maintained in culture could be susceptible to EDC-induced epimutagenesis at all, 2) determine if any response observed would be dose-dependent, and 3) identify a minimally effective dose of BPS to be used for the remaining experiments in this study (which we identified as 1 µM). We agree that it is interesting that the 50 µM dose of BPS induced predominantly hypermethylation changes whereas the 1 µM and 100 µM doses induced predominantly hypomethylation changes, but we are not in a position to offer a mechanistic explanation for this outcome at this time. The results shown satisfied all of the initial objectives of this experiment: they demonstrate that exposure of cells in culture to BPS can indeed induce DNA methylation epimutations, that this occurs in a dose-dependent manner, and that a dose as low as 1 µM of BPS is sufficient to induce epimutagenesis. That said, in response to the reviewer’s request we have now added text on pages 6-7 referring to new Supplemental Tables 1-3, which indicate the total number of DMCs and DMRs, as well as the number of DEGs, detected in response to exposure to each dose of BPS shown in Figure 1, and which stratify those results to indicate the numbers of hyper- and hypomethylation epimutations and up- and down-regulated DEGs induced in response to each dose of BPS. While, as noted above, investigating the mechanistic basis for the difference in responses induced by the 50 µM versus 1 and 100 µM doses of BPS was beyond the scope of the study presented in this manuscript, we do find this result reminiscent of the “U-shaped” response curves often observed in toxicology studies. Importantly, this result does demonstrate the elevated resolution and specificity of analysis facilitated by our in vitro cell culture model system.

      (4) Also in Figure 1, were there DMRs or genes in common across the doses? How did DMRs relate to gene expression results? This would be informative in verifying or refuting expectations that greater methylation is often associated with decreased gene expression.  

      In general, we observed a coincidence between changes in DNA methylation and changes in gene expression (Supplement Tables 1-3). Pertaining directly to the reviewer’s question about the extent to which we observed common DMRs and DEGs across all doses, while we only found 3 overlapping DMRs conserved across all doses tested, we did find an average of 51.25% overlap in DMCs and an average of 80.45% overlap in DEGs across iPSCs exposed to the different doses of BPS shown in Figure 1. In addition, within each dose of BPS tested in iPSCs, we also found that there was an overlap between DMCs and the promoters or gene bodies of many DEGs (Supplement Table 4). Specifically within gene promoters, we observed a correlation between hypermethylated DMCs and decreased gene expression and hypomethylated DMCs and increased gene expression, respectively (Supplement Figure 2).

      (5) In Figure 2, was there an overlap in the hypo- and/or hyper-methylated DMCs? Please also add more description of the data in 2b to the legend including what the dot sizes/colors mean, etc. Some readers (including me) may not be familiar with this type of data presentation. Some of this comes up in Figure 4, so perhaps allude to this earlier on, or show these data earlier.  

      Although we observed an average of 11.05% overlapping DMCs between different pairs of cell types, we did not observe any DMCs that were shared among all four cell types. Indeed, this limited overlap of DMCs among different cell types exposed to BPS was the primary motivation for the analysis described in Figure 2. Thus, instead of focusing solely on direct overlap between specific DMCs, we examined similarities among the different cell types tested in the occurrence of epimutations within different annotated genomic regions. To better describe this, we have now added additional text to page 9. We have also added more detail to the legend for Figure 2 on page 8 to more clearly explain the significance of the dot sizes and colors, explaining that the dot sizes are indicative of the relative number of differentially methylated probes that were detected within each specific annotated genomic region, and that the dot colors are indicative of the calculated enrichment score reflecting the relative abundance of epimutations occurring within a specific annotated genomic region. The enrichment score is calculated by iterating down the list of DMCs, increasing a running-sum statistic when encountering a DMC within the specific annotated genomic region of interest and decreasing the sum when the epimutation is not in that annotated region. The magnitude of the increment depends upon the relative occurrence of DMCs within a specific annotated genomic region.
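
      The running-sum statistic described above can be illustrated with a short sketch. This is not the authors' pipeline: the function name `running_sum_enrichment`, the boolean input format, and the step sizes (1/hits up, 1/misses down, as in an unweighted GSEA-style score) are assumptions chosen for illustration, and the example flags are made up.

```python
def running_sum_enrichment(in_region):
    """GSEA-style running sum over a ranked list of DMCs (illustrative only).

    in_region: list of bools, one per DMC in ranked order; True means the DMC
    falls inside the annotated genomic region of interest.
    Returns (enrichment_score, running_sums).
    """
    n_hit = sum(in_region)
    n_miss = len(in_region) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("need at least one DMC inside and one outside the region")

    # Assumed step sizes: increments scale with how rare hits are in the list,
    # so the running sum always returns to zero at the end.
    step_up = 1.0 / n_hit
    step_down = 1.0 / n_miss

    running = 0.0
    running_sums = []
    for hit in in_region:
        running += step_up if hit else -step_down
        running_sums.append(running)

    # The score is the running-sum value with the largest deviation from zero.
    score = max(running_sums, key=abs)
    return score, running_sums


# Hypothetical ranked DMC list: region hits are concentrated near the top.
flags = [True, True, False, True, False, False, False, False]
score, sums = running_sum_enrichment(flags)
print(round(score, 3))  # 0.8
```

      Under these assumptions, a positive score means the region's DMCs cluster near the top of the ranked list, and a negative score means they cluster near the bottom.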

      (6) iPSCs were derived from male mice MEFs, and subsequently used to differentiate into PGCLCs. The only cell type from an XX female is the granulosa cells. This might be important, and should be mentioned and its potential significance discussed (briefly).  

      We have added a new paragraph just before the final paragraph of the Discussion section in which we acknowledge that most of the cell types analyzed during our study were XY-bearing “male” cells and that the manner in which XX-bearing “female” cells might respond to similar exposures could differ from the responses we observed in XY cells. However, we also noted that our assessment of XX-bearing granulosa cells yielded results very similar to those seen in XY Sertoli cells suggesting that, at least for differentiated somatic cell types, there does not appear to be a significant sex-specific difference in response to exposure to a similar dose of the same EDC. That said, we also acknowledged that in cell types in which dosage compensation based on X-chromosome inactivation is not in place, differences between XY- and XX-bearing cells could accrue.

      (7) EREs are only one type of hormone response element. The authors make the point that other mechanisms of BPS action are independent of canonical endocrine signaling. Would authors please briefly speculate on the possibility that other endocrine pathways including those utilizing AREs or other HREs may play a role? In other words, it may not be endocrine signaling independent. The statement that the differences between PGCLCs and other cells are largely due to the absence of ERs is overly simplistic.  

      Previous reports have indicated that BPS does not have the capacity to bind the androgen receptor (Pelch et al., 2019; Yang et al., 2024). However, there have been reports indicating that BPS can interact with other endocrine receptors including PPARγ and RXRα, which play a role in lipid accumulation and have the potential to be linked to obesity phenotypes (Gao et al., 2020; Sharma et al., 2018). To address the reviewer’s comment, we assessed the expression of a panel of hormone receptors including PPARγ, RXRα, and AR in each of the cell types examined in our study, and these results are now shown in a new Supplemental Figure 4. We show that in addition to not expressing either estrogen receptor (ERα or ERβ), germ cells also do not express any of the other endocrine receptors we tested, including AR, PPARγ, and RXRα. Thus, we now note that these results support our suggestion that the induction of epimutations we observed in germ cells in response to exposure to BPS appears to reflect disruption of non-canonical endocrine signaling. We also note that non-canonical endocrine signaling is well established (Brenker et al., 2018; Ozgyin et al., 2015; Song et al., 2011; Thomas and Dong, 2006). Thus, we feel the suggestion that the effects of BPS exposure could conceivably reflect disruption of either canonical or non-canonical signaling in any cell type is well justified, and that our data suggest that both of these effects appear to have accrued in the cells examined in our study, as suggested in the text of our manuscript.

      (8) Interpretation of data from the GO analysis is similarly overly simplistic. The pathways identified and discussed (e.g. PI3K/AKT and ubiquitin-like protease pathways) are involved in numerous functions, both endocrine and non-endocrine. Also, are the data shown in Figure 6a from all 4 cell types? I am confused by the heatmap in 6c, which genes were significantly affected by treatment in which cell types?  

      Per the reviewer’s request, we have added text to indicate that Figure 6a is indeed data from all four cell types examined. We have also modified the text to further clarify that Figure 6c displays the expression of other G protein-coupled receptors which are expressed at similar, if not higher, levels than either ER in all cell types examined, and that these have been shown to have the potential to bind to either 17β-estradiol or BPA in rat models. As alluded to by the reviewer, this is indicative of a wide variety of distinct pathways and/or functions that can potentially be impacted by exposure to an EDC such as BPS. Thus, we have attempted to acknowledge the reviewer’s primary point that BPS may interact with a variety of receptors or other factors involved with a wide variety of different pathways and functions. Importantly, this illustrates the strength of our model system in that it can be used to identify potentially impacted target pathways that can then be pursued further as deemed appropriate.

      (9) In Figure 7, what were the 138 genes? Any commonalities among them? 

      We have now added a new supplemental Excel file that lists the 138 overlapping conserved DEGs that did not become reprogrammed/corrected during the transition from iPSCs to PGCLCs. In addition, we have added new text on page 22 and a new Supplemental Figure 8 which displays KEGG analysis of pathways associated with these 138 retained DEGs. We find that these genes are primarily involved with cell cycle and apoptosis pathways which, interestingly, have the potential to be linked to cancer development which is often linked to disruptions in chromatin architecture.

      (10) The Introduction is very long. The last paragraph, beginning line 105, is a long summary of results and interpretations that better fit in a Discussion section.

      We have now significantly reduced the length and scope of the final paragraph of the Introduction per the reviewer’s recommendation.

      (11) Provide some details on husbandry: e.g. were they bred on-site? What food was given, and how was water treated? These questions are to get at efforts to minimize exposure to other chemicals.  

      We have added text detailing that all mice used in the project were bred onsite, that water was non-autoclaved conventional RO water, and that we selected 5V5R extruded feed for the mice used in this study, which is highly controlled for the presence of isoflavones and has been certified for use in estrogen-sensitive animal protocols.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript uses cell lines representative of germ line cells, somatic cells, and pluripotent cells to address the question of how the endocrine-disrupting compound BPS affects these various cells with respect to gene expression and DNA methylation. They find a relationship between the presence of estrogen receptor gene expression and the number of DNA methylation and gene expression changes. Notably, PGCLCs do not express estrogen receptors and although they do have fewer changes, changes are nevertheless detected, suggesting a noncanonical pathway for BPS-induced perturbations. Additionally, there was a significant increase in the occurrence of BPS-induced epimutations near EREs in somatic and pluripotent cell types compared to germ cells. Epimutations in the somatic and pluripotent cell types were predominantly in enhancer regions whereas those in the germ cell type were predominantly in gene promoters.

      Strengths:

      The strengths of the paper include the use of various cell types to address the sensitivity of the lineages to BPS as well as the observed relationship between the presence of estrogen receptors and changes in gene expression and DNA methylation.

      Weaknesses:

      The weaknesses include the lack of reporting of replicates, superficial bioinformatic analysis, and the fact that exposures are more complicated in a whole organism than in an isolated cell line.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      Overall, this is an intriguing paper but more transparency in the replicates and methods and a more rigorous bioinformatic treatment of the data are required.

      Specific comments:

      (1) End of abstract "These results suggest a unique mechanism by which an EDC-induced epimutated state may be propagated transgenerationally following a single exposure to the causative EDC." This is overly speculative for an abstract. There is only epigenetic inheritance following mitosis or differentiation presented in this study. There is no meiosis and therefore no ability to assess multi- or transgenerational inheritance. 

      We have modified the text at the end of the abstract to more precisely reflect our intended conclusions based on our data. In our view, the ability of induced epimutations to transcend meiosis per se is not as relevant to the mechanism of transgenerational inheritance as their ability to transcend major waves of epigenetic reprogramming that normally occur during development of the germ line. In this regard, the transition from pluripotent iPSCs to germline PGCLCs has been shown to recapitulate at least the first portion of normal germline reprogramming, and now our data provide novel insight into the fate of induced epimutations during this process. Specifically, we show that a prevalence of epimutations was conserved during the iPSC → germ cell transition but that very few (< 5%) of the specific epimutations present in the BPS-exposed iPSCs were retained when those cells were induced to form PGCLCs. Rather, we observed apparent correction of a large majority of the initially induced epimutations during this transition, but this was accompanied by the apparent de novo generation of novel epimutations in the PGCLCs. We suggest, based on other recent reports in the literature, that this is a result of the BPS exposure inducing changes in the chromatin architecture in the exposed iPSCs such that when the normal germline reprogramming mechanism is imposed on this disrupted chromatin template there is both correction of many existing epimutations and the genesis of many novel epimutations. This observation has the potential to explain the long-standing question of why the prevalence of epimutations persists across multiple generations despite the occurrence of epigenetic reprogramming during each generation. Nevertheless, as noted above, we have modified the text at the end of the abstract to temper this interpretation given that it is still somewhat speculative at this point.

      (2) Doses used in the experiments. One needs to be careful when stating that the dose used is "below FDA's suggested safe environmental level established for BPA" because a different bisphenol is being used here (BPA vs BPS) and the safe level is that which the entire organism experiences. It is likely that cell lines experience a higher effective dose.  

      We have now made a point of noting that our reference to an EPA-recommended “safe dose” of BPA was for humans and/or intact animals. Changes to this effect have been made in the second and sixth paragraphs of the Introduction section. In addition, we have added text at the end of the fourth paragraph of the Discussion section acknowledging that, as the reviewer suggests, the same dose of an EDC could exert greater effects on cells in a homogeneous culture than on the same cell type within an intact animal given the potential for mitigating metabolic effects in the latter. However, we also note that the ability we demonstrated to quantify the effects of such exposures on the basis of numbers of epimutations (DMCs or DMRs) induced could potentially be used in future studies to study this question by assessing the effects of a specific dose of a specific EDC on a specific cell type when exposed either within a homogeneous culture or within an intact animal.

      (3) Figure 1: In the dose response, what was the overlap in DMCs and DEGs among the 3 doses? Are the responses additive, synergistic, or completely non-overlapping? This is an important point that should be addressed. 

      Please see our response to Reviewer 1 critique #4 above where we address similar concerns. While we do find overlap among different cell types with respect to the DMCs, DMRs, and DEGs displayed in Figure 1, we found the effect to be only partially additive as opposed to synergistic in any apparent manner. The fold increase in DMCs, DMRs, and DEGs resulting from exposure to doses of 1 μM versus 50 μM ranged from 2.5x to 4.4x, which was well below the 50x increase that would have been expected from a strictly additive effect, and the effect increased even less, if at all, in response to exposure to doses of 50 μM versus 100 μM BPS. Finally, as now noted in the Discussion section on page 25, our conclusion is that these results display a limited dose-dependent effect that was partially additive but also plateaued at the highest doses tested.
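
      The arithmetic behind this comparison can be sketched as follows. This is a minimal illustration, assuming that "strictly additive" means the number of changes scales in proportion to dose; the function names and the example counts are hypothetical placeholders chosen to fall within the reported 2.5x to 4.4x range, not data from the study.

```python
def fold_change(count_low_dose, count_high_dose):
    """Observed fold increase in features (e.g. DMCs) between two doses."""
    return count_high_dose / count_low_dose


def fraction_of_proportional(dose_low, dose_high, count_low, count_high):
    """Observed fold increase as a fraction of the dose ratio.

    A value near 1.0 would mean the response scales with dose;
    a value far below 1.0 indicates a sub-proportional response.
    """
    expected_fold = dose_high / dose_low
    return fold_change(count_low, count_high) / expected_fold


# Hypothetical counts (NOT from the study): 1000 DMCs at 1 uM, 4400 at 50 uM.
observed = fold_change(1000, 4400)
fraction = fraction_of_proportional(1.0, 50.0, 1000, 4400)
print(f"observed fold increase: {observed:.1f}x")
print(f"fraction of dose-proportional expectation: {fraction:.3f}")
```

      Under these placeholder numbers the observed increase is less than a tenth of what a dose-proportional response would give, which is the sense in which the response describes the effect as only partially additive.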

      (4) Methods: How many times was each exposure performed on a given cell type? This information should be in the figure legends and methods. In the case of multiple exposures for a given line, do the biological replicates agree? 

      Please see our response to Reviewer 1 critique #2 where we address similar concerns with newly added text and analysis. We now note repeatedly on pages 39-45 that each analysis was conducted on three replicate samples, and we display the similarity among those replicates graphically in a new Supplement Figure 9.

      (5) DNA methylation analyses. Very little analysis is presented on the BeadChip array other than hypermethylated/hypomethylated and genomic regions of DMCs. What is the range of methylation changes? Does it vary between hypo vs. hyper DMCs? How many array experiments were performed (biological replicates) and what stats were used to determine the DMCs? Are there DMCs in common among the various cell types? As an example of more meaningful analysis, one can plot the %5mC over a given array for comparisons between control and treated cell types. For more granularity, the %5mC can be presented according to the element type (enhancers vs promoters).

      Please see our response to Reviewer 1 critique #2 above where we address similar concerns regarding the number of biological replicates used in this study. DMCs on the Infinium array are identified using mixed linear models. This general supervised learning framework identifies CpG loci at which differential methylation is associated with known control vs. treated covariates. CpG probes on the array were defined as having differential changes that met both p-value and FDR (≤ 0.05) significance thresholds between treatment and control samples for each cell type analyzed. The range of medians across all samples was 0.0278 to 0.0059 for hypermethylated beta values and -0.0179 to -0.0033 for hypomethylated beta values. As noted above, we did observe an overlap in DMCs between cell types. Thus, we observed an average of 11.05% overlapping DMCs between two or more cell types but we did not observe any DMCs shared between all four cell types. We have added additional text on page 9 and new Supplement Tables 1-4 and Supplement Figure 1 to now more clearly describe that this limited similarity in direct overlap of DMCs was the underlying motivation for the analysis described in Figure 2. Finally, the enrichment dot plots shown in Figure 2 provide the information the reviewer requested regarding the %5mC observed at different annotated genomic element types.
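
      The dual-threshold criterion described above can be sketched as follows. The response does not state which FDR procedure was used, so this sketch assumes the Benjamini-Hochberg adjustment; the function names and the example p-values are hypothetical and are not taken from the study's pipeline.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_dmcs(p_values, alpha=0.05):
    """Indices of probes passing both the raw p-value and FDR thresholds."""
    q_values = benjamini_hochberg(p_values)
    return [
        i
        for i, (p, q) in enumerate(zip(p_values, q_values))
        if p <= alpha and q <= alpha
    ]


# Hypothetical per-probe p-values (NOT from the study).
example_p = [0.001, 0.008, 0.039, 0.041, 0.6]
print(benjamini_hochberg(example_p))
print(call_dmcs(example_p))
```

      In this toy input, the third and fourth probes pass the raw p-value cutoff but not the adjusted one, which illustrates why requiring both thresholds is stricter than a p-value filter alone.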

      (6) The investigators correlate the number of DMCs in a given cell type with the presence of estrogen receptors. Does the correlation extend to the methylation difference (delta beta) at the statistically different probes?

      We have added a new Supplement Figure 3 in which we provide data addressing this question. In brief, we find that the delta betas of probes enriched at enhancer regions and associated with relative proximity to ERE elements in Sertoli cells, granulosa cells, and iPSCs appear very similar to those associated with DMCs not located within these enriched regions. However, when we compared the similarity of the two data sets with goodness of fit tests, we found these relatively small differences were, in fact, statistically significant based on a two-sample Kolmogorov-Smirnov test. These observed significant differences appear to indicate that there is higher variability among the delta betas associated with hypomethylation, but not hypermethylation, changes occurring at DMCs associated with enhancers, potentially suggesting a greater tendency for exposure to BPS to induce hypomethylation rather than hypermethylation changes, at least in these specific regions.
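
      For readers unfamiliar with the test named above, the two-sample Kolmogorov-Smirnov statistic is the largest vertical gap between the empirical cumulative distribution functions of the two samples. The sketch below computes only that statistic (not a p-value) in plain Python; the function name and the example delta-beta values are hypothetical, and the software the authors actually used is not stated in the response.

```python
from bisect import bisect_right


def ks_two_sample_statistic(sample_a, sample_b):
    """Maximum absolute difference between two empirical CDFs."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    n, m = len(a_sorted), len(b_sorted)
    d_max = 0.0
    # The ECDFs only change at observed values, so checking those suffices.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / n
        cdf_b = bisect_right(b_sorted, x) / m
        d_max = max(d_max, abs(cdf_a - cdf_b))
    return d_max


# Hypothetical delta-beta values (NOT from the study).
enhancer_dmcs = [-0.020, -0.015, -0.010, -0.004]
other_dmcs = [-0.010, -0.004, -0.003, -0.002]
print(ks_two_sample_statistic(enhancer_dmcs, other_dmcs))
```

      With large probe counts, even a small statistic can reach significance, which is consistent with the response's observation that visually similar distributions differed significantly.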

      (7) Methylation changes relative to EREs are presented in multiple figures. Are other sequences enriched in the DMCs? 

      We profiled the genomic sequence within 500 bp of cell type-specific enriched DMCs that were either associated with enhancer regions in Sertoli, granulosa, or iPS cells or transcription factor binding sites in PGCLCs to identify higher-abundance motif sequences. We then compared any motifs identified with the JASPAR database to potentially find transcription factors that could be binding to these regions. Interestingly, we found that the two most common motifs across all cell types were associated with either the chromatin remodeling transcription factor HMG1A or the pluripotency factor KLF4.

      (8) Please present a correlation plot between the methylation differences and the adjacent DEGs. Again, the absence of consideration of the absolute changes in methylation and gene expression minimizes the impact of the data. 

      We analyzed the relationship between DMCs at DEG promoter regions and the corresponding change in expression of that DEG. Our data support a relationship between up-regulated genes showing decreased methylation in promoter regions and down-regulated genes showing increased methylation at promoter regions, although there were some exceptions to this relationship.

      (9) EM-Seq is mentioned in Figure 7 and in the material and methods. Where is it used in this study? 

      We now note in the text on page 22 that EM-seq was used during experiments assessing the propagation of BPS-induced epimutations during the iPSC → EpiLC → PGCLC cell state transitions to gather higher resolution data of changes to DNA methylation differences at the whole-epigenome level.

      References

      Brenker C, Rehfeld A, Schiffer C, Kierzek M, Kaupp UB, Skakkebæk NE, Strünker T. 2018. Synergistic activation of CatSper Ca2+ channels in human sperm by oviductal ligands and endocrine disrupting chemicals. Hum Reprod 33:1915–1923. doi:10.1093/humrep/dey275

      Gao P, Wang L, Yang N, Wen J, Zhao M, Su G, Zhang J, Weng D. 2020. Peroxisome proliferator-activated receptor gamma (PPARγ) activation and metabolism disturbance induced by bisphenol A and its replacement analog bisphenol S using in vitro macrophages and in vivo mouse models. Environ Int 134. doi:10.1016/J.ENVINT.2019.105328

      Ozgyin L, Erdos E, Bojcsuk D, Balint BL. 2015. Nuclear receptors in transgenerational epigenetic inheritance. Prog Biophys Mol Biol. doi:10.1016/j.pbiomolbio.2015.02.012

      Pelch KE, Li Y, Perera L, Thayer KA, Korach KS. 2019. Characterization of Estrogenic and Androgenic Activities for Bisphenol A-like Chemicals (BPs): In Vitro Estrogen and Androgen Receptors Transcriptional Activation, Gene Regulation, and Binding Profiles. Toxicol Sci 172:23–37. doi:10.1093/TOXSCI/KFZ173

      Sharma S, Ahmad S, Khan MF, Parvez S, Raisuddin S. 2018. In silico molecular interaction of bisphenol analogues with human nuclear receptors reveals their stronger affinity vs. classical bisphenol A. Toxicol Mech Methods 28:660–669. doi:10.1080/15376516.2018.1491663

      Song K-H, Lee K, Choi H-S. 2011. Endocrine Disrupter Bisphenol A Induces Orphan Nuclear Receptor Nur77 Gene Expression and Steroidogenesis in Mouse Testicular Leydig Cells. Endocrinology 143:2208–2215. doi:10.1210/endo.143.6.8847

      Thomas P, Dong J. 2006. Binding and activation of the seven-transmembrane estrogen receptor GPR30 by environmental estrogens: A potential novel mechanism of endocrine disruption. J Steroid Biochem Mol Biol 102:175–179. doi:10.1016/j.jsbmb.2006.09.017

      Yang Z, Wang L, Yang Y, Pang X, Sun Y, Liang Y, Cao H. 2024. Screening of the Antagonistic Activity of Potential Bisphenol A Alternatives toward the Androgen Receptor Using Machine Learning and Molecular Dynamics Simulation. Environ Sci Technol 58:2817–2829. doi:10.1021/ACS.EST.3C09779

    2. eLife assessment

      The findings are valuable and supported by compelling evidence from deep sequencing and bioinformatics analyses. The strength of this work lies in the comprehensive analysis of different cells representing various life stages, exposing vulnerabilities to EDCs and relating epimutations to specific genomic regulatory regions. Despite the small sample size, the results make a major contribution to the field and provide novel insight into the emergence and correction of epimutations during epigenetic programming and into the processes underlying multigenerational effects of EDCs.

    3. Reviewer #1 (Public Review):

      In this revised manuscript, authors have conducted epigenetic and transcriptomic profiling to understand how environmental chemicals such as BPS can cause epimutations that can propagate to future generations. They used isolated somatic cells from mice (Sertoli, granulosa), pluripotent cells to model preimplantation embryos (iPSCs) and cells to model the germline (PGCLCs). This enabled them to model sequential steps in germline development, and when/how epimutations occur. The major findings were that BPS induced unique epimutations in each cell type, albeit with qualitative and quantitative cell-specific differences; that these epimutations are prevalent in regions associated with estrogen-response elements (EREs); and that epimutations induced in iPSCs are corrected as they differentiate into PGCLCs, concomitant with the emergence of de novo epimutations. This study will be useful in understanding multigenerational effects of EDCs, and underlying mechanisms.

      Strengths include:

      (1) Using different cell types representing life stages of epigenetic programming and during which exposures to EDCs have different effects. This progression revealed information both about correction of epimutations and the emergence of new ones in PGCLCs.

      (2) Work conducted by exposing iPSCs to BPS or vehicle, then differentiating to PGCLCs, revealed that novel epimutations emerged.

      (3) Relating epimutations to promoter and enhancer regions.

      A few weaknesses remain: Authors need to discuss the limitations of the small sample size. The supplemental data, while extremely helpful, requires better organization.

    4. Reviewer #2 (Public Review):

      Summary:

      This manuscript uses cell lines representative of germ line cells, somatic cells and pluripotent cells to address the question of how the endocrine disrupting compound BPS affects these various cells with respect to gene expression and DNA methylation. They find a relationship between the presence of estrogen receptor gene expression and the number of DNA methylation and gene expression changes. Notably, PGCLCs do not express estrogen receptors and although they do have fewer changes, changes are nevertheless detected, suggesting a noncanonical pathway for BPS-induced perturbations. Additionally, there was a significant increase in the occurrence of BPS-induced epimutations near EREs in somatic and pluripotent cell types compared to germ cells. Epimutations in the somatic and pluripotent cell types were predominantly in enhancer regions whereas those in the germ cell type were predominantly in gene promoters.

      Strengths:

      The strengths of the paper include the use of various cell types to address sensitivity of the lineages to BPS as well as the observed relationship between the presence of estrogen receptors and changes in gene expression and DNA methylation.

      Weaknesses:

      The weakness includes the fact that exposures are more complicated in a whole organism than in an isolated cell line.

    1. Author response:

      eLife assessment

      This manuscript reports an important finding that the transcription factor Scleraxis regulates regenerative myogenesis by controlling the proliferation and differentiation of muscle stem cells. The evidence presented is compelling and supports the conclusions and the mechanisms by which this gene regulates satellite cell function. These data will be of interest to developmental, transcriptional, and stem cell biologists.

      Public Reviews:

      Reviewer #1 (Public Review):

      This manuscript by Bai et al concerns the expression of Scleraxis (Scx) by muscle satellite cells (SCs) and the role of that gene in regenerative myogenesis. The authors report the expression of this gene associated with tendon development in satellite cells. Genetic deletion of Scx in SCs impairs muscle regeneration, and the authors provide evidence that SCs deficient in Scx are impaired in terms of population growth and cellular differentiation. Overall, this report provides evidence of the role of this gene, unexpectedly, in SC function and adult regenerative myogenesis.

      We appreciate the comments and thank her/him for the support of our manuscript.

      There are a few minor points of concern.

      (1) From the data in Figure 1, it appears that all of the SCs, assessed both in vitro and in vivo, express Scx. The authors refer to a scRNA-seq dataset from their lab and one report from mdx mouse muscle that also reveals this unexpected gene expression pattern. Has this been observed in many other scRNA-seq datasets? If not, it would be important to discuss potential explanations as to why this has not been reported previously.

      Thanks for this question regarding data in Figure 1. We did initially use immunofluorescence staining of Pax7 and GFP on muscle sections and primary myoblast cultures prepared from Tg-ScxGFP mice to conclude that Scx was expressed in satellite cells (SCs). In addition to the cited mdx RNA-seq data, we have included a re-analysis of a published scRNA-seq data set in Figure 2E (Dell'Orso, Juan et al., Development, 2019), and our own scRNA-seq data (Figure S5D, F). We have also re-examined an additional scRNA-seq data set of TA muscles at various regeneration time points (De Micheli et al., Cell Rep. 2020), in which Scx expression was detected in MuSC progenitors and mature muscle cells (in addition to tenocytes). Thus, our immunostaining results are consistent with scRNA-seq data from our and two other independent scRNA-seq data sets.

      We think that Scx expression in the adult myogenic lineage was not previously reported mainly because its expression level was low and might have been dismissed as spurious detection. Additionally, detecting such low expression levels requires sophisticated detection methods with high capture efficiency. Previous studies have noted limitations in transcript capture or transcription factor dropout in 10x Genomics-based datasets (Lambert et al., Cell, 2018; Pokhilko et al., Genome Res., 2021). Alternatively, Scx may simply not have been a focus of prior studies amid other genes of interest. Our specific focus on Scx has led us to evaluate its expression in these data sets. We will add the above cited scRNA-seq data set (De Micheli et al., Cell Rep. 2020) and provide a discussion in the revised version.

      (2) A major point of the paper, as illustrated in Fig. 3, is that Scx-neg SCs fail to produce normal myofibers and renewed SCs following injury/regeneration. They mention in the text that there was no increased PCD by Caspase staining at 5 DPI. A failure of cell survival during the process of SC activation, proliferation, and cell fate determination (differentiation versus self-renewal) would explain most of the in vivo data. As such, this conclusion would seem to warrant a more detailed analysis in terms of at least one or two other time points and an independent method for detecting dead/dying cells (the in vitro data in Fig. 4F is also based on an assessment of activated Caspase to assess cell death). The in vitro data presented later in Fig. S4G, H do suggest an increase in cell loss during proliferative expansion of Scx-neg SCs. To what extent does cell loss (by whatever mechanism of cell death) explain both the in vivo findings of impaired regeneration and even the in vitro studies showing slower population expansion in the absence of Scx?

      We appreciate these constructive suggestions. Additional methods and different time points should be helpful in investigating SC cell loss in ScxcKO. Based on the number of available cKO animals, we will carefully choose additional time point(s) to assess PCD, using anti-active Caspase-3 immunostaining and another independent method (e.g., TUNEL). Although the outcomes are uncertain, we will endeavor to obtain meaningful data from these experiments.

      (3) I'm not sure I understand the description of the data or the conclusions in the section titled "Basement membrane-myofiber interaction in control and Scx cKO mice". Is there something specific to the regeneration from Scx-neg myogenic progenitors, or would these findings be expected in any experimental condition in which myogenesis was significantly delayed, with much smaller fibers in the experimental group at 5 DPI?

      We very much appreciate this comment. We agree that it is unlikely that there is anything specific about the regeneration from Scx-negative myogenic progenitors. Unfilled or empty ghost fibers (basement membrane remnants) are to be expected due to the small fibers and poor regeneration in the ScxcKO mice at 5 dpi. We will correct the subtitle and content accordingly.

      (4) The data presented in Fig. 4B showing differences in the purity of SC populations isolated by FACS depending on the reporter used are interesting and important for the field. The authors offer the explanation of exosomal transfer of Tdt from SCs to non-SCs. The data are consistent with this explanation, but no data are presented to support this. Are there any other explanations that the authors have considered and that could be readily tested?

      Thanks for highlighting this phenomenon. We struggled with the SC purity issue for a long time. The project started with the R26RtdT reporter, chosen because tdT’s strong, paraformaldehyde-resistant (fixation-resistant) fluorescence aids visualization in vivo. Later, when we used the tdT signal to purify SCs by FACS, we found that only 80% of sorted tdT+ cells were Pax7+. We then switched to the R26RYFP reporter, from which we achieved much higher purity (95%) of SCs (Pax7+) by FACS. As such, we also repeated and confirmed many in vivo experimental results using the R26RYFP reporter (included in the manuscript). Due to the low purity of tdT+ SCs by FACS, we discontinued that mouse colony after we confirmed the superior utility of the R26RYFP reporter for SC isolation.

      We sincerely apologize for not being able to conduct further testable experiments on this intriguing phenomenon. However, this issue has since been addressed and published by Murach et al., iScience, (2021). Like our experience, they found non-satellite mononuclear cells with tdT fluorescence after TMX treatment when SCs were isolated via FACS. To confirm that this was not due to off-target recombination or a technical artifact from tissue processing, they conducted extensive analyses. They found that the tdT+ mononuclear cells included fibrogenic cells (fibroblasts and FAPs), immune cells/macrophages, and endothelial cells. Additionally, they confirmed the significant potential of extracellular vesicle (EV)-mediated cargo transfer, which facilitates the transfer of full-length tdT transcript from lineage-marked Pax7+ cells to those mononuclear cells. We will modify our text to include and acknowledge their contribution to this important point.

      (5) The Cut&Run data of Fig. 6 certainly provide evidence of direct Scx targets, especially since the authors used a novel knock-in strain for analyses. The enrichment of E-box motifs provides support for the 207 intersecting genes (scRNA-seq and Cut&Run) being direct targets. However, the rationale elaborated in the final paragraph of the Results section proposing how 4 of these genes account for the phenotypes on the Scx-neg cells and tissues is just speculation, however reasonable. These are not data, and these considerations would be more appropriate in the Discussion in the absence of any validation studies.

      We agree with this comment and will move this speculation into the discussion.

      Reviewer #2 (Public Review):

      Summary:

      Scx is a well-established marker for tenocytes, but the expression in myogenic-lineage cells was unexplored. In this study, the authors performed lineage-trace and scRNA-seq analyses and demonstrated that Scx is expressed in activated SCs. Further, the authors showed that Scx is essential for muscle regeneration using conditional KO mice and identified the target genes of Scx in myogenic cells, which differ from those of tendons.

      Strengths:

      Sometimes, lineage-trace experiments cause mis-expression and do not reflect the endogenous expression of the target gene. In this study, the authors carefully analyzed the unexpected expression of Scx in myogenic cells using some mouse lines and scRNA-seq data.

      We appreciate the comments and thank her/him for noting the strengths of our manuscript.

      Weaknesses:

      Scx protein expression has not been verified.

      We are aware of this weakness. We had previously used Western blotting (WB) using cultured SCs from control and ScxcKO mice, but did not detect endogenous Scx protein in the control. Hence, we used ScxCreERT2 lineage-tracing, Tg-ScxGFP expression, and ScxTy1 knock-in allele as complementary, even though indirect, ways to address this issue. Following the reviewer’s comment, we will purchase new anti-Scx antibodies and re-perform WB using cultured SCs. If the new antibodies fail to detect endogenous Scx by WB, we will then use immunofluorescence staining to detect endogenous Scx protein.

    2. eLife assessment

      This manuscript reports an important finding that the transcription factor Scleraxis regulates regenerative myogenesis by controlling the proliferation and differentiation of muscle stem cells. The evidence presented is compelling and supports the conclusions and the mechanisms by which this gene regulates satellite cell function. These data will be of interest to developmental, transcriptional, and stem cell biologists.

    3. Reviewer #1 (Public Review):

      This manuscript by Bai et al concerns the expression of Scleraxis (Scx) by muscle satellite cells (SCs) and the role of that gene in regenerative myogenesis. The authors report the expression of this gene associated with tendon development in satellite cells. Genetic deletion of Scx in SCs impairs muscle regeneration, and the authors provide evidence that SCs deficient in Scx are impaired in terms of population growth and cellular differentiation. Overall, this report provides evidence of the role of this gene, unexpectedly, in SC function and adult regenerative myogenesis.

      There are a few minor points of concern.

      (1) From the data in Figure 1, it appears that all of the SCs, assessed both in vitro and in vivo, express Scx. The authors refer to a scRNA-seq dataset from their lab and one report from mdx mouse muscle that also reveals this unexpected gene expression pattern. Has this been observed in many other scRNA-seq datasets? If not, it would be important to discuss potential explanations as to why this has not been reported previously.

      (2) A major point of the paper, as illustrated in Fig. 3, is that Scx-neg SCs fail to produce normal myofibers and renewed SCs following injury/regeneration. They mention in the text that there was no increased PCD by Caspase staining at 5 DPI. A failure of cell survival during the process of SC activation, proliferation, and cell fate determination (differentiation versus self-renewal) would explain most of the in vivo data. As such, this conclusion would seem to warrant a more detailed analysis in terms of at least one or two other time points and an independent method for detecting dead/dying cells (the in vitro data in Fig. 4F is also based on an assessment of activated Caspase to assess cell death). The in vitro data presented later in Fig. S4G, H do suggest an increase in cell loss during proliferative expansion of Scx-neg SCs. To what extent does cell loss (by whatever mechanism of cell death) explain both the in vivo findings of impaired regeneration and even the in vitro studies showing slower population expansion in the absence of Scx?

      (3) I'm not sure I understand the description of the data or the conclusions in the section titled "Basement membrane-myofiber interaction in control and Scx cKO mice". Is there something specific to the regeneration from Scx-neg myogenic progenitors, or would these findings be expected in any experimental condition in which myogenesis was significantly delayed, with much smaller fibers in the experimental group at 5 DPI?

      (4) The data presented in Fig. 4B showing differences in the purity of SC populations isolated by FACS depending on the reporter used are interesting and important for the field. The authors offer the explanation of exosomal transfer of Tdt from SCs to non-SCs. The data are consistent with this explanation, but no data are presented to support this. Are there any other explanations that the authors have considered and that could be readily tested?

      (5) The Cut&Run data of Fig. 6 certainly provide evidence of direct Scx targets, especially since the authors used a novel knock-in strain for analyses. The enrichment of E-box motifs provides support for the 207 intersecting genes (scRNA-seq and Cut&Run) being direct targets. However, the rationale elaborated in the final paragraph of the Results section proposing how 4 of these genes account for the phenotypes on the Scx-neg cells and tissues is just speculation, however reasonable. These are not data, and these considerations would be more appropriate in the Discussion in the absence of any validation studies.

    4. Reviewer #2 (Public Review):

      Summary:

      Scx is a well-established marker for tenocytes, but the expression in myogenic-lineage cells was unexplored. In this study, the authors performed lineage-trace and scRNA-seq analyses and demonstrated that Scx is expressed in activated SCs. Further, the authors showed that Scx is essential for muscle regeneration using conditional KO mice and identified the target genes of Scx in myogenic cells, which differ from those of tendons.

      Strengths:

      Sometimes, lineage-trace experiments cause mis-expression and do not reflect the endogenous expression of the target gene. In this study, the authors carefully analyzed the unexpected expression of Scx in myogenic cells using some mouse lines and scRNA-seq data.

      Weaknesses:

      Scx protein expression has not been verified.

    1. eLife assessment

      This article describes a novel mechanism allowing the insect Drosophila to combat enteric pathogens while preserving the beneficial indigenous microbiota. The authors provide compelling evidence that oral infection of Drosophila larvae by pathogenic bacteria activates a valve that traps the intruders in the anterior midgut, allowing them to be killed by antimicrobial peptides. This important work substantially advances our understanding of pathogen clearance in the insect gut.

    2. Reviewer #1 (Public Review):

      Tleiss et al. demonstrate that while commensal Lactiplantibacillus plantarum freely circulates within the intestinal lumen, pathogenic strains such as Erwinia carotovora or Bacillus thuringiensis are blocked in the anterior midgut, where they are rapidly eliminated by antimicrobial peptides. This sequestration of pathogenic bacteria in the anterior midgut requires the Duox enzyme in enterocytes, and both TrpA1 and Dh31 in enteroendocrine cells. This effect induces muscle contraction, which is marked by the formation of TARM structures (thoracic alary-related muscles). This muscle contraction-related blocking happens early after infection (15 min). On the other hand, the clearance of bacteria is carried out by the IMD pathway, possibly through antimicrobial peptide production, while this pathway is dispensable for the blockage. Genetic manipulations impairing bacterial compartmentalization result in abnormal colonization of posterior midgut regions by pathogenic bacteria. Despite a functional IMD pathway, this ectopic colonization leads to bacterial proliferation and larval death, demonstrating the critical role of anterior sequestration of bacteria in larval defense.

      This important work substantially advances our understanding of the process of pathogen clearance by identifying a new mode of pathogen eradication from the insect gut. The evidence supporting the authors' claims is solid but would benefit from more rigorous experiments.

      (1) The authors performed the experiments on Drosophila larvae. I wonder whether this model could extend to adult flies since they have shown that the ROS/TRPA1/Dh31 axis is important for gut muscle contraction in adult flies. If not, how would the authors explain the discrepancy between larvae and adults?

      (2) The authors performed their experiments and proposed the models based on two pathogenic bacteria and one commensal bacterium at a relatively high bacterial dose. They showed that feeding Bt at 2 × 10^10 or Ecc15 at 4 × 10^8 did not induce a blockage phenotype. I wonder whether larvae die under conditions of enteric infection with low concentrations of pathogenic bacteria. If larvae do not show mortality, what is the mechanism for resisting low concentrations of pathogenic bacteria? Why is this model only applied to high-dose infections?

      (3) The authors claim that the blockage of bacteria happens at 15 minutes while killing by AMPs happens 6-8 hours later. What happened during this period? More importantly, is IMD activity induced in the anterior region of the larval gut in both Ecc15 and Bt infection at 6 hours after infection? Are they mostly expressed in the anterior midgut in both bacterial infections? Several papers have shown quite different IMD activity patterns in the Drosophila gut. Zhai et al. have shown that in adult Drosophila, IMD activity was mostly absent in the R2 region as indicated by dpt-lacZ. Vodovar et al. have shown that the expression of dpt-lacZ is observable in proventriculus while Pe is not in the same region. Tzou et al. showed that Ecc15 infection induced IMD activity in the anterior midgut 24 hours after infection. Using TrpA1 and Dh31 mutants, the authors found both Ecc15 and Bt in the posterior midgut. Why are they not evenly distributed along the gut? Last but not least, does the ROS/TrpA1/Dh31 axis affect AMP expression?

      (4) The TARM structure part is quite interesting. However, the authors did not show its relevance in their model. Is this structure the key driving force for the blocking phenotype and killing phenotype? Is the ROS/TrpA1/Dh31 axis required to form this structure?

    3. Reviewer #2 (Public Review):

      This article describes a novel mechanism of host defense in the gut of Drosophila larvae. Pathogenic bacteria trigger the activation of a valve that blocks them in the anterior midgut where they are subjected to the action of antimicrobial peptides. In contrast, beneficial symbiotic bacteria do not activate the contraction of this sphincter, and can access the posterior midgut, a compartment more favorable to bacterial growth.

      Strengths:

      The authors decipher the underlying mechanism of sphincter contraction, revealing that ROS production by Duox activates the release of DH31 by enteroendocrine cells, which in turn stimulates visceral muscle contractions. The use of mutations affecting the Imd pathway or lacking antimicrobial peptides reveals their contribution to pathogen elimination in the anterior midgut.

      Weaknesses:

      - The mechanism allowing the discrimination between commensal and pathogenic bacteria remains unclear.

      - The use of only two pathogens and one symbiotic species may not be sufficient to draw a conclusion on the difference in treatment between pathogenic and symbiotic species.

      - We can also wonder how the process of sphincter contraction is affected by the procedure used in this study, where larvae are starved. Does the sphincter contraction occur in continuous feeding conditions? Since larvae are continuously feeding, is this process physiologically relevant?

    1. eLife assessment

      This study presents valuable findings on the asymmetric connectivity pattern of two types of CA3 pyramidal cell, showing that while athorny cells receive strong inputs from all other cell types, thorny cells receive weaker inputs from athorny neurons. Computational modeling is used to evaluate the impact of this connectivity scheme on the sequential activation of different cell types during sharp wave ripples. The experimental evidence supporting the authors' claims is solid, although improvements to the modelling aspect would strengthen the study.

    2. Reviewer #1 (Public Review):

      Summary:

      Sammons, Masserini et al. examine the connectivity of different types of CA3 pyramidal cells ("thorny" and "athorny"), and how their connectivity putatively contributes to their relative timing in sharp-wave-like activity. First, using patch-clamp recordings, they characterize the degree of connectivity within and between athorny and thorny cells. Based upon these experimental results, they compute a synaptic product matrix, and use this to inform a computational model of CA3 activity. This model finds that this differential connectivity between these populations, augmented by two different types of inhibitory neurons, can account for the relative timing of activity observed in sharp waves in vivo.

      Strengths:

      The patch-clamp experiments are exceptionally thorough and well done. These are very challenging experiments and the authors should be commended for their in-depth characterization of CA3 connectivity.

      Weaknesses:

      (1) The computational elements of this study feel underdeveloped. Whereas the authors do a thorough job experimentally characterizing connections between excitatory neurons, the inhibitory neurons used in the model seem to be effectively "fit neurons" and appear to have been tuned to produce the emergent properties of CA3 sharp wave-like activity. Although I appreciate the goal was to implicate CA3 connectivity contributions to activity timing, a stronger relationship seems like it could be examined. For example, did the authors try to "break" their model? It would be informative if they attempted different synaptic product matrices (say, the juxtaposition of their experimental product matrix) and tested whether experimentally-derived sequential activity could not be elicited. It seems as though this spirit of analysis was examined in Figure 4C, but only insofar as individual connectivity parameters were changed in isolation.

      (2) Additional explanations of how parameters for interneurons were incorporated in the model would be very helpful. As it stands, it is difficult to understand the degree to which the parameters of these neurons are biologically constrained versus used as fit parameters to produce different time windows of activity in types of CA3 pyramidal cells.

    3. Reviewer #2 (Public Review):

      Sharp wave ripples are transient oscillations occurring in the hippocampus that are thought to play an important role in organising temporal sequences during the reactivation of neuronal activity. This study addresses the mechanism by which these temporal sequences are generated in the CA3 region focusing on two different subtypes of pyramidal neurons, thorny and athorny. Using high-quality electrophysiological recordings from up to 8 pyramidal neurons at a time the authors measure the connectivity rates between these pyramidal cell subtypes in a large dataset of 348 cells. This is a significant achievement and provides important data. The most striking finding is how similar connection characteristics are between cell types. There are no differences in synaptic strength or failure rates and some small differences in connectivity rates and short-term plasticity. Using model simulations, the authors explore the implications of the differences in connectivity rates for the temporal specificity of pyramidal cell firing within sharp-wave ripple events. The simulations show that the experimentally observed connectivity rates may contribute to the previously observed temporal sequence of pyramidal cell firing during sharp wave ripples.

      The conclusions drawn from the simulations are not experimentally tested so remain theoretical. In the simple network model, the authors include basket cell and anti-SWR interneurons but the connectivity of these cell types is not measured experimentally and variations in interneuron parameters may also influence temporal specificity of firing. In addition, the influence of short-term plasticity measured in their experiments is not tested in the model. Interestingly, the experimental data reveal a large variability in many of the measured parameters. This may strongly influence the firing of pyramidal cells during SWRs but it is not represented within the model which uses the averaged data.

    4. Reviewer #3 (Public Review):

      Summary:

      The hippocampal CA3 region is generally considered to be the primary site of initiation of sharp wave ripples, highly synchronous population events involved in learning and memory, although the precise mechanism remains elusive. A recent study revealed that CA3 comprises two distinct pyramidal cell populations: thorny cells that receive mossy fiber input from the dentate gyrus, and athorny cells that do not. That study also showed that it is athorny cells in particular that play a key role in sharp wave initiation. In the present work, Sammons, Masserini, and colleagues expand on this by examining the connectivity probabilities among and between thorny and athorny cells. First, using whole-cell patch clamp recordings, they find an asymmetrical connectivity pattern, with athorny cells receiving the most synaptic connections from both athorny and thorny cells, and thorny cells receiving fewer. They then demonstrate in spiking neural network simulations how this asymmetrical connectivity may underlie the preferential role of athorny cells in sharp wave initiation.

      Strengths:

      The authors provide independent validation of some of the findings by Hunt et al. (2018) concerning the distinction between thorny and athorny pyramidal cells in CA3 and advance our understanding of their differential integration in CA3 microcircuits. The properties of excitatory connections among and between thorny and athorny cells described by the authors will be key in understanding CA3 functions including, but not limited to, sharp wave initiation.

      As stated in the paper, the modeling results lend support to the idea that the increased excitatory connectivity towards athorny cells plays a key role in causing them to fire before thorny cells in sharp waves. More generally, the model adds to an expanding pool of models of sharp wave ripples which should prove useful in guiding and interpreting experimental research.

      Weaknesses:

      The mechanism by which athorny cells initiate sharp waves in the model is somewhat confusingly described. As far as I understood, random fluctuations in the activities of A and B neurons provide windows of opportunity for pyramidal cells to fire if they have additionally recovered from adaptive currents. Thorny and athorny pyramidal cells are then set in a winner-takes-all competition which is quickly won by the athorny cells. The main thesis of the paper seems to be that athorny cells win this competition because they receive more inputs both from themselves and from thorny cells, hence, the connectivity "underlies the sequential activation". However, it is also stated that athorny cells activate first due to their lower rheobase and steeper f-I curve, and it is also indicated in the methods that athorny (but not thorny) cells fire in bursts. It seems that it is primarily these features that make them fire first, something which apparently happens even when the A to A connectivity is set to 0, albeit with a very small lag. Perhaps the authors could further clarify the differential role of single cell and network parameters in determining the sequential activation of athorny and thorny cells. Is the role of asymmetric excitatory connectivity only to enhance the initial intrinsic advantage of athorny cells? If so, could this advantage also be enhanced in other ways?

      Although a clear effort has been made to constrain the model with biological data, too many degrees of freedom remain that allow the modeler to make arbitrary decisions. This is not a problem in itself, but perhaps the authors could explain more of their reasoning and expand upon the differences between their modeling choices and those of others. For example, what are the conceptual or practical advantages of using adaptation in pyramidal neurons as opposed to short-term synaptic plasticity as in the model by Hunt et al.? Relatedly, what experimental observations could validate or falsify the proposed mechanisms?

      In the data by Hunt et al., thorny cells have a higher baseline (non-SPW) firing rate, and it is claimed that it is actually stochastic correlations in their firing that are amplified by athorny cells to initiate sharp waves. However, in the current model, the firing of both types of pyramidal cells outside of ripples appears to be essentially zero. Can the model handle more realistic firing rates as described by Hunt et al., or as produced by e.g., walking around an environment tiled with place cells, or would that trigger SPWs continuously?

    1. eLife assessment

      This important work describes the activation of astrocytes via the nuclear translocation of PKM2 in an animal model of multiple sclerosis. This study provides solid evidence of the interaction between TRIM21 and PKM2 as the crucial molecular event leading to the translocation of PKM2 and a metabolic shift towards glycolysis dominance, fostering proliferation in stimulated astrocytes. This finding is significant as it underscores the potential of targeting glycolytic metabolism to mitigate neurological diseases mediated by astrocytes, offering a strong rationale for potential therapeutic interventions. However, control experiments and imaging analyses with higher magnification images should be performed to better support the main claims of the study.

    1. eLife assessment

      This study reports an important discovery highlighting the essential role of the putative ion channel, TMC7, in acrosome formation during sperm development and thus male fertility. The evidence for the requirement of TMC7 in acrosome biogenesis and sperm function is convincing, although its function as an ion channel remains to be further determined. Overall, this work will be of great interest to developmental biologists and ion channel physiologists alike.

    2. Reviewer #1 (Public Review):

      Summary:

      TMC7 knockout mice were generated by the authors and the phenotype was analyzed. They found that Tmc7 is localized to Golgi and is needed for acrosome biogenesis.

      Strengths:

      The phenotype of infertility is clear, and the results of TMC7 localization and the failed acrosome formation are highly reliable. In this respect, they made a significant discovery regarding spermatogenesis.

      In the original version, I pointed out the gap between their pH/calcium imaging data and the hypothesis of ion channel function of TMC7 in the Golgi. The authors now agree and have made the description more reasonable. Additional experiments were also performed, and I can say that they have answered my concern adequately.

      It would be good to add any presumed mechanism for the observed changes in cytoplasmic pH and calcium concentration.

    3. Reviewer #2 (Public Review):

      Summary:

      This study presents a significant finding that enhances our understanding of spermatogenesis. TMC7 belongs to a family of transmembrane channel-like proteins (TMC1-8), primarily known for their role in the ear. Mutations to TMC1/2 are linked to deafness in humans and mice and were originally characterized as auditory mechanosensitive ion channels. However, the function of the other TMC family members remains poorly characterized. In this study, the authors begin to elucidate the function of TMC7 in acrosome biogenesis during spermatogenesis. Through analysis of transcriptomics datasets, they identify elevated levels of TMC7 in round spermatids in both mouse and human testis. They then generate Tmc7-/- mice and find that male mice exhibit smaller testes and complete infertility. Examination of different developmental stages reveals spermatogenesis defects, including reduced sperm count, elongated spermatids and large vacuoles. Additionally, abnormal acrosome morphology is observed beginning at the early-stage Golgi phase, indicating TMC7's involvement in proacrosomal vesicle trafficking and fusion. They observed localization of TMC7 in the cis-Golgi and suggest that its presence is required for maintaining Golgi integrity, with Tmc7-/- leading to reduced intracellular Ca2+, elevated pH and increased ROS levels, likely resulting in spermatid apoptosis. Overall, the work delineates a new function of TMC7 in spermatogenesis and the authors propose that its ion channel and/or scramblase activity is likely important for Golgi homeostasis. This work is of significant interest to the community and is of high quality.

      Strengths:

      The biggest strength of the paper is the phenotypic characterization of the TMC7-/- mouse model, which has clear acrosome biogenesis/spermatogenesis defects. This is the main claim of the paper and it is supported with the data that are presented.

      Weaknesses:

      It isn't clear whether TMC7 functions as an ion channel from the current data presented in this paper, but the authors are careful in their interpretation and present this merely as a hypothesis supporting this idea.

    4. Reviewer #3 (Public Review):

      Summary:

      In this study, Wang et al. have demonstrated that TMC7, a testis-enriched multipass transmembrane protein, is essential for male reproduction in mice. Tmc7 KO male mice are sterile due to reduced sperm count and abnormal sperm morphology. TMC7 co-localizes with GM130, a cis-Golgi marker, in round spermatids. The absence of TMC7 results in reduced levels of Golgi proteins, elevated abundance of ER stress markers, as well as changes of Ca2+ and pH levels in the KO testis. However, further confirmation is required because the analyses were performed with whole testis samples in spite of the differences in the germ cell composition in WT and KO testis. In addition, the causal relationships between the reported anomalies await thorough interrogation.

      Strengths:

      By using PD21 testes, the revised assays have consolidated that depletion of TMC7 leads to a reduced level of Ca2+ and an elevated level of ROS in the male germ cells. The immunohistochemistry analyses have clearly indicated the reduced abundance of GM130, P115, and GRASP65 in the knockout testis.

      Weaknesses:

      The Discussion section contains sentences reiterating the Introduction and Results of this manuscript (e.g., Lines 79-85 and 231-236; Lines 175-179 and 259-263). These read as repetitive and can be removed.

      Future studies are required to decipher how TMC7 stabilizes Golgi structure, coordinates vesicle transport, and maintains the germ cell homeostasis.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      TMC7 knockout mice were generated by the authors and the phenotype was analyzed. They found that Tmc7 is localized to Golgi and is needed for acrosome biogenesis.

      Strengths:

      The phenotype of infertility is clear, and the results of TMC7 localization and the failed acrosome formation are highly reliable. In this respect, they made a significant discovery regarding spermatogenesis.

      Weaknesses:

      There are also some concerns, which are mainly related to the molecular function of TMC7 and Figure 5.

      (1) It is understandable that TMC7 exhibits some channel activity in the Golgi and somehow affects luminal pH or Ca2+, leading to the failure of acrosome formation. On the other hand, since they are conducting the pH and calcium imaging from the cytoplasm, I do not think that the effect of TMC7 channel function in Golgi is detectable with their methods.

      We agree with the reviewer that there is no direct evidence showing the effect of TMC7 channel function in the Golgi. We have changed the description in the revised manuscript.

      (2) Rather, it is more likely that they are detecting apoptotic cells that have no longer normal ion homeostasis.

      We thank the reviewer for raising this concern. We apologize for not labeling the postnatal stage in the original Figure 5. We measured intracellular Ca2+, pH and ROS in PD30 testes (revised Fig. S6a-c); no apoptotic cells were observed at this stage (revised Fig. S6e, f). Apoptotic cells were found in the seminiferous tubules and cauda epididymis of 9-week-old Tmc7–/– mice (revised Fig. 5e-f). We have included TUNEL data from testes of PD21, PD30 and 9-week-old mice (revised Fig. 5e, f and Fig. S6e, f). In accordance with our findings, Tmc1 mutation has also been shown to result in reduced Ca2+ permeability, thus triggering hair cell apoptosis (Fettiplace, R, PNAS. 2022) [1].

      (3) Another concern is that n is only 3 for these imaging experiments.

      As suggested by the reviewer, more replicates were included in imaging experiments.

      Reviewer #2 (Public Review):

      Summary:

      This study presents a significant finding that enhances our understanding of spermatogenesis. TMC7 belongs to a family of transmembrane channel-like proteins (TMC1-8), primarily known for their role in the ear. Mutations to TMC1/2 are linked to deafness in humans and mice and were originally characterized as auditory mechanosensitive ion channels. However, the function of the other TMC family members remains poorly characterized. In this study, the authors begin to elucidate the function of TMC7 in acrosome biogenesis during spermatogenesis. Through analysis of transcriptomics datasets, they identify TMC7 as a transmembrane channel-like protein with elevated transcript levels in round spermatids in both mouse and human testis. They then generate Tmc7-/- mice and find that male mice exhibit smaller testes and complete infertility. Examination of different developmental stages reveals spermatogenesis defects, including reduced sperm count, elongated spermatids, and large vacuoles. Additionally, abnormal acrosome morphology is observed beginning at the early-stage Golgi phase, indicating TMC7's involvement in proacrosomal vesicle trafficking and fusion. They observed localization of TMC7 in the cis-Golgi and suggest that its presence is required for maintaining Golgi integrity, with Tmc7-/- leading to reduced intracellular Ca2+, elevated pH, and increased ROS levels, likely resulting in spermatid apoptosis. Overall, the work delineates a new function of TMC7 in spermatogenesis and the authors suggest that its ion channel activity is likely important for Golgi homeostasis. This work is of significant interest to the community and is of high quality.

      Strengths:

      The biggest strength of the paper is the phenotypic characterization of the TMC7-/- mouse model, which has clear acrosome biogenesis/spermatogenesis defects. This is the main claim of the paper and it is supported by the data that are presented.

      Weaknesses:

      The claim is that TMC7 functions as an ion channel. It is reasonable to assume this given what has been previously published on the more well-characterized TMCs (TMC1/2), but the data supporting this is preliminary here, and more needs to be done to solidify this hypothesis. The authors are careful in their interpretation and present this merely as a hypothesis supporting this idea.

      We appreciate the insightful comment. It is indeed a limitation of our study that we lack strong evidence that TMC7 functions as an ion channel. We planned to conduct cellular electrophysiology in GC-1 cells heterologously expressing TMC7. However, TMC7 was trapped in the endoplasmic reticulum, like TMC1 and TMC2 (Yu X, PNAS. 2020)[2], and failed to localize to the Golgi. Following the reviewer’s suggestion, we have provided a more careful and detailed interpretation of the molecular function of TMC7 in the revised manuscript.

      Reviewer #3 (Public Review):

      Summary:

      In this study, Wang et al. have demonstrated that TMC7, a testis-enriched multipass transmembrane protein, is essential for male reproduction in mice. Tmc7 KO male mice are sterile due to reduced sperm count and abnormal sperm morphology. TMC7 co-localizes with GM130, a cis-Golgi marker, in round spermatids. The absence of TMC7 results in reduced levels of Golgi proteins, elevated abundance of ER stress markers, as well as changes of Ca2+ and pH levels in the KO testis. However, further confirmation is required because the analyses were performed with whole testis samples in spite of the differences in the germ cell composition in WT and KO testis. In addition, the causal relationships between the reported anomalies await thorough interrogation.

      Strengths:

      The microscopic images are of great quality, all figures are properly arranged, and the entire manuscript is very easy to follow.

      Weaknesses:

      (1) Tmc7 KO male mice show multiple anomalies in sperm production and morphogenesis, such as reduced sperm count, abnormal sperm head, and deformed midpiece. Thus, it is confusing that the authors focused solely on impaired acrosome biogenesis.

      We are grateful for your comments and suggestions. We agree and have added these defects in spermiogenesis of Tmc7–/– mice to the abstract and discussion sections of the revised manuscript.

      (2) Further investigations are warranted to determine whether the abnormalities reported in this manuscript (e.g., changes in protein, Ca2+, and pH levels) are directly associated with the molecular function of TMC7 or are the byproducts of partially arrested spermiogenesis. Please find additional comments in "Recommendations for the authors".

      Thank you for raising this concern. Per your comments, we have included data of intracellular Ca2+, pH and ROS in PD21 testes. The intracellular homeostasis was impaired as early as PD21, indicating TMC7 depletion impairs cellular homeostasis which in turn results in arrested spermiogenesis.

      Recommendations for the authors:

      Reviewing Editor (Recommendations For The Authors):

      As noted by all three reviewers, current flow cytometry data does not necessarily support the 'ion channel' hypothesis, thus the phenotypic analysis is compelling but the molecular mechanism of how TMC7 facilitates acrosome biogenesis remains incomplete. It is highly recommended for the authors to at least discuss or test alternative hypotheses (as reviewer #2 suggested) such as the possibility of acting as 'lipid scramblase'. Also, the authors need to provide further explanation for other morphological defects if TMC7 is truly a functional ion channel in Golgi (and thus later at acrosome), which is also related to the key question of whether TMC7 is a functional ion channel.

      We thank the reviewing editor for the comments and suggestions. We agree that our study lacks strong evidence that TMC7 functions as an ion channel. We have discussed the possibility of TMC7 acting as a 'lipid scramblase' as suggested. We have also included data on intracellular Ca2+, pH and ROS in PD21 and PD30 testes.

      Indeed, Tmc7–/– mice exhibit other defects, including abnormal head morphology and disorganized mitochondrial sheaths. TMC7 is localized to the cis-Golgi apparatus and is required for maintaining Golgi integrity. Previous studies of Golgi-localized proteins, including GOPC (Yao R, PNAS. 2002)[2], HRB (Kang-Decker N. Science. 2001)[3] and PICK1 (Xiao N, JCI. 2009)[4], reported spermiogenesis defects similar to those of Tmc7–/– mice. It is therefore possible that the morphological defects in Tmc7–/– mice are due to impaired Golgi function.

      Reviewer #1 (Recommendations For The Authors):

      (1) The authors should provide more details about the imaging experiments using FACS. Since they only describe catalog numbers (Beyotime, S1056, S1006, S0033S) for imaging reagents, it is not immediately clear what reagents they actually used. Since they used Fluo3, BCECF, and DCFH, it would be better to mention their names.

      Thanks. We have provided more detailed antibody information as suggested.

      (2) I am also concerned that in the FACS there is no information at all about laser wavelength and filter properties. This is especially important for BCECF because the wavelength spectrum changes with pH. Also, if there are any positive controls for these imaging reagents, such as ionophores, it would be more convincing to include them.

      Thank you for your comment. The excitation wavelength was 488 nm for detecting Ca2+, pH and ROS by FACS. BCECF is the most popular probe for monitoring cellular pH, and the reagent from Beyotime (S1006) has been used in other studies (Chen S, Blood. 2016)[5], (Liu H, Cell Death Dis. 2022)[6]. To make the results more reliable, we have repeated these experiments in PD21 testes (revised Figure 5a-c). No positive controls for these reagents were used in our experiments.

      (3) As noted above, it is better to avoid directly linking the cell's abnormal ion homeostasis to TMC7 ion channel function in the text. The discussion should be changed to emphasize that the TMC7-deficient cells are apoptotic and that these physiological phenomena are occurring as a side effect of this apoptosis.

      Thank you for raising this concern. We agree with the reviewer that there is no direct evidence showing the effect of TMC7 channel function in the Golgi, and we have changed the description in the revised manuscript.

      We performed a new experiment to measure apoptosis and intracellular Ca2+, pH and ROS in PD21 testes. No apoptotic cells were observed at this stage. However, impaired cellular homeostasis was still found in the testes of PD21 Tmc7-/- mice. These data suggest that TMC7 depletion impairs cellular homeostasis and hence induces spermatid apoptosis.

      (4) While I understand that it appears to be difficult to experimentally verify the ion channel function of TMC7, it may be supportive to compare its amino acid sequence and/or 3D predicted structure with that of TMC1/2. Including a supplemental figure for this purpose would emphasize the possibility that TMC7 functions as an ion channel.

      We thank the reviewer for this great suggestion. We compared the amino acid sequence and predicted structure of TMC7 with those of TMC1 and TMC2. TMC1 had 81% sequence similarity with TMC7, with an RMSD (root mean square deviation) of 3.079. TMC2 had 82% sequence similarity with TMC7, with an RMSD of 2.176. These data suggest that TMC7 has an amino acid sequence and predicted structure similar to those of TMC1/2 and might function as an ion channel. We have included the predicted structures in revised Fig. S7.

      Author response image 1.

      Reviewer #2 (Recommendations For The Authors):

      I do not have any experimental comments or concerns to address, but I do ask that the authors consider an alternative hypothesis. Based on prior data demonstrating that TMC1 is a mechanosensitive ion channel, the authors reasonably assume that TMC7 may also function as an ion channel. Although the authors observe alterations in cytosolic Ca2+ and pH upon loss of TMC7 by flow cytometry, which begins to support this hypothesis, these data do not directly demonstrate ion channel activity.

      I was wondering if the authors had considered whether TMC7 could also function as a lipid scramblase. TMC1 has also been proposed to function as a Ca2+-inhibited scramblase, where knockout of TMC1 leads to a loss of phosphatidylserine (PS) exposure and membrane blebbing at the apical region of hair cells (Ballesteros, A. and Swartz, K., Science Advances, 2022). Furthermore, TMC proteins are structurally related to the Anoctamin/TMEM16 family of chloride channels and lipid scramblases, where TMEM16A-B are bona fide Ca2+-activated chloride channels, and TMEM16C-H are characterized as Ca2+-dependent scramblases. Based on their structural similarity and the observation that TMC1 may also exhibit lipid scrambling properties based on the PS exposure, I wonder if the authors may have data that support a TMC7 scramblase hypothesis. I was intrigued by this idea, especially given the authors' observations of large vacuoles in the seminiferous tubules and cauda epididymis and the vesicle accumulation phenotype in their TEM data. Incorporating this hypothesis into the discussion section, at minimum, could provide a valuable perspective, and this line of thought may lead to interesting data interpretation throughout the paper.

      We thank the reviewer for the valuable suggestion. We have discussed the possibility that TMC7 acts as a lipid scramblase, as suggested.

      Reviewer #3 (Recommendations For The Authors):

      (1) Gene symbols should be italicized, and protein symbols should be capitalized.

      Thanks. We have made changes to the manuscript as recommended.

      (2) Tmc7 KO males show reduced sperm count, which alters the germ cell composition in the testis (Figure 2g). Thus, it is inappropriate to compare protein levels using whole testis lysates (Figure 3e, 4h, 5d, 5f). Instead, the same immunoblotting analyses could be done with purified round spermatids or 3-wk-old testis. Likewise, the significance of the intracellular Ca2+ and pH measurements is potentially diminished by the differences in the germ cell composition in WT and KO mice.

      We appreciate this constructive suggestion. We agree with the reviewer that whole testis lysates diminished the differences between WT and Tmc7-/- mice. However, we are unable to purify round spermatids due to the lack of specific markers.

      (3) Figures 2i, 2j: How sperm motility was measured should be specified in the Methods.

      Thank you for this important reminder. We have added the sperm motility assessment to the Methods section.

      (4) Figure 4g: It does not make sense to compare the fluorescence intensity of these proteins without making sure that the seminiferous tubules are in the same stage. As shown in Figures S5a and S5b, TMC7 exhibits varied abundance in spermatids at different steps.

      We thank the reviewer for the insightful comment. We have replaced the images with ones from seminiferous tubules at the same stage and compared the fluorescence intensity in the new images as suggested.

      (5) Figure 4h: How were the band intensities measured? The third band from the left is visually stronger than the first one, but it does not seem to be so according to the column graph. The reviewer measured the intensity of GRASP65 bands relative to alpha-tubulin by ImageJ and obtained relative intensities of 0.35, 0.87, 0.6, and 0.08 for the bands from left to right. Additional replicates of the western blots should be included in the supplementary figures.

      Thank you for this insightful comment. The density and size of the blots were quantified with ImageJ. We have checked the first GRASP65 band from the left, and it seems that the protein was not fully transferred onto the PVDF membrane. We have performed new experiments and replaced the original bands (Revised Fig. 4h). Additional replicates of the western blots have been included in revised Fig. S8.

      (6) Figures 5a, 5b: Based on the observation of abnormal intracellular Ca2+ and pH levels in the KO germ cells, the authors concluded that TMC7 maintains the homeostasis of Golgi pH and ion (Lines 223-224, 263-264). However, intracellular Ca2+ and pH levels do not directly reflect those in the Golgi apparatus.

      We thank the reviewer for this important comment. We agree and have changed “Golgi” to “intracellular” as suggested.

      (7) Figure 5c: ROS is produced during apoptosis. Thus, it is not appropriate to conclude that the increased ROS levels in Tmc7 KO germ cells lead to apoptosis.

      Following the reviewer’s comment, we measured ROS and apoptosis in the testes of PD21 and PD30 mice. ROS levels were increased, but no apoptotic cells were observed in the testes of PD21 and PD30 Tmc7–/– mice. Apoptotic cells were observed in the testes of 9-week-old Tmc7–/– mice (Revised Fig. 5e-f). These data suggest that TMC7 depletion results in the accumulation of ROS, which in turn leads to apoptosis.

      (1) Fettiplace, R., D.N. Furness, and M. Beurg, The conductance and organization of the TMC1-containing mechanotransducer channel complex in auditory hair cells. Proc Natl Acad Sci U S A, 2022. 119(41): p. e2210849119.

      (2) Yu, X., et al., Deafness mutation D572N of TMC1 destabilizes TMC1 expression by disrupting LHFPL5 binding. Proc Natl Acad Sci U S A, 2020. 117(47): p. 29894-29903.

      (3) Kang-Decker, N., et al., Lack of acrosome formation in Hrb-deficient mice. Science, 2001. 294(5546): p. 1531-3.

      (4) Xiao, N., et al., PICK1 deficiency causes male infertility in mice by disrupting acrosome formation. J Clin Invest, 2009. 119(4): p. 802-12.

      (5) Chen, S., et al., Sympathetic stimulation facilitates thrombopoiesis by promoting megakaryocyte adhesion, migration, and proplatelet formation. Blood, 2016. 127(8): p. 1024-35.

      (6) Liu, H., et al., PRMT5 critically mediates TMAO-induced inflammatory response in vascular smooth muscle cells. Cell Death Dis, 2022. 13(4): p. 299.

    1. eLife assessment

      This manuscript reports valuable findings on the role of the Srs2 protein in turning off the DNA damage signaling response initiated by Mec1 (human ATR) kinase. The data provide solid evidence that Srs2 interaction with PCNA and ensuing SUMO modification is required for checkpoint downregulation. However, experimental evidence with regard to the model that Srs2 acts at gaps after camptothecin-induced DNA damage is currently lacking. The work will be of interest to cell biologists studying genome integrity but would be strengthened by considering the possible role of Rad51 and its removal.

    2. Reviewer #1 (Public Review):

      Overall, the data presented in this manuscript is of good quality. Understanding how cells control RPA loading on ssDNA is crucial to understanding DNA damage responses and genome maintenance mechanisms. The authors used genetic approaches to show that disrupting PCNA binding and SUMOylation of Srs2 can rescue the CPT sensitivity of rfa1 mutants with reduced affinity for ssDNA. In addition, the authors find that SUMOylation of Srs2 depends on binding to PCNA and the presence of Mec1.

      Noted weaknesses include the lack of evidence supporting that Srs2 binding to PCNA and its SUMOylation occur at ssDNA gaps, as proposed by the authors. Also, the mutants of Srs2 with impaired binding to PCNA or impaired SUMOylation showed no clear defects in checkpoint dampening, and in some contexts, even resulted in decreased Rad53 activation. Therefore, key parts of the paper would benefit from further experimentation and/or clarification.

      Major Comments

      (1) The central model proposed by the authors relies on the loading of PCNA at the 3' junction of an ssDNA gap, which then mediates Srs2 recruitment and RPA removal. While several aspects of the model are consistent with the data, the evidence that it is occurring at ssDNA gaps is not strong. The experiments mainly used CPT, which generates mostly DSBs. The few experiments using MMS, which mostly generates ssDNA gaps, show that Srs2 mutants lead to weaker rescue in this context (Figure S1). How do the authors explain this discrepancy? In the context of DSBs, are the authors proposing that Srs2 is engaging at later steps of HR-driven DSB repair where PCNA gets loaded to promote fill-in synthesis? If so, is RPA removal at that step important for checkpoint dampening? These issues need to be addressed and the final model adjusted.

      (2) The data in Figure 3 showing that Srs2 mutants reduce Rad53 activation in the rfa1-zm2 mutant are confusing, especially given the claim of an anti-checkpoint function for Srs2 (in which case Srs2 mutants should result in increased Rad53 activation). The authors propose that Rad53 is hyperactivated in rfa1-zm2 mutant because of compromised ssDNA protection and consequential DNA lesions, however, the effects sharply contrast with the central model. Are the authors proposing that in the rfa1-zm2 mutant, the compromised protection of ssDNA supersedes the checkpoint-dampening effect? Perhaps a schematic should be included in Figure 3 to depict these complexities and help the reader. The schematic could also include the compensatory dampening mechanisms like Slx4 (on that note, why not move Figure S2 to a main figure?... and even expand experiments to better characterize the compensatory mechanisms, which seem important to help understand the lack of checkpoint dampening effect in the Srs2 mutants)

      (3) The authors should demarcate the region used for quantifying the G1 population in Figure 3B and explain the following discrepancy: By inspection of the cell cycle graph, all mutants have lower G1 peak height compared to WT (CPT 2h). However, in the quantification bar graph at the bottom, ΔPIM has higher G1 population than the WT.

    3. Reviewer #2 (Public Review):

      Summary:

      This is an interesting paper that delves into the post-translational modifications of the yeast Srs2 helicase and proteins with which it interacts in coping with DNA damage. The authors use mutants in some interaction domains with RPA and Srs2 to argue for a model in which there is a balance between RPA binding to ssDNA and Srs2's removal of RPA. The idea that a checkpoint is being regulated is based on observing Rad53 and Rad9 phosphorylation (so there are the attributes of a checkpoint), but evidence of cell cycle arrest is lacking. The only apparent delay in the cell cycle is the re-entry into the second S phase (but it could be an exit from G2/M); but in any case, the wild-type cells enter the next cell cycle most rapidly. No direct measurement of RPA residence is presented.

      Strengths:

      Data concern viability assays in the presence of camptothecin and in the post-translational modifications of Srs2 and other proteins.

      Weaknesses:

      There are a couple of overriding questions about the results, which appear technically excellent. Clearly, there is an Srs2-dependent repair process here, in the presence of camptothecin, but is it a consequence of replication fork stalling or chromosome breakage? Is repair Rad51-dependent, and if so, is Srs2 displacing RPA or removing Rad51 or both? If RPA is removed quickly what takes its place, and will the removal of RPA result in lower DDC1-MEC1 signaling?

      Moreover, It is worth noting that in single-strand annealing, which is ostensibly Rad51 independent, a defect in completing repair and assuring viability is Srs2-dependent, but this defect is suppressed by deleting Rad51. Does deleting Rad51 have an effect here?

      Neither this paper nor the preceding one makes clear what really is the consequence of having a weaker-binding Rfa1 mutant. Is DSB repair altered? Neither CPT nor MMS are necessarily good substitutes for some true DSB assay.

      With camptothecin, in the absence of site-specific damage, it is difficult to test these questions directly. (Perhaps there is a way to assess the total amount of RPA bound, but ongoing replication may obscure such a measurement). It should be possible to assess how CPT treatment in various genetic backgrounds affects the duration of Mec1/Rad53-dependent checkpoint arrest, but more than a FACS profile would be required.

      It is also notable that MMS treatment does not seem to yield similar results (Fig. S1).

    4. Reviewer #3 (Public Review):

      The superfamily I 3'-5' DNA helicase Srs2 is well known for its role as an anti-recombinase, stripping Rad51 from ssDNA, as well as an anti-crossover factor, dissociating extended D-loops and favoring non-crossover outcome during recombination. In addition, Srs2 plays a key role in ribonucleotide excision repair. Besides DNA repair defects, srs2 mutants also show a reduced recovery after DNA damage that is related to its role in downregulating the DNA damage signaling or checkpoint response. Recent work from the Zhao laboratory (PMID: 33602817) identified a role of Srs2 in downregulating the DNA damage signaling response by removing RPA from ssDNA. This manuscript reports further mechanistic insights into the signaling downregulation function of Srs2.

      Using the genetic interaction with mutations in RPA1, mainly rfa1-zm2, the authors test a panel of mutations in Srs2 that affect CDK sites (srs2-7AV), potential Mec1 sites (srs2-2SA), known sumoylation sites (srs2-3KR), Rad51 binding (delta 875-902), PCNA interaction (delta 1159-1163), and SUMO interaction (srs2-SIMmut). All mutants were generated by genomic replacement and the expression level of the mutant proteins was found to be unchanged. This alleviates some concern about the use of deletion mutants compared to point mutations. The double mutant analysis identified that PCNA interaction and SUMO sites were required for the Srs2 checkpoint dampening function, at least in the context of the rfa1-zm2 mutant. There was no effect of these mutants in a RFA1 wild-type background. This latter result is likely explained by the activity of the parallel pathway of checkpoint dampening mediated by Slx4, and genetic data with an Slx4 point mutation affecting Rtt107 interaction and checkpoint downregulation support this notion. Further analysis of Srs2 sumoylation showed that Srs2 sumoylation depended on PCNA interaction, suggesting sequential events of Srs2 recruitment by PCNA and subsequent sumoylation. Kinetic analysis showed that sumoylation peaks after maximal Mec1 induction by DNA damage (using the Top1 poison camptothecin (CPT)) and depended on Mec1. These data are consistent with a model that Mec1 hyperactivation is ultimately leading to signaling downregulation by Srs2 through Srs2 sumoylation. Mec1-S1964 phosphorylation, a marker for Mec1 hyperactivation and a site found to be needed for checkpoint downregulation after DSB induction did not appear to be involved in checkpoint downregulation after CPT damage. The data are in support of the model that Mec1 hyperactivation when targeted to RPA-covered ssDNA by its Ddc2 (human ATRIP) targeting factor, favors Srs2 sumoylation after Srs2 recruitment to PCNA to disrupt the RPA-Ddc2-Mec1 signaling complex. Presumably, this allows gap filling and disappearance of long-lived ssDNA as the initiator of checkpoint signaling, although the study does not extend to this step.

      Strengths

      (1) The manuscript focuses on the novel function of Srs2 to downregulate the DNA damage signaling response and provide new mechanistic insights.

      (2) The conclusions that PCNA interaction and ensuing Srs2-sumoylation are involved in checkpoint downregulation are well supported by the data.

      Weaknesses

      (1) Additional mutants of interest could have been tested, such as the recently reported Pin mutant, srs2-Y775A (PMID: 38065943), and the Rad51 interaction point mutant, srs2-F891A (PMID: 31142613).

      (2) The use of deletion mutants for PCNA and RAD51 interaction is inferior to using specific point mutants, as done for the SUMO interaction and the sites for post-translational modifications.

      (3) Figure 4D and Figure 5A report data with standard deviations, which is unusual for n=2. Maybe the individual data points could be plotted with a color for each independent experiment to allow the reader to evaluate the reproducibility of the results.

    5. Author response:

      eLife assessment:

      This manuscript reports valuable findings on the role of the Srs2 protein in turning off the DNA damage signaling response initiated by Mec1 (human ATR) kinase. The data provide solid evidence that Srs2 interaction with PCNA and ensuing SUMO modification is required for checkpoint downregulation. However, experimental evidence with regard to the model that Srs2 acts at gaps after camptothecin-induced DNA damage is currently lacking. The work will be of interest to cell biologists studying genome integrity but would be strengthened by considering the possible role of Rad51 and its removal. 

      We appreciate the editors and the reviewers for providing evaluation and helpful comments. As detailed below, we plan to adjust the writing and figures to address the points raised by the reviewers. We believe that these changes will improve the clarity of the work. Below is a summary of our plan to address the two main criticisms.

      (1) Regarding the sites of Srs2 action, our data support the conclusion that Srs2 removal of RPA is favored at a subset of ssDNA regions that have proximal PCNA, but not at sites lacking PCNA. A logical supposition is that the former type of ssDNA region includes ssDNA gaps and tails generated during DNA repair or replication, wherein PCNA can be loaded at the ssDNA-dsDNA junction with a 3’ DNA end. Examples of the latter type, ssDNA regions without proximal PCNA, can form within negatively supercoiled regions or intact R-loops, both of which lack a 3’ DNA end for PCNA loading. While we have stated this conclusion in the text, we highlighted ssDNA gaps as sites of Srs2 action in the Discussion and in the model figure, which could be misleading. We will clarify our model, that is, Srs2 distinguishes among different types of ssDNA regions using PCNA proximity as a guide for RPA removal, and state that the precise nature of Srs2 action sites remains to be determined. Regardless, the feature of Srs2 revealed in this work provides a rationale for how it can remove RPA at subsets of ssDNA regions without unnecessary stripping of RPA at other sites.

      (2) While Rad51 removal is an important facet of Srs2 functions, it is not relevant to our current study based on the following observations and rationales.

      First, we have provided several lines of evidence to support the conclusion that Rad51 removal by Srs2 is separable from the Srs2-RPA antagonism (Dhingra et al., 2021). For example, while rad51∆ rescues the hyper-recombination phenotype of srs2∆ cells, it does not affect the hyper-checkpoint phenotype of srs2∆. Strikingly, rfa1-zm1/zm2 have the opposite effect. The differential effects of rad51∆ and rfa1-zm1/zm2 were also seen for the ATPase-dead allele srs2-K41A. For example, rfa1-zm2 rescued the hyper-checkpoint defect and the CPT sensitivity of srs2-K41A, while rad51∆ had neither effect.

      These and other data described in Dhingra et al suggest that Srs2’s effects on checkpoint vs. recombination are separable and that the Srs2-RPA antagonism during the DNA damage checkpoint is independent of Rad51.

      Second, our current work addresses which Srs2 features affect the Srs2-RPA antagonism during the DNA damage response and its implications. Given that this antagonism is separable from Srs2 removal of Rad51, including Rad51 regulation would distract from the main points of this work.

      Third, in the current work, we began by examining all known regulatory and protein-protein interaction features of Srs2, including the Rad51 binding domain. Consistent with our conclusion summarized above based on the Dhingra et al. study, deleting the Rad51 binding domain in Srs2 (srs2-∆Rad51BD) has no effect on the rfa1-zm2 phenotype in CPT (Figure 2D). This is in sharp contrast to mutating the PCNA binding and the sumoylation sites of Srs2, which suppressed the CPT sensitivity and checkpoint abnormalities of rfa1-zm2 (Figure 2C). These data provide further evidence that Srs2 regulation of Rad51 is separable from the Srs2-RPA antagonism.

      In summary, our work provides a foundation for future examination of how Srs2 regulates RPA and Rad51 in different manners, how these two facets of Srs2 function affect genome integrity in different capacities, and whether there is crosstalk between them during certain DNA metabolism processes.

      Public Reviews:

      Reviewer #1:

      Overall, the data presented in this manuscript is of good quality. Understanding how cells control RPA loading on ssDNA is crucial to understanding DNA damage responses and genome maintenance mechanisms. The authors used genetic approaches to show that disrupting PCNA binding and SUMOylation of Srs2 can rescue the CPT sensitivity of rfa1 mutants with reduced affinity for ssDNA. In addition, the authors find that SUMOylation of Srs2 depends on binding to PCNA and the presence of Mec1. Noted weaknesses include the lack of evidence supporting that Srs2 binding to PCNA and its SUMOylation occur at ssDNA gaps, as proposed by the authors. Also, the mutants of Srs2 with impaired binding to PCNA or impaired SUMOylation showed no clear defects in checkpoint dampening, and in some contexts, even resulted in decreased Rad53 activation. Therefore, key parts of the paper would benefit from further experimentation and/or clarification.  

      We thank the reviewer for the positive comments on this work and address her/his remark regarding ssDNA gaps below in Major Comment #1. In addition, we detailed below our data and rationale in suggesting that the checkpoint dampening phenotype of srs2-∆PIM and -3KR (deficient for PCNA binding and sumoylation, respectively) is masked by redundant pathways. We further describe our plan to enhance the clarity of both text and model to address these points from the reviewer. 

      Major Comments 

      (1) The central model proposed by the authors relies on the loading of PCNA at the 3' junction of an ssDNA gap, which then mediates Srs2 recruitment and RPA removal. While several aspects of the model are consistent with the data, the evidence that it is occurring at ssDNA gaps is not strong. The experiments mainly used CPT, which generates mostly DSBs. The few experiments using MMS, which mostly generates ssDNA gaps, show that Srs2 mutants lead to weaker rescue in this context (Figure S1). How do the authors explain this discrepancy? In the context of DSBs, are the authors proposing that Srs2 is engaging at later steps of HR-driven DSB repair where PCNA gets loaded to promote fill-in synthesis? If so, is RPA removal at that step important for checkpoint dampening? These issues need to be addressed and the final model adjusted.

      We appreciate the reviewer’s concern. Our conclusion is that Srs2 can be guided by PCNA to a subset of ssDNA regions for RPA removal, and that this Srs2 action is not favored at ssDNA regions with no proximal PCNA. It is important to note that CPT can produce both types of ssDNA regions. Besides ssDNA generated via DSB-associated recombinational repair, CPT can also lead to ssDNA gap formation upon excision repair and DNA-protein crosslink repair of trapped Top1 (Sun et al., 2020). ssDNA regions generated during these DNA repair processes often contain a 3’ DNA end for PCNA loading, and thus can favor Srs2 removal of RPA. Another facet of CPT’s effects (besides DNA lesions) is depletion of the functional pool of Top1, which causes topological stress and consequently increased levels of DNA supercoiling and R-loops (Koster et al., 2007, Petermann et al., 2022). ssDNA formed within negatively supercoiled regions and in R-loops lacks a 3’ DNA end unless it is cleaved by nucleases, so these sites would be disfavored for Srs2 removal of RPA due to the lack of PCNA loading. Our conclusion that ssDNA regions with nearby PCNA are preferred sites for Srs2 action provides a rationale for how Srs2 can remove RPA at certain ssDNA regions but minimize unnecessary stripping of RPA from other sites.

      We will clarify in the Discussion that CPT can generate two types of ssDNA regions as stated above, and that Srs2 could distinguish among them using PCNA proximity as a guide for RPA removal. While this conclusion was described in the text, we emphasized the ssDNA gap as an Srs2 action site in the model. We will clarify that while this is a logical supposition, other types of ssDNA with proximal PCNA could also be targeted by Srs2 and that our work paves the way to determine the precise nature of the ssDNA regions where Srs2 acts.

      The reasons for the less potent growth suppression of rfa1 mutants by srs2 alleles in MMS compared with CPT are unclear, but multiple possibilities should be considered, given that MMS and CPT affect checkpoint responses differently and that RPA and Srs2 affect growth in multiple ways. For example, while CPT only activates the DNA damage checkpoint, MMS additionally induces the DNA replication checkpoint (Menin et al., 2018, Redon et al., 2003). It is thus possible that the Srs2-RPA antagonism is relatively more important for the DNA damage checkpoint than for the DNA replication checkpoint. Further investigation of this possibility, among others, will shed light on the differential suppressive effects seen in this work. We will include this discussion in the revised text.

      (2) The data in Figure 3 showing that Srs2 mutants reduce Rad53 activation in the rfa1-zm2 mutant are confusing, especially given the claim of an anti-checkpoint function for Srs2 (in which case Srs2 mutants should result in increased Rad53 activation). The authors propose that Rad53 is hyperactivated in rfa1-zm2 mutant because of compromised ssDNA protection and consequential DNA lesions, however, the effects sharply contrast with the central model. Are the authors proposing that in the rfa1-zm2 mutant, the compromised protection of ssDNA supersedes the checkpoint-dampening effect? Perhaps a schematic should be included in Figure 3 to depict these complexities and help the reader. The schematic could also include the compensatory dampening mechanisms like Slx4 (on that note, why not move Figure S2 to a main figure?... and even expand experiments to better characterize the compensatory mechanisms, which seem important to help understand the lack of checkpoint dampening effect in the Srs2 mutants) 

      Genetic interactions that involve partially defective alleles, multi-functional proteins, and redundant pathways are complex to comprehend. For example, a phenotype seen for the null allele may not be seen for partially defective alleles. In the context of this study, while srs2 null increased Rad53 activation (Dhingra et al., 2021), srs2-∆PIM and -3KR did not (Figure 3A-3B). However, srs2-∆PIM enhanced Rad53 activation when combined with another checkpoint dampening mutant slx4RIM, suggesting that defects of srs2-∆PIM can be compensated by Slx4 (Figure S2). Importantly, srs2-∆PIM and -3KR rescued rfa1-zm2’s checkpoint abnormality (Figure 3A-3B), suggesting that Srs2 binding to PCNA and its sumoylation contribute to the Srs2-RPA antagonism in the DNA damage checkpoint response.

      A partially defective allele that impairs a specific function of a protein can be a powerful genetic tool even when it lacks a particular phenotype on its own. For example, a partially defective allele of the checkpoint protein Rad9 that impairs its binding to gamma-H2A (rad9-K1088M) neither affects the G2/M checkpoint nor causes DNA damage sensitivity, due to compensation by other checkpoint factors (Hammet et al., 2007); however, rad9-K1088M rescues the DNA damage sensitivity and persistent G2/M checkpoint of rtt107 and slx4 mutants, providing one line of evidence supporting a role of the Slx4-Rtt107 axis in removal of Rad9 from chromatin (via competing with Rad9 for gamma-H2A binding) (Ohouo et al., 2013).

      In order to highlight the checkpoint recovery process, the model in Figure 6 did not depict another consequence of the Srs2-RPA antagonism. In the presence of Srs2, DNA binding rfa1 mutants can lead to increased levels of DNA lesions and checkpoint, and these defects are rescued by lessening Srs2’s ability to strip RPA from DNA (Dhingra et al., 2021). We will modify the model in Figure 6 and its legend to clarify that the model depicts just one of the consequences of the Srs2-RPA antagonism, with a focus on checkpoint recovery. We will also state these points more clearly in the Discussion. Further, a new schematic will be added to Figure 3, as suggested by the reviewer, to outline the genetic relationship and interpretation. We will also follow the reviewer’s suggestion to move Figure S2 to the main figures. Better characterizing the compensatory mechanisms among different checkpoint dampening pathways is very interesting but requires a substantial amount of work. While it is beyond the scope of the current study, it could be pursued in the future.

      (3) The authors should demarcate the region used for quantifying the G1 population in Figure 3B and explain the following discrepancy: By inspection of the cell cycle graph, all mutants have lower G1 peak height compared to WT (CPT 2h). However, in the quantification bar graph at the bottom, ΔPIM has higher G1 population than the WT. 

      We have added a description of how the G1 region of the FACS histogram was selected to derive the percentage of G1 cells in Figure 3B. Briefly, for samples collected for a particular strain, the G1 region of the “G1 sample” was used to demarcate the G1 region of the “CPT 2h” sample. Upon re-checking the included FACS profiles, we realized that a mutant panel and its datapoint were mistakenly put in the place of the wild-type. We will correct this mistake. The conclusion remains that srs2-∆PIM and srs2-3KR improved the ability of rfa1-zm2 cells to exit G2/M, while on their own they do not differ from the wild-type control in the percentage of G1 cells after 2 h of CPT treatment. We will add statistics to the figures to reflect this conclusion and adjust the order of strains shown in panels A and B to be consistent with each other.

      Reviewer #2:

      This is an interesting paper that delves into the post-translational modifications of the yeast Srs2 helicase and proteins with which it interacts in coping with DNA damage. The authors use mutants in some interaction domains with RPA and Srs2 to argue for a model in which there is a balance between RPA binding to ssDNA and Srs2's removal of RPA. The idea that a checkpoint is being regulated is based on observing Rad53 and Rad9 phosphorylation (so there are the attributes of a checkpoint), but evidence of cell cycle arrest is lacking. The only apparent delay in the cell cycle is the re-entry into the second S phase (but it could be an exit from G2/M); but in any case, the wild-type cells enter the next cell cycle most rapidly. No direct measurement of RPA residence is presented. 

      We thank the reviewer for the helpful comments. Previous studies have shown that CPT does not induce the DNA replication checkpoint, thus it does not slow down or arrest S phase progression; however, CPT does induce the DNA damage checkpoint, which causes a delay of G2/M cells to re-enter into the second cell cycle (Menin et al., 2018, Redon et al., 2003). Our result is consistent with previous findings, showing that CPT induces G2/M delay but not arrest. We will adjust the text to make this point clearer.

      We have previously reported chromatin-bound RPA levels in rfa1-zm2, srs2, and their double mutants, as well as in vitro ssDNA binding by wild-type and mutant RPA complexes (Dhingra et al., 2021). We found that Srs2 loss or its ATPase dead mutant led to 4-6 fold increase of RPA levels on chromatin, which was rescued by rfa1-zm2 (Dhingra et al., 2021). On its own, rfa1-zm2 did not cause defective chromatin association in our assays, despite modestly reducing ssDNA binding in vitro (Dhingra et al., 2021). This discrepancy could be due to a lack of sensitivity of chromatin fractionation assay in revealing moderate changes of RPA residence on DNA. Considering this, we decided to employ functional assays (Figure 2-3) that are more effective in identifying the Srs2 features pertaining to RPA regulation. 

      Strengths:

      Data concern viability assays in the presence of camptothecin and in the post-translational modifications of Srs2 and other proteins.

      Weaknesses:

      There are a couple of overriding questions about the results, which appear technically excellent. Clearly, there is an Srs2-dependent repair process here, in the presence of camptothecin, but is it a consequence of replication fork stalling or chromosome breakage? Is repair Rad51-dependent, and if so, is Srs2 displacing RPA or removing Rad51 or both? If RPA is removed quickly what takes its place, and will the removal of RPA result in lower DDC1-MEC1 signaling? 

      While Srs2 can affect both the checkpoint response and DNA repair in CPT conditions, the rfa1-zm2 allele, which affects the former but not the latter role of Srs2, allows us to gain a deeper understanding of the former role (Dhingra et al., 2021). This role also appears to be critical for cell survival in CPT, since srs2∆ growth on CPT-containing media was greatly improved by rfa1-zm mutants (Dhingra et al., 2021). Building on this understanding, our current study identified two Srs2 features that could afford spatial and temporal regulation of RPA removal from DNA, thus providing a rationale for how cells can properly utilize this beneficial yet also dangerous activity. Studying Srs2-mediated repair in CPT conditions, in either a Rad51-dependent or -independent manner, before and after replication fork stalling or DNA breakage, will require substantial effort and can be pursued in the future. We will add this point to the revised manuscript.

      Moreover, it is worth noting that in single-strand annealing, which is ostensibly Rad51 independent, a defect in completing repair and assuring viability is Srs2-dependent, but this defect is suppressed by deleting Rad51. Does deleting Rad51 have an effect here? 

      We have shown in our previous paper that rad51∆ did not rescue the hyper-checkpoint phenotype of srs2∆ cells in CPT conditions, while rfa1-zm1 and -zm2 did (Dhingra et al., 2021). Such differential effects were also seen for the srs2 ATPase-dead allele (Dhingra et al., 2021). These and other data described in that paper suggest that Srs2’s effects on checkpoint vs. recombination are separable, at least in CPT conditions, and that the Srs2-RPA antagonism in checkpoint regulation is not affected by Rad51 removal (unlike in the SSA situation).

      Neither this paper nor the preceding one makes clear what really is the consequence of having a weaker-binding Rfa1 mutant. Is DSB repair altered? Neither CPT nor MMS is necessarily a good substitute for a true DSB assay. 

      In our previous report (Dhingra et al., 2021), we showed that the rfa1-zm mutants did not affect the frequencies of rDNA recombination, gene conversion, or direct repeat repair. Further, rfa1-zm mutants did not suppress the hyper-recombination phenotype of srs2∆, while rad51∆ did (Dhingra et al., 2021). In a DSB system wherein the direct repeats flanking the break were placed 30 kb away from each other, srs2∆ led to hyper-checkpoint and lethality, both of which were rescued by rfa1-zm mutants (Dhingra et al., 2021). In this assay, rfa1-zm mutants themselves did not show sensitivity, suggesting that repair is largely proficient. Collectively, these data suggest that weaker DNA binding of Rfa1 does not have a detectable effect on the recombinational repair assays examined thus far; rather, it has a profound effect on Srs2-mediated checkpoint downregulation. In-depth studies of rfa1-zm mutations in the context of various DSB repair steps will be interesting to pursue in the future.

      With camptothecin, in the absence of site-specific damage, it is difficult to test these questions directly. (Perhaps there is a way to assess the total amount of RPA bound, but ongoing replication may obscure such a measurement). It should be possible to assess how CPT treatment in various genetic backgrounds affects the duration of Mec1/Rad53-dependent checkpoint arrest, but more than a FACS profile would be required. 

      Quantitative measurement of RPA residence time on DNA in cells and the duration of Mec1/Rad53-dependent checkpoint arrest will be very informative but requires further technology development. Our current work provides a foundation for such quantitative assessment.

      It is also notable that MMS treatment does not seem to yield similar results (Fig. S1). 

      Figure S1 showed that srs2-∆PIM and srs2-3KR had weaker suppression of rfa1-zm2 growth on MMS plates than on CPT plates. The reasons for the less potent growth suppression in MMS compared with CPT are unclear, but multiple possibilities should be considered, given that MMS and CPT affect checkpoint responses differently and that RPA and Srs2 affect growth in multiple ways. For example, while CPT only activates the DNA damage checkpoint, MMS additionally induces the DNA replication checkpoint (Menin et al., 2018, Redon et al., 2003). It is thus possible that the Srs2-RPA antagonism is more important for the DNA damage checkpoint than for the DNA replication checkpoint. Further investigation of this and other possibilities will provide clues to the differential suppressive effects seen in this work. We will include this discussion in the revised text.

      Reviewer #3:

      The superfamily I 3'-5' DNA helicase Srs2 is well known for its role as an anti-recombinase, stripping Rad51 from ssDNA, as well as an anti-crossover factor, dissociating extended D-loops and favoring non-crossover outcome during recombination. In addition, Srs2 plays a key role in ribonucleotide excision repair. Besides DNA repair defects, srs2 mutants also show a reduced recovery after DNA damage that is related to its role in downregulating the DNA damage signaling or checkpoint response. Recent work from the Zhao laboratory (PMID: 33602817) identified a role of Srs2 in downregulating the DNA damage signaling response by removing RPA from ssDNA. This manuscript reports further mechanistic insights into the signaling downregulation function of Srs2. 

      Using the genetic interaction with mutations in RPA1, mainly rfa1-zm2, the authors test a panel of mutations in Srs2 that affect CDK sites (srs2-7AV), potential Mec1 sites (srs2-2SA), known sumoylation sites (srs2-3KR), Rad51 binding (delta 875-902), PCNA interaction (delta 1159-1163), and SUMO interaction (srs2SIMmut). All mutants were generated by genomic replacement and the expression level of the mutant proteins was found to be unchanged. This alleviates some concern about the use of deletion mutants compared to point mutations. The double mutant analysis identified that PCNA interaction and SUMO sites were required for the Srs2 checkpoint dampening function, at least in the context of the rfa1-zm2 mutant. There was no effect of these mutants in a RFA1 wild-type background. This latter result is likely explained by the activity of the parallel pathway of checkpoint dampening mediated by Slx4, and genetic data with an Slx4 point mutation affecting Rtt107 interaction and checkpoint downregulation support this notion. Further analysis of Srs2 sumoylation showed that Srs2 sumoylation depended on PCNA interaction, suggesting sequential events of Srs2 recruitment by PCNA and subsequent sumoylation. Kinetic analysis showed that sumoylation peaks after maximal Mec1 induction by DNA damage (using the Top1 poison camptothecin (CPT)) and depended on Mec1. These data are consistent with a model that Mec1 hyperactivation is ultimately leading to signaling downregulation by Srs2 through Srs2 sumoylation. Mec1-S1964 phosphorylation, a marker for Mec1 hyperactivation and a site found to be needed for checkpoint downregulation after DSB induction did not appear to be involved in checkpoint downregulation after CPT damage. The data are in support of the model that Mec1 hyperactivation when targeted to RPA-covered ssDNA by its Ddc2 (human ATRIP) targeting factor, favors Srs2 sumoylation after Srs2 recruitment to PCNA to disrupt the RPA-Ddc2-Mec1 signaling complex. 
      Presumably, this allows gap filling and disappearance of long-lived ssDNA as the initiator of checkpoint signaling, although the study does not extend to this step.

      Strengths 

      (1) The manuscript focuses on the novel function of Srs2 to downregulate the DNA damage signaling response and provide new mechanistic insights. 

      (2) The conclusions that PCNA interaction and ensuing Srs2-sumoylation are involved in checkpoint downregulation are well supported by the data. 

      We thank the reviewer for carefully reading our work and for his/her positive comments. 

      Weaknesses 

      (1) Additional mutants of interest could have been tested, such as the recently reported Pin mutant, srs2-Y775A (PMID: 38065943), and the Rad51 interaction point mutant, srs2-F891A (PMID: 31142613). 

      srs2-Y775A was shown to be proficient for stripping RPA from ssDNA, behaved like wild-type Srs2 in assays such as gene conversion and crossover control, and exhibited a genetic interaction profile similar to that of the wild-type allele. The authors of that study suggest that the Y775 pin can contribute to unwinding secondary DNA structures. Collectively, these findings do not provide a strong rationale for srs2-Y775A being relevant for RPA removal from ssDNA. 

      We have already included data showing that an srs2 mutant lacking the Rad51 binding domain (srs2-∆Rad51BD, ∆875-902) neither affected rfa1-zm2 growth in CPT nor caused other defects in CPT on its own (Figure 2D). These data suggest that Rad51 binding is not relevant to the Srs2-RPA antagonism in CPT, a conclusion fully supported by data in our previous study (Dhingra et al., 2021). Collectively, these findings do not provide a strong rationale to test a point mutation within the Rad51BD region. 

      (2) The use of deletion mutants for PCNA and RAD51 interaction is inferior to using specific point mutants, as done for the SUMO interaction and the sites for post-translational modifications. 

      We generally agree with this view. However, this is less of a concern for the Rad51 binding site mutant (srs2-∆Rad51BD), as it behaved like the wild-type allele in our assays. The srs2-∆PIM mutant (lacking 4 amino acids) has been examined for PCNA binding in vitro and in vivo in several studies (e.g. Kolesar et al., 2016, Kolesar et al., 2012); to our knowledge, no unintended defect was reported. We thus believe that this allele is suitable for testing whether Srs2’s ability to bind PCNA is relevant to RPA regulation.

      (3) Figure 4D and Figure 5A report data with standard deviations, which is unusual for n=2. Maybe the individual data points could be plotted with a color for each independent experiment to allow the reader to evaluate the reproducibility of the results. 

      We will include individual data points as suggested and correct the figure legend to indicate that three independent biological samples per genotype were examined in both panels.

      References:

      Dhingra N, Kuppa S, Wei L, Pokhrel N, Baburyan S, Meng X, Antony E and Zhao X (2021) The Srs2 helicase dampens DNA damage checkpoint by recycling RPA from chromatin Proc Natl Acad Sci U S A 118

      Hammet A, Magill C, Heierhorst J and Jackson SP (2007) Rad9 BRCT domain interaction with phosphorylated H2AX regulates the G1 checkpoint in budding yeast EMBO Rep 8: 851-857

      Kolesar P, Altmannova V, Silva S, Lisby M and Krejci L (2016) Pro-recombination Role of Srs2 Protein Requires SUMO (Small Ubiquitin-like Modifier) but Is Independent of PCNA (Proliferating Cell Nuclear Antigen) Interaction J Biol Chem 291: 7594-7607

      Kolesar P, Sarangi P, Altmannova V, Zhao X and Krejci L (2012) Dual roles of the SUMO-interacting motif in the regulation of Srs2 sumoylation Nucleic Acids Res 40: 7831-7843

      Koster DA, Palle K, Bot ES, Bjornsti MA and Dekker NH (2007) Antitumour drugs impede DNA uncoiling by topoisomerase I Nature 448: 213-217

      Menin L, Ursich S, Trovesi C, Zellweger R, Lopes M, Longhese MP and Clerici M (2018) Tel1/ATM prevents degradation of replication forks that reverse after topoisomerase poisoning EMBO Rep 19

      Ohouo PY, Bastos De Oliveira FM, Liu Y, Ma CJ and Smolka MB (2013) DNA-repair scaffolds dampen checkpoint signalling by counteracting the adaptor Rad9 Nature 493: 120-124

      Petermann E, Lan L and Zou L (2022) Sources, resolution and physiological relevance of R-loops and RNA-DNA hybrids Nat Rev Mol Cell Biol 23: 521-540

      Redon C, Pilch DR, Rogakou EP, Orr AH, Lowndes NF and Bonner WM (2003) Yeast histone 2A serine 129 is essential for the efficient repair of checkpoint-blind DNA damage EMBO Rep 4: 678-684

      Sun Y, Saha S, Wang W, Saha LK, Huang SN and Pommier Y (2020) Excision repair of topoisomerase DNA-protein crosslinks (TOP-DPC) DNA Repair 89: 102837

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      The authors use point light displays to measure biological motion (BM) perception in children (mean = 9 years) with and without ADHD, and relate it to IQ, social responsiveness scale (SRS) scores and age. They report that children with ADHD were worse at all three BM tasks, but that those tasks loading more heavily on local processing relate to social interaction skills and those loading on global processing relate to age. There are still some elements of the results that are unclear, but nevertheless, the important and solid findings extend our limited knowledge of BM perception in ADHD, as well as biological motion processing mechanisms in general.

      We thank the editors and reviewers for their valuable feedback and constructive comments. In the revised manuscript, we have incorporated all statistics for the models and also provided detailed analytical evidence about the distinct contributions of local and global BM processing. We hope these clarifications could enhance the robustness of our conclusions.

      Public Reviews:

      Reviewer #2 (Public Review):

      Summary:

      Tian et al. aimed to assess differences in biological motion (BM) perception between children with and without ADHD, as well as relationships to indices of social functioning and possible predictors of BM perception (including demographics, reasoning ability and inattention). In their study, children with ADHD showed poorer performance relative to typically developing children in three tasks measuring local, global, and general BM perception. The authors further observed that across the whole sample, performance in all three BM tasks was negatively correlated with scores on the social responsiveness scale (SRS), whereas within groups a significant relationship to SRS scores was only observed in the ADHD group and for the local BM task. Local and global BM perception showed a dissociation in that global BM processing was predicted by age, while local BM perception was not. Finally, general (local & global combined) BM processing was predicted by age and global BM processing, while reasoning ability mediated the effect of inattention on BM processing.

      Strengths:

      Overall, the manuscript is presented in a clear fashion and methods and materials are presented with sufficient detail so the study could be reproduced by independent researchers. The study uses an innovative, albeit not novel, paradigm to investigate two independent processes underlying BM perception. The results are novel and have the potential to have wide-reaching impact on multiple fields.

      We appreciate your positive feedback very much.

      Weaknesses:

      The manuscript has improved in clarity and conceptual and methodological considerations in response to the last review. However, the reported results still provide incomplete support for the claims the authors make in the paper.

      In relation to other reviewers' earlier comments, the model notation used is still not consistent and model results are reported incompletely, which make it difficult to gain a full picture of the data and how they support the authors' secondary claims. For instance, across the models in the supplementary materials, ß coefficients are only reported selectively which makes it difficult to assess the model as a whole. Furthermore, different terms (task 1, task 2 vs. BM-Local, BM-global) are used to refer to the same levels of a variable, and it is unclear which levels of a dummy variable correspond to which task, making it overall very difficult to comprehend the modelling procedure.

      Thanks for pointing out these issues. In the revised version, we have unified the terminology by consistently referring to task types as BM-Local, BM-Global, BM-General. Additionally, we have provided clarification on the interpretation of dummy variables in relation to model construction. Furthermore, we corrected the model results and included all statistics in Table S1, S2, and S3. For more detailed information, please refer to the response to your Recommendations for the authors.

      Reviewer #3 (Public Review):

      The authors presented point light displays of human walkers to children (mean = 9 years) with and without ADHD to compare their biological motion perception abilities, and relate them to IQ, social responsiveness scale (SRS) scores and age. They report that children with ADHD were worse at all three biological motion tasks, but that those loading more heavily on local processing related to social interaction skills and global processing to age. The valuable and solid findings are informative for understanding this complex condition, as well as biological motion processing mechanisms in general. However, the correlations present a pattern that needs further examination in future studies because many of the differences between correlations are not significant.

      Strengths:

      The authors present differences between ADHD and TD children in biological motion processing, and this question has not received as much attention as equivalent processing capabilities in autism. They use a task that appears well controlled. They raise some interesting mechanistic possibilities for differences in local and global motion processing, which are distinctions worth exploring. The group differences will therefore be of interest to those studying ADHD, as well as other developmental conditions, and those examining biological motion processing mechanisms in general.

      Thanks for this positive assessment of our work.

      Weaknesses:

      The data are not strong enough to support claims about differences between global and local processing with respect to social communication skills and age. The mechanistic possibilities for why these abilities may dissociate in such a way are interesting, but the crucial tests of differences between correlations do not present a clear picture. Further empirical work would be needed to test this further. Specifics:

      The authors state frequently that it was the local BM task that related to social communication skills (SRS) and not the global tasks. However, the results section shows a correlation between SRS and all three tasks. The only difference is that when looking specifically within the ADHD group, the correlation is only significant for the local task. The supplementary materials demonstrate that tests of differences between correlations present an incomplete picture. Currently they have small samples for correlations, so this is unsurprising.

      We apologize for not clarifying these points earlier. We did identify correlations between performance on all BM tasks and SRS scores. However, this finding is not unexpected, given the significant differences in SRS scores between TD and ADHD children, alongside their marked differences in all BM tasks. Correlation analyses that pool data from both groups may therefore reflect group differences. To elucidate the relationship between social ability impairment and diminished BM processing in children with ADHD, we conducted additional subgroup analyses and found a correlation only in the BM-Local task. To further support the specificity of this correlation, we compared the differences in coefficients. We revised our modelling procedure for testing differences between correlations in the supplementary materials and presented all model statistics in Tables S2 and S3. Discrepancies in these coefficients, which exclude the influence of differences between groups, suggest that social factors specifically influence the performance of the BM-Local task in children with ADHD. We acknowledge that the analysis of differences between correlations is based on a relatively small sample size, and we have interpreted it modestly in the discussion. Future studies will aim to increase the sample size to validate our findings.

      Theoretical assumptions. The authors make some statements about local vs global biological motion processing that may have been made in previous studies, but would appear controversial and not definitive. E.g., that local BM processing does not improve with age and is uninfluenced by attention.

      Thanks for your comment. To the best of our knowledge, fewer developmental studies have been conducted on local BM processing than on global BM processing. Our study is the first to directly explore the relationship between local BM processing and age. Additionally, we used QbInattention to evaluate sustained attention function (considered as “top-down” attention) and examined its correlation with local BM processing. Some indirect evidence supports the view that the ability to process local BM cues remains stable and is unaffected by top-down attention. For example, local BM processing did not show a learning trend (Chang 2009) and was linked to the activation of subcortical regions (Hirai 2020). Research has demonstrated that local BM cues can convey information about walking direction without participants’ explicit attention or recognition (Chang 2009, Hirai 2011, Thompson 2007, Wang 2010), indicating the involvement of “bottom-up” processing (Hirai 2020, Troje 2023). Consistent with previous findings, we did not find a significant correlation between local BM processing and age or QbInattention. We acknowledge that statements such as “local BM processing does not improve with age and is uninfluenced by attention” should be approached with caution. Therefore, we interpreted our results carefully:

      “Once a living creature is detected, an agent (i.e., is it a human?) can be recognised by a coherent, articulated body structure that is perceptually organised based on its motions (i.e., local BM cues)71. This involves top-down processing and probably requires attention25,72, particularly in the presence of competing information26. Our findings are consistent with those of previous studies on the cortical processing of BM73, as we found that the severity of inattention in children with ADHD was negatively correlated with their performance in global BM processing, whereas this significant correlation was not found in local BM processing, which may involve bottom-up processing61,65 and might not need participants’ explicit attention21,23,74,75. However, further studies are needed to verify this hypothesis.” (lines 461-470)

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      Supplementary materials: For all reported results, I suggest the authors use consistent model notation with complete reporting of all statistics in line with common conventions (ideally tables reporting beta values, error terms and confidence intervals for all model predictors, as well as R squared values). In particular the beta values for the reference category are needed to be able to fully interpret the beta values for the reported contrasts.

      We appreciate your suggestion. In the newly revised manuscript, we report all statistics, including beta values, error terms and confidence intervals for all model predictors, as well as R squared values. These detailed statistics can be found in Tables S1, S2 and S3. We hope this additional information will offer readers a more comprehensive understanding of our study.

      Please also address the following inconsistencies:

      - At least when reporting the model results, the same term should be used when referring to task type (either task 1/2/3 or local/global/general BM).

      Thank you for this feedback. We now use the same terms (BM-Local/Global/General) to refer to task type throughout the text.

      - Second linear model in the Supplementary Materials: The authors state that the results suggest that the correlation between SRS and task 1 is greater than that between task 2 and SRS scores. First of all, to be able to support this claim the authors need to provide the coefficient for task 1 (which, if task 1 is the reference variable should be ß1). Second, as I currently understand the reported model results, the fact that ß4 (representing the difference in relationship to SRS scores between task 2 and task 1; the authors refer to ß3 here although I assume they mean ß4) is negative and shows a trend towards significance would actually mean the relationship between BM processing accuracy and SRS scores is more negative for task 2 relative to task 1 and not, as the authors state, that the correlation with SRS scores is greater for task 1. I realise this contradicts the individual r values and scatter plots and hope the authors can clarify the model results.

      We thank you for pointing out these issues. For the second linear model (Model 4 in the revised manuscript), we now report the coefficients for all predictors and the model summaries, including the coefficient for task 1 (ß1). In addition, we have corrected the model results. The values of ß4 (representing the difference in relationship to SRS scores between BM-Global and BM-Local) and ß5 (representing the difference in relationship to SRS scores between BM-General and BM-Local) were positive and showed a trend towards significance, indicating that the correlations with SRS total score were more negative for BM-Local relative to BM-Global and BM-General:

      “A general linear model was constructed (Table S2, Model 4): SRS = β0 + β1 * ACC + β2 * D1 + β3 * D2 + β4 * (ACC * D1) + β5 * (ACC * D2). If the effect of the interaction term (i.e., β4 or β5) is statistically significant, it indicates a difference in correlations with SRS total score between BM-Local and BM-Global (or BM-General). The results suggested trends where the correlations with SRS total score were more negative for BM-Local relative to BM-Global (standardized β4 = 0.580, p = 0.074) and BM-General (standardized β5 = 0.550, p = 0.073).” (lines SI 36-42)
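
      To make the sign convention of the interaction terms concrete, the sketch below recovers the unstandardized coefficients of a model of this form from toy data. It is an illustration under stated assumptions, not the authors' analysis: the data are invented, the helper names (`fit_line`, `dummy_coded_betas`) are hypothetical, observations are treated as independent, and no standardization or significance testing is performed. It relies on the fact that a model with a separate intercept and slope per task is equivalent to fitting one regression line per task, so β4 and β5 are simply slope differences relative to the BM-Local reference.

```python
def fit_line(x, y):
    """Ordinary least-squares (intercept, slope) for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def dummy_coded_betas(acc, srs, task):
    """Unstandardized betas of
    SRS = b0 + b1*ACC + b2*D1 + b3*D2 + b4*(ACC*D1) + b5*(ACC*D2),
    with BM-Local as reference, D1 = BM-Global, D2 = BM-General."""
    fits = {}
    for name in ("BM-Local", "BM-Global", "BM-General"):
        xs = [a for a, t in zip(acc, task) if t == name]
        ys = [s for s, t in zip(srs, task) if t == name]
        fits[name] = fit_line(xs, ys)
    i_loc, s_loc = fits["BM-Local"]
    i_glo, s_glo = fits["BM-Global"]
    i_gen, s_gen = fits["BM-General"]
    return {
        "b0": i_loc,          # intercept for the reference task
        "b1": s_loc,          # ACC slope for BM-Local
        "b2": i_glo - i_loc,  # intercept shift for BM-Global
        "b3": i_gen - i_loc,  # intercept shift for BM-General
        "b4": s_glo - s_loc,  # slope difference, BM-Global minus BM-Local
        "b5": s_gen - s_loc,  # slope difference, BM-General minus BM-Local
    }


# Invented long-format toy data: four observations per task.
acc = [0.5, 0.6, 0.7, 0.8] * 3
srs = [80, 70, 60, 50, 70, 66, 62, 58, 72, 67, 62, 57]
task = ["BM-Local"] * 4 + ["BM-Global"] * 4 + ["BM-General"] * 4

betas = dummy_coded_betas(acc, srs, task)
print({name: round(value, 3) for name, value in betas.items()})
```

      In this toy example the BM-Local slope (b1) is the most negative, so b4 and b5 come out positive. This is the same pattern described in the response: positive interaction terms mean that the relationship with SRS is less negative for BM-Global and BM-General than for BM-Local.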

      - Third linear model in the Supplementary Materials: In the dummy variable representing task, when local BM is the reference level, which task is represented by d1 and d2, respectively? If I understand the authors' procedure correctly, d1 should represent the difference between local and global BM and d2 the difference between local and general BM. If this is true, ß4 should code for the difference between local and global BM and not, as stated by the authors, for the difference between local and general BM. Also, what is d3?

      Thank you for pointing out this issue. We corrected and clarified the results of third model (Model 5 in revised manuscript) in the revised version and pointed out what is represented by d1 (D1) and d2 (D2), respectively:

      “We recoded task types into two dummy variables, D1 and D2, using BM-Local as a reference. The coefficient of D1 represents the difference in relationship to age between BM-Local and BM-Global, and the coefficient of D2 represents the difference in relationship to age between BM-Local and BM-General. The following model was created for each group (Table S3, Model 5-6): ACC = β0 + β1 * age + β2 * D1 + β3 * D2 + β4 * (age * D1) + β5 * (age * D2). If the effect of the interaction term (i.e., β4 or β5) is statistically significant, it indicates a difference in the effect of age on ACC between BM-Local and BM-Global (or BM-General). In the ADHD group, we observed a significant difference in the effect of age on ACC between BM-Local and BM-General (standardized β5 = 0.462, p < 0.001) and marginally significant differences in the effect of age on ACC between BM-Local and BM-Global (standardized β4 = 0.228, p = 0.073).” (lines SI 47-57)
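
      For readers following the dummy coding, substituting the values of D1 and D2 into the quoted model gives one regression line per task. The expansion below is a worked restatement of that equation, assuming D1 = 1 only for BM-Global and D2 = 1 only for BM-General, as the quoted text describes:

```latex
\begin{aligned}
\text{BM-Local } (D_1 = 0,\ D_2 = 0):&\quad \mathrm{ACC} = \beta_0 + \beta_1\,\mathrm{age} \\
\text{BM-Global } (D_1 = 1,\ D_2 = 0):&\quad \mathrm{ACC} = (\beta_0 + \beta_2) + (\beta_1 + \beta_4)\,\mathrm{age} \\
\text{BM-General } (D_1 = 0,\ D_2 = 1):&\quad \mathrm{ACC} = (\beta_0 + \beta_3) + (\beta_1 + \beta_5)\,\mathrm{age}
\end{aligned}
```

      Read this way, β1 is the age slope for BM-Local, and β4 and β5 are the amounts by which the age slope for BM-Global and BM-General, respectively, differs from it.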

    2. eLife assessment

      The authors use point light displays to measure biological motion (BM) perception in children (mean = 9 years) with and without ADHD, and relate it to IQ, social responsiveness scale (SRS) scores and age. They report that children with ADHD were worse at all three BM tasks, but that those tasks loading more heavily on local processing relate to social interaction skills and those loading on global processing relate to age. There are still some elements of the results that need clarification with future work, but nevertheless, the important and solid findings extend our limited knowledge of BM perception in ADHD, as well as biological motion processing mechanisms in general.

    3. Reviewer #2 (Public Review):

      Summary:

      Tian et al. aimed to assess differences in biological motion (BM) perception between children with and without ADHD, as well as relationships to indices of social functioning and possible predictors of BM perception (including demographics, reasoning ability and inattention). In their study, children with ADHD showed poorer performance relative to typically developing children in three tasks measuring local, global, and general BM perception. The authors further observed that across the whole sample, performance in all three BM tasks was negatively correlated with scores on the social responsiveness scale (SRS), whereas within groups a significant relationship to SRS scores was only observed in the ADHD group and for the local BM task. Local and global BM perception showed a dissociation in that global BM processing was predicted by age, while local BM perception was not. Finally, general (local & global combined) BM processing was predicted by age and global BM processing, while reasoning ability mediated the effect of inattention on BM processing.

      Strengths:

      Overall, the manuscript is presented in a clear fashion and methods and materials are presented with sufficient detail so the study could be reproduced by independent researchers. The study uses an innovative, albeit not novel, paradigm to investigate two independent processes underlying BM perception. The results are novel and have the potential to have wide-reaching impact on multiple fields.

      Weaknesses:

      The manuscript has improved in clarity and conceptual and methodological considerations in response to the last review. However, the reported results still provide incomplete support for the claims the authors make in the paper, due to differences between correlations not passing significance thresholds.

    4. Reviewer #3 (Public Review):

      The authors presented point light displays of human walkers to children (mean = 9 years) with and without ADHD to compare their biological motion perception abilities, and relate them to IQ, social responsiveness scale (SRS) scores and age. They report that children with ADHD were worse at all three biological motion tasks, but that those loading more heavily on local processing related to social interaction skills and global processing to age. The valuable and solid findings are informative for understanding this complex condition, as well as biological motion processing mechanisms in general. However, the correlations present a pattern that needs further examination in future studies because many of the differences between correlations are not significant.

      Strengths:

      The authors present differences between ADHD and TD children in biological motion processing, and this question has not received as much attention as equivalent processing capabilities in autism. They use a task that appears well controlled. They raise some interesting mechanistic possibilities for differences in local and global motion processing, which are distinctions worth exploring. The group differences will therefore be of interest to those studying ADHD, as well as other developmental conditions, and those examining biological motion processing mechanisms in general.

      Weaknesses:

      The data are not strong enough to support claims about differences between global and local processing with respect to social communication skills and age. The mechanistic possibilities for why these abilities may dissociate in such a way are interesting, but the crucial tests of differences between correlations do not present a clear picture. Further empirical work would be needed to test this further. Specifics:

      The authors state frequently that it was the local BM task that related to social communication skills (SRS) and not the global tasks. However, the results section shows a correlation between SRS and all three tasks. The only difference is that when looking specifically within the ADHD group, the correlation is only significant for the local task. The supplementary materials demonstrate that tests of differences between correlations present an incomplete picture. Currently they have small samples for correlations, so this is unsurprising.
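
      For readers unfamiliar with the kind of test referred to here, the following is a minimal sketch of one common way to compare two correlations from independent groups (Fisher's r-to-z transformation). It is not the authors' analysis: the function name and the example values are hypothetical, and the test assumes the two correlations come from separate samples (for example, ADHD vs. TD). Comparing correlations across tasks within the same children would need a test for dependent correlations instead.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Two-sided Fisher r-to-z test for two independent Pearson correlations.

    r1, r2: correlation coefficients (each strictly between -1 and 1)
    n1, n2: sample sizes of the two independent groups (each > 3)
    Returns (z, p), where p is the two-sided p-value under a normal approximation.
    """
    z1 = math.atanh(r1)  # Fisher transform of the first correlation
    z2 = math.atanh(r2)  # Fisher transform of the second correlation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p


# Hypothetical values, chosen only for illustration (not taken from the paper).
z, p = fisher_z_difference(-0.40, 36, -0.10, 36)
print(f"z = {z:.2f}, p = {p:.3f}")
```

      With samples of this size, even a fairly large gap between two correlation coefficients can fail to reach significance, which is the reviewer's point about small samples.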

      Theoretical assumptions. The authors make some statements about local vs global biological motion processing that may have been made in previous studies, but would appear controversial and not definitive. E.g., that local BM processing does not improve with age.

    1. eLife assessment

      This valuable paper explores the role of translational regulation in the establishment of differential gene expression between neurons and glia in Drosophila. The paper uses Ribo-seq to show extensive variation in the translation efficiency of specific transcripts between neurons and glia. The evidence supporting the model is solid, although only one example (that exhibits very strong differential transcriptional expression between one class of neurons and glia) is studied in detail for translation efficiency.

    2. Reviewer #1 (Public Review):

      This study seeks to understand how selective mRNA translation informs cellular identity using the Drosophila brain as a model. Using drivers specific for either neurons or glia, the authors express a tagged large ribosomal subunit protein, which they then use as a handle for isolating total mRNA and ribosome footprints. Throughout the study, they compare these data sets to transcriptional and ribosome profiles from the whole fly head, which contains multiple cell types including fat tissue, pigment cells and others, in addition to neurons and glia. Using GO term analyses, they demonstrate the specificity of their cell-type-based ribosome profiling: known glial mRNAs are efficiently translated in glia, and known neuronal mRNAs likewise in neurons. In further examining their RNA-seq data set, they find that "neuronal" mRNAs, such as ion channels, are expressed in both neurons and glia, but are translated at higher rates in neurons. Based on this, they hypothesize that neuronal mRNAs are actively suppressed in glia, and next seek to determine the underlying mechanism. By meta-analysis of all mapped ribosome footprints, they find that glia have higher ribosome occupancies in the 5' leader of neuronal mRNAs. This is corroborated by individual ribosome occupancy profiles for several neuronal mRNAs. In 5' leaders containing upstream AUG codons, they find that the glial data sets show an enrichment of ribosomes at these upstream start sites. They thus conclude that 5' leaders containing upstream AUGs confer translational suppression in glia.

      Overall, the sequencing data sets generated in this study and their subsequent bioinformatic analyses seem robust and reliable. Their data echo the trends of cell-type specific translational profiles seen in previous studies (e.g. 27380875, 30650354), and making their data sets and analyses accessible to the broader scientific community would be quite helpful. The findings are presented in a logical and methodical manner, and the data are depicted clearly. The authors' result that 5' leaders facilitate translation suppression is well supported in the literature. However, they overinterpret their data by claiming that such suppression is key for maintaining glial/neuronal identity (it is even featured in their title), but do not present any evidence that loss of such regulation has any impact on cellular identity. In many places, the authors do not acknowledge possible biases in their analytical methods, or consider alternate explanations for their data. These weaken the manuscript in its current form, but many of these issues, which I describe below, are rectifiable with modest effort.

      (1) The authors' data in Fig. 2-S1A-B show substantial cell-to-cell variation in RpL3::FLAG expression. The authors do not consider that this variation may cause certain neuronal/glial types to be overrepresented in their datasets. Relatedly, the authors do not discuss whether RpL3::FLAG is only present in the cell body or if it is also trafficked to the neuronal/glial processes where localized translation is known to occur (reviewed in 31270476).

      (2) The RNA-seq data set that they use to calculate translation efficiency (TE) only represents mRNAs associated with RpL3::FLAG, which is part of the large ribosome subunit. As the authors are likely aware, there are mRNAs on which the full ribosome moiety does not assemble and these are effectively excluded from this data set. Ideally, a more complete picture of the mRNA landscape can be obtained by 40S subunit profiling but I appreciate that this is technically very challenging. At minimum, this caveat needs to be acknowledged.

      How does the TPM of differentially regulated transcripts (such as those in Fig. 2H) compare between whole heads, neurons and glia? Since the whole head RNA-seq data was not from an enriched sample, this might serve as a decent proxy for showing that the neuron/glia RNA-seq data sets are representative of RNA abundance.

      (3) The analysis in Fig. 2F shows that low abundance mRNAs in glia are further translationally suppressed, which the authors point out in lines 151-152. However, this data also shows that mRNAs with a 1:1 ratio in neuron:glia (which fall in the 0.5-1 and 1-2 bin) have a TE-1; this suggests that on average, mRNAs that are equally abundant are translated equally efficiently. This is the opposite of the thesis presented in Fig. 2G-H where many mRNAs of equal abundance in neurons and glia are actually poorly translated in glia. How do the authors reconcile these observations?

      It is also unclear from the manuscript whether all mRNAs were considered for the analysis in Fig. 2F or if some cutoff was employed.
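
      To make the quantities in this point concrete, here is a minimal sketch of how translation efficiency is commonly computed (footprint abundance divided by mRNA abundance) and compared between two cell types. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the `min_mrna_tpm` cutoff, and the gene names and TPM values are all hypothetical.

```python
import math


def translation_efficiency(footprint_tpm, mrna_tpm, min_mrna_tpm=1.0):
    """Return {gene: TE}, where TE = footprint TPM / mRNA TPM.

    Genes whose mRNA TPM is below min_mrna_tpm are skipped, because the ratio
    is unstable for low-abundance transcripts. The cutoff is a hypothetical
    choice; whether a cutoff was used is the question raised in the review.
    """
    te = {}
    for gene, rna in mrna_tpm.items():
        if rna < min_mrna_tpm:
            continue
        te[gene] = footprint_tpm.get(gene, 0.0) / rna
    return te


def log2_te_ratio(te_a, te_b):
    """log2(TE_a / TE_b) for genes with a positive TE in both cell types."""
    shared = te_a.keys() & te_b.keys()
    return {
        g: math.log2(te_a[g] / te_b[g])
        for g in shared
        if te_a[g] > 0 and te_b[g] > 0
    }


# Hypothetical TPM values, for illustration only.
neuron_te = translation_efficiency({"geneA": 40.0, "geneB": 10.0},
                                   {"geneA": 20.0, "geneB": 10.0})
glia_te = translation_efficiency({"geneA": 10.0, "geneB": 10.0},
                                 {"geneA": 20.0, "geneB": 10.0, "geneC": 0.2})
print(log2_te_ratio(neuron_te, glia_te))
```

      In this toy example both genes are equally abundant at the mRNA level in the two cell types, yet geneA has a four-fold higher TE in neurons. A low-abundance gene such as geneC drops out entirely, so the choice of cutoff can change which transcripts enter a binned comparison like the one in Fig. 2F.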

      (4) Throughout the manuscript the authors favor a "translation suppression" model wherein glia (for example) actively suppress neuronal mRNAs, and this is substantiated in Fig. 3C showing higher ribosome occupancy on 5' leaders than in coding regions. However, they show no evidence that glial mRNAs (such as those indicated in Fig. 2B and 2-S2B) present a different pattern, say that of higher ribosome occupancy in CDS vs. 5' leaders. This type of a positive control is a glaring omission from many of their analyses, including ribosome occupancy at upstream AUG codons (Fig. 4).

      Relatedly, to make a broad case (as they do in the title) that differential translation regulation specifies multiple cell types, it is necessary to show the corollary: that glial mRNAs (repo, bnb, pnt, etc) are suppressed in neurons. There is an inkling of this evidence in Fig. 3-S1 where fat body mRNAs in neurons are shown to have low ribosome occupancy in the CDS regions and enhanced occupancy in the 5' leader region. This data is not quantified, nor is a control neuron mRNA shown as a reference for what the ribosome occupancy profile of an actively translated mRNA looks like in a neuron.

      (5) The cell-type specific ribosome profiling data sets in the manuscript are from mRNAs associated with 80S ribosomes that have been treated with cycloheximide during sample preparation. Cycloheximide, and many other translation inhibitors, are known to non-uniformly bias reads towards start codons (PMID: 22056041, 22927429). This important caveat and its implications for the start-codon occupancy analysis in Fig. 4 are not acknowledged in the manuscript.

      Again, the ideal resolution would be a ribosome profiling data set from 40S footprinting or harringtonine-treated samples (PMIDs: 32589966, 27487212, 32589964) to show true accumulation of ribosomes at AUG codons. In the absence of such a data set, a comparative meta-analysis of the ribosome distribution around upstream and initiation AUG codons of differentially translated transcripts from neurons would be a useful control.

      (6) The authors chose Rhodopsin 1 (Rh1) as a model mRNA which is translated efficiently in neurons but suppressed in glia. Though the data in Fig. 2-S3B shows higher TE for Rh1 in neurons, the data in 5A show lower ribosome occupancy in the Rh1 CDS in neuron samples (at least in the fragment of the CDS visible). These data are somewhat contradictory.

      Further, given that the neuron data are from all nsyb-positive cells but that Rh1 is expressed only in R1-R6 photoreceptors, it is unclear what motivated them to choose Rh1 as opposed to an mRNA that is more broadly expressed in neurons.

      (7) Similar to the heterogeneity in nsyb- and repo-GAL4 expression in Fig. 2-S1A-B, Fig. 5C shows substantial variation in the expression of the UAS-GFP reporter driven by tub-GAL4. This variable GAL4 activity makes the mRNA abundance data difficult to interpret. Also, since the authors presume that Rh1 mRNA is expressed in glia (it is not annotated in the RNA-seq analysis in Fig. 2-S2B), would Rh1-GAL4 not be a more apt driver?

      These issues are further compounded by the lack of a cellular compartment marker (repo marks glial nuclei) which makes it impossible to determine which cell the mRNA signal is in. No negative controls are presented for the mRNA probes either.

      Most confoundingly though, the control reporter itself seems to show variable translation efficiencies from one cell to another, with high-GFP protein cells showing lower GFP mRNA and vice versa.

      The mRNA:protein ratio may be easier to examine by using repo-GAL4 to specifically drive the Rh1-reporter expression in glia (such as in Fig. 5-S1A) rather than simultaneous expression in both neurons and glia using tub-GAL4.

      Comments post revision: The authors have satisfactorily addressed most of my concerns with the study. I appreciate their patient clarification of many of my points, and the revision to text+figures appending more controls. My only minor gripe remains that while their data beautifully show that there is differential regulation of transcripts across neurons and glia, they do not provide evidence that such regulation is required for cell identity. However, I appreciate this is a large experimental ask worthy of another study in and of itself. Overall, I peg this an excellent study that adds substantially to the field of cell-type specific mRNA translation regulation.

    3. Reviewer #3 (Public Review):

      It is well established that there is extensive post-transcriptional gene regulation in nervous systems, including the fly brain. For example, dynamic regulation of hundreds of genes during photoreceptor development could only be observed at the level of translated mRNAs, but not the entire transcriptomes. The present study instead addresses the role of differential translational regulation between cell types (or rather classes: neurons and glia, as both are still highly heterogenous groups) in the adult fly brain. By performing bulk RNA-seq and Ribo-seq on the same lysates, the authors are able to compare translation efficiency (TE) of all transcripts between neurons and glia. Many genes display differential TE, but interestingly, they tend to be the genes that already show strong differences at their mRNA level. The most striking observation is the finding that neuronal transcripts in glia display increased ribosome stalling at their 5' UTR, and in particular at the start codons of short "upstream ORFs". This could suggest that glia specifically employ a mechanism to upregulate upstream ORF translation, enabling them to better suppress the expression of the genes that have them. And neuronal genes tend to have longer 5' UTRs, perhaps to facilitate this type of regulation.

      However, it is difficult to evaluate the functional significance of these differences because the authors provide only one follow-up experiment to their RNA-seq analysis. Venus expressed with the Rh1 UTR sequences may be displaying differential levels between glia and neurons, but I find this image (Fig. 5C) rather unconvincing to support that conclusion. There are no quantifications of colocalization, or even sample size information provided for this experiment. And if there is indeed a difference, it would still be difficult to argue this is because of the 5' stalling phenomenon the authors observe with Rh1, because they switched both the 5' and 3' UTRs.

      I also find it puzzling that the TE differences between the groups are mostly among the transcripts that are already strongly differentially expressed at the transcriptional level. The authors would like to frame this as a mechanism of 'contrast sharpening'; but it is unclear why that would be needed. Rh1, for instance, is not just differentially expressed between neurons and glia, but it is actually only expressed by a very specific neuronal type (photoreceptors). Thus it's not clear to me why the glia would need this 5' stalling mechanism to fully suppress Rh1 expression, while all the other neurons can apparently do so without it.

      Response to authors' revisions:

      The authors have addressed most of the technical points in their revised manuscript. However, it is still rather unclear whether this mechanism would have any significant impact on differential gene expression between cell types in vivo. Considering that it's mostly occurring on genes that are already strongly differentially transcribed, that doesn't appear very likely.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 3:

      Response to authors' revisions:

      This reviewer is not convinced that the authors have done enough to satisfactorily address either of the major issues described in the original public review, above.

      They're still not providing a quantification of Fig. 5D (originally 5C).

      Their response regarding the expression pattern of Rh1 is particularly concerning, as it represents a misinterpretation of previously published data.

      The gene encoding Rh1, ninaE, is expressed at such high levels in R1-6 PRs that any RNA-seq data (bulk or single-cell) generated from the optic lobes, no matter what cell-type, will display some ninaE transcripts that are present in the background, as they leak from R1-6 during dissociation steps. This phenomenon has been well described, for instance in Davis et al., 2020, eLife, and in fact led to the development of computational tools to abate such artifacts. In other words: no, rh1 is not expressed in glia, or any other neuron besides PRs for that matter. Therefore, I remain deeply suspicious about the functional relevance of the regulatory mechanisms described in this paper.

      We thank the reviewer for her or his critical comments.

      We quantified the cell-type differences in translation of the reporter with Tub-GAL4 and now show the results in Figure 5F. Consistent with other results, this analysis revealed that the glia-to-neuron ratio of the reporter protein expression is significantly lower when it contains the UTR sequences of rh1.  

      We removed the mRNA counts (former Figure 5A and Figure 5 - figure supplement 1A), as we agree that these may well be contaminated by the very high rh1 expression in R1-6. We also amended the graph showing the ribosome distribution on the rh1 mRNA (Figure 5B) to better compare the translational efficiency (footprints normalized with mRNA, in a similar manner to Figure 3C). Now it clearly highlights the cell-type differences of footprint distributions; ribosomes are much more enriched on the CDS (being translated) in neurons, while the fraction of ribosomes on the 5ʹ leader (being stalled) is much higher in glia. We summarized this differential ribosome distribution in a new graph (now Figure 5C).  

      We apologize for the misleading description of the reporter experiments. Despite the high level of mRNA expression in the R1-6, we chose the 5ʹ leader of rh1 for the translation reporter, as it contains clear uORFs and differential ribosome accumulation thereon (Figure 5B). This biased ribosome distribution and differential translation are the consistent features for many neuronal genes (Figure 3). We revised the text to clarify this point (Line 195-203).

      In summary, we provide more rigorous analysis and extensive revision, which we hope clarified the concern.

    1. Reviewer #3 (Public Review):

      Summary:

      The authors elucidated the role of USP8 in the endocytic pathway. Using C. elegans epithelial cells as a model, they observed that when USP8 function is lost, the cells have a decreased number and size in lysosomes. Since USP8 was already known to be a protein linked to ESCRT components, they looked into what role USP8 might play in connecting lysosomes and multivesicular bodies (MVB). They observed fewer ESCRT-associated vesicles but an increased number of "abnormal" enlarged vesicles when USP8 function was lost. Then they observed that the abnormally enlarged vesicles, marked by the PI3P biosensor YFP-2xFYVE, are bigger but in the same number in USP8 (-) compared to wild-type animals, suggesting homotypic fusion. They confirmed this result by knocking down USP8 in a human cell line, and they observed enlarged vesicles marked by YFP-2xFYVE as well. They finally propose that USP8 dissociates Rabx-5 from early endosomes facilitating endosome maturation.

      Strengths:

      The authors have created significant, multifaceted tools for investigating systems involved in endosome dynamics control in both worm and human cells, which will help many members of the cell biology community. The study discovered an intriguing relationship between USP8 and the Rab5 guanine nucleotide exchange factor Rabx5, expanding USP8's targets and modes of action. The results provide significant contributions to our knowledge of how endosomal maturation works.

      Weaknesses:

      The rationales could have been stated more clearly to help the readers.

    2. eLife assessment

      The manuscript presents a valuable model for the field of endosome maturation, providing perspective on the role of the deubiquitinating enzyme USP-50/USP8 in the process. The evidence presented in the paper is clear, incorporating well-designed experiments that suggest the dual actions of USP-50 and USP8 in the conversion of early endosomes into late endosomes. Overall, the work is convincing and centers on an intriguing subject.

    3. Reviewer #1 (Public Review):

      Summary:

      The manuscript focuses on the role of the deubiquitinating enzyme USP-50/USP8 in endosome maturation. The authors aimed to clarify how this enzyme drives the conversion of early endosomes into late endosomes. Overall, they did achieve their aims in shedding light on the precise mechanisms by which USP-50/USP8 regulates endosome maturation. The results support their conclusions that USP-50 acts by disassociating RABX-5 from early endosomes to deactivate RAB-5 and by recruiting SAND-1/Mon1 to activate RAB-7. This work is commendable and will have a significant impact on the field. The methods and data presented here will be useful to the community in advancing our understanding of endosome maturation and identifying potential therapeutic targets for diseases related to endosomal dysfunction. It is worth noting that further investigation is required to fully understand the complexities of endosome maturation. However, the findings presented in this manuscript provide a solid foundation for future studies.

      Strengths:

      The major strengths of this work lie in the well-designed experiments used to examine the effects of USP-50 loss. The authors employed confocal imaging to obtain a picture of the aftermath of USP-50 loss. Their findings indicated enlarged early endosomes and MVB-like structures in cells deficient in USP-50/USP8.

      Weaknesses:

      Specifically, there is a need for further investigation to accurately characterize the anomalous structures detected in the usp-50 mutant. Also, the correlation between the presence of these abnormal structures and ESCRT-0 is yet to be addressed, and the current working model needs to be revised to prevent any confusion between enlarged early endosomes and MVBs.

    4. Reviewer #2 (Public Review):

      Summary:

      In this study, the authors study how the deubiquitinase USP8 regulates endosome maturation in C. elegans and mammalian cells. The authors have isolated USP8 mutant alleles in C. elegans and used multiple in vivo reporter lines to demonstrate the impact of USP8 loss-of-function on endosome morphology and maturation. They show that in USP8 mutant cells, the early endosomes and MVB-like structures are enlarged while the late endosomes and lysosomal compartments are reduced. They elucidate that USP8 interacts with Rabx5, a guanine nucleotide exchange factor (GEF) for Rab5, and show that USP8 likely targets a specific lysine residue of Rabx5 to dissociate it from early endosomes. They also find that localization of USP8 to early endosomes is disrupted in Rabx5 mutant cells. They observe that in both Rabx5 and USP8 mutant cells, the Rab7 GEF SAND-1 puncta, which likely represent late endosomes, are diminished, although Rabex5 accumulates in USP8 mutant cells. The authors provide evidence that USP8 regulates endosomal maturation in a similar fashion in mammalian cells. Based on their observations they propose that USP8 dissociates Rabex5 from early endosomes and enhances the recruitment of SAND-1 to promote endosome maturation.

      Strengths:

      The major highlights of this study include the direct visualization of endosome dynamics in a living multi-cellular organism, C. elegans. The high-quality images provide clear in vivo evidence to support the main conclusions. The authors have generated valuable resources to study mechanisms involved in endosome dynamics regulation in both the worm and mammalian cells, which would benefit many members of the cell biology community. The work identifies a fascinating link between USP8 and the Rab5 guanine nucleotide exchange factor Rabx5, which expands the targets and modes of action of USP8. The findings make a solid contribution toward the understanding of how endosomal trafficking is controlled.

      Weaknesses:

      - The authors utilized multiple fluorescent protein reporters, including those generated by themselves, to label endosomal vesicles. Although these are routine and powerful tools for studying endosomal trafficking, these results cannot tell whether the endogenous proteins (Rab5, Rabex5, Rab7, etc.) are affected in the same fashion.

      - The authors clearly demonstrated a link between USP8 and Rabx5, and they showed that cells deficient in both factors displayed similar defects in late endosomes/lysosomes. But the authors didn't confirm whether and/or to which extent USP8 regulates endosome maturation through Rabx5. Additional genetic and molecular evidence might be required to better support their working model.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript focuses on the role of the deubiquitinating enzyme USP-50/USP8 in endosome maturation. The authors aimed to clarify how this enzyme drives the conversion of early endosomes into late endosomes. Overall, they did achieve their aims in shedding light on the precise mechanisms by which USP-50/USP8 regulates endosome maturation. The results support their conclusions that USP-50 acts by disassociating RABX-5 from early endosomes to deactivate RAB-5 and by recruiting SAND-1/Mon1 to activate RAB-7. This work is commendable and will have a significant impact on the field. The methods and data presented here will be useful to the community in advancing our understanding of endosome maturation and identifying potential therapeutic targets for diseases related to endosomal dysfunction. It is worth noting that further investigation is required to fully understand the complexities of endosome maturation. However, the findings presented in this manuscript provide a solid foundation for future studies.

      We thank this reviewer for the instructive suggestions and encouragement.

      Strengths:

      The major strengths of this work lie in the well-designed experiments used to examine the effects of USP-50 loss. The authors employed confocal imaging to obtain a picture of the aftermath of the USP-50 loss. Their findings indicated enlarged early endosomes and MVB-like structures in cells deficient in USP-50/USP8.

      We thank this reviewer for the instructive suggestions and encouragement.

      Weaknesses:

      Specifically, there is a need for further investigation to accurately characterize the anomalous structures detected in the usp-50 mutant. Also, the correlation between the presence of these abnormal structures and ESCRT-0 is yet to be addressed, and the current working model needs to be revised to prevent any confusion between enlarged early endosomes and MVBs.

      Excellent suggestions. The EM imaging indeed revealed an increase in enlarged cellular vesicles containing various contents in usp-50 mutants. However, the detailed molecular features of these vesicles remain unclear. Therefore, we plan to utilize ESCRT components for double staining with early or late endosome markers. This will enable us to accurately characterize the anomalous structures detected in the usp-50 mutants.

      Reviewer #2 (Public Review):

      Summary:

      In this study, the authors study how the deubiquitinase USP8 regulates endosome maturation in C. elegans and mammalian cells. The authors have isolated USP8 mutant alleles in C. elegans and used multiple in vivo reporter lines to demonstrate the impact of USP8 loss-of-function on endosome morphology and maturation. They show that in USP8 mutant cells, the early endosomes and MVB-like structures are enlarged while the late endosomes and lysosomal compartments are reduced. They elucidate that USP8 interacts with Rabx5, a guanine nucleotide exchange factor (GEF) for Rab5, and show that USP8 likely targets specific lysine residue of Rabx5 to dissociate it from early endosomes. They also find that the localization of USP8 to early endosomes is disrupted in Rabx5 mutant cells. They observe that in both Rabx5 and USP8 mutant cells, the Rab7 GEF SAND-1 puncta which likely represents late endosomes are diminished, although Rabex5 is accumulated in USP8 mutant cells. The authors provide evidence that USP8 regulates endosomal maturation in a similar fashion in mammalian cells. Based on their observations they propose that USP8 dissociates Rabex5 from early endosomes and enhances the recruitment of SAND-1 to promote endosome maturation.

      We thank this reviewer for the instructive suggestions and encouragement.

      Strengths:

      The major highlights of this study include the direct visualization of endosome dynamics in a living multi-cellular organism, C. elegans. The high-quality images provide clear in vivo evidence to support the main conclusions. The authors have generated valuable resources to study mechanisms involved in endosome dynamics regulation in both the worm and mammalian cells, which would benefit many members of the cell biology community. The work identifies a fascinating link between USP8 and the Rab5 guanine nucleotide exchange factor Rabx5, which expands the targets and modes of action of USP8. The findings make a solid contribution toward the understanding of how endosomal trafficking is controlled.

      We thank this reviewer for the instructive suggestions and encouragement.

      Weaknesses:

      - The authors utilized multiple fluorescent protein reporters, including those generated by themselves, to label endosomal vesicles. Although these are routine and powerful tools for studying endosomal trafficking, these results cannot tell whether the endogenous proteins (Rab5, Rabex5, Rab7, etc.) are affected in the same fashion.

      Good suggestion. Indeed, to test whether the endogenous proteins (Rab5, Rabex5, Rab7, etc.) are affected in the same fashion as fluorescent protein reporters, we supplemented our approach with the utilization of endogenous markers. These markers, including Rab5, RAB-5, Rabex5, RABX-5, and EEA1 for early endosomes, as well as RAB-7, Mon1a, and Mon1b for late endosomes, were instrumental in our investigations (refer to Figure 3, Figure 6, Sup Figure 4, Sup Figure 5, and Sup Figure 7). Our comprehensive analysis, employing various methodologies such as tissue-specific fused proteins, CRISPR/Cas9 knock-in, and antibody staining, consistently highlights the critical role of USP8 in early-to-late endosome conversion.

      - The authors clearly demonstrated a link between USP8 and Rabx5, and they showed that cells deficient in both factors displayed similar defects in late endosomes/lysosomes. However, the authors didn't confirm whether and/or to which extent USP8 regulates endosome maturation through Rabx5. Additional genetic and molecular evidence might be required to better support their working model.

      Excellent point. We plan to conduct additional genetic analyses, including the construction of double mutants between usp-50 and various rabex-5 mutations, to further elucidate the extent to which USP8 regulates endosome maturation via Rabex5.

      Reviewer #3 (Public Review):

      Summary:

      The authors were trying to elucidate the role of USP8 in the endocytic pathway. Using C. elegans epithelial cells as a model, they observed that when USP8 function is lost, the cells have lysosomes that are decreased in number and size. Since USP8 was already known to be a protein linked to ESCRT components, they looked into what role USP8 might play in connecting lysosomes and multivesicular bodies (MVB). They observed fewer ESCRT-associated vesicles but an increased number of "abnormal" enlarged vesicles when USP8 function was lost. At this specific point, it's not clear what the objective of the authors was. What would have been their hypothesis addressing whether the reduced lysosomal structures in USP8 (-) animals were linked to MVB formation? Then they observed that the abnormally enlarged vesicles, marked by the PI3P biosensor YFP-2xFYVE, are bigger but in the same number in USP8 (-) compared to wild-type animals, suggesting homotypic fusion. They confirmed this result by knocking down USP8 in a human cell line, and they observed enlarged vesicles marked by YFP-2xFYVE as well. At this point, there is quite an important issue. The use of YFP-2xFYVE to detect early endosomes requires the transfection of the cells, which has already been demonstrated to produce differences in the distribution, number, and size of PI3P-positive vesicles (doi.org/10.1080/15548627.2017.1341465). The enlarged vesicles marked by YFP-2xFYVE would not necessarily be due to the loss of USP8. In any case, it appears relatively clear that USP8 localizes to early endosomes, and the authors claim that this localization is mediated by Rabex-5 (or Rabx-5). They finally propose that USP8 dissociates Rabx-5 from early endosomes facilitating endosome maturation.

      Weaknesses:

      The weaknesses of this study are, on one side, that the results are almost exclusively dependent on the overexpression of fusion proteins. While useful in the field, this strategy does not represent the optimal way to dissect a cell biology issue. On the other side, the way the authors construct the rationale for each approximation is somewhat difficult to follow. Finally, the use of two models, C. elegans and a mammalian cell line, which would strengthen the observations, contributes to the difficulty in reading the manuscript.

      The findings are useful but do not clearly support the idea that USP8 mediates Rab5-Rab7 exchange and endosome maturation. In contrast, they appear to be incomplete and open new questions regarding the complexity of this process and the precise role of USP8 within it.

      We thank this reviewer for the insightful comments. Fluorescence-fused proteins serve as potent tools for visualizing subcellular organelles both in vivo and in live settings. Specifically, in epidermal cells of worms, the tissue-specific expression of these fused proteins is indispensable for studying organelle dynamics within living organisms. This approach is necessitated by the inherent limitations of endogenously tagged proteins, whose fluorescence signals are often weak and unsuitable for live imaging or genetic screening purposes. Acknowledging concerns raised by the reviewer regarding potential alterations in organelle morphology due to overexpression of certain fused proteins, we supplemented our approach with the utilization of endogenous markers. These markers, including Rab5, RAB-5, Rabex5, RABX-5, and EEA1 for early endosomes, as well as RAB-7, Mon1a, and Mon1b for late endosomes, were instrumental in our investigations (refer to Figure 3, Figure 6, Sup Figure 4, Sup Figure 5, and Sup Figure 7). Our comprehensive analysis, employing various methodologies such as tissue-specific fused proteins, CRISPR/Cas9 knock-in, and antibody staining, consistently highlights the critical role of USP8 in early-to-late endosome conversion. Specifically, we discovered that the recruitment of USP-50/USP8 to early endosomes depends on Rabex5. However, instead of stabilizing Rabex5, the recruitment of USP-50/USP8 leads to its dissociation from endosomes, concomitantly facilitating the recruitment of the Rab7 GEF SAND-1/Mon1. In cells with loss-of-function mutations in usp-50/usp8, we observed enhanced RABX-5/Rabex5 signaling and mis-localization of SAND-1/Mon1 proteins from endosomes. Consequently, this disruption impairs endolysosomal trafficking, resulting in the accumulation of enlarged vesicles containing various intraluminal contents and rudimentary lysosomal structures.

      Through an unbiased genetic screen, verified by cultured mammalian cell studies, we observed that loss-of-function mutations in usp-50/usp8 result in diminished lysosome/late endosomes. To elucidate the underlying mechanisms, we investigated the formation of multivesicular bodies (MVBs), a process tightly linked to USP8 function. Extensive electron microscopy (EM) analysis indicated that MVB-like structures are largely intact in usp-50 mutant cells, suggesting that USP8/USP-50 likely regulate lysosome formation through alternative pathways in addition to their roles in MVB formation and ESCRT component function. USP8 is known to regulate the endocytic trafficking and stability of numerous transmembrane proteins. Interestingly, loss-of-function mutations in usp8 often lead to the enlargement of early endosomes, yet the mechanisms underlying this phenomenon remain unclear. Given that lysosomes receive and degrade materials generated by endocytic pathways, we hypothesized that the abnormally enlarged MVB-like vesicular structures observed in usp-50 or usp8 mutant cells correspond to the enlarged vesicles coated by early endosome markers. Indeed, in the absence of usp8/usp-50, the endosomal Rab5 signal is enhanced, while early endosomes are significantly enlarged. Given that Rab5 guanine nucleotide exchange factor (GEF), Rabex5, is essential for Rab5 activation, we further investigated its dynamics. Additional analyses conducted in both worm hypodermal cells and cultured mammalian cells revealed an increase of endosomal Rabex5 in response to usp8/usp-50 loss-of-function. Live imaging studies further demonstrated active recruitment of USP8 to newly formed Rab5-positive vesicles, aligning spatiotemporally with Rabex5 regulation. Through systematic exploration of putative USP-50 binding partners on early endosomes, we identified its interaction with Rabex5. 
Comprehensive genetics and biochemistry experiments demonstrated that USP8 acts through K323 site de-ubiquitination to dissociate Rabex5 from early endosomes and promotes the recruitment of the Rab7 GEF SAND-1/Mon1. In summary, our study began with an unbiased genetic screen and subsequent examination of established theories, leading to the formulation of our own hypothesis. Through multifaceted approaches, we unveiled a novel function of USP8 in early-to-late endosome conversion.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      This study makes an interesting finding: a polyunsaturated fatty acid, Lin-Glycine, increases the conductance of KCNQ1/KCNE1 channels by stabilizing a state of the selectivity filter that allows K+ conduction. The stabilization of a conducting state appears well supported by single-channel analysis, though some method details are missing. The linkage to PUFA action through the selectivity filter is supported by the disruption of PUFA effects by mutation of residues which change conformation in two KCNQ1 structures from the literature. Claims about differences in Lin-Glycine binding to these two structural conformations seem to lack clear support, thus the claim seems speculative that PUFAs increase Gmax by binding to a crevice in the pore domain. A potentially definitive functional experiment is conducted by single-channel recordings with selectivity filter domain mutation Y315F which ablates the Lin-Glycine effect on Gmax. However, this appears to be an n=1 experiment. Overall, the major claim of the abstract is supported: "... that the selectivity filter in KCNQ1 is normally unstable ... and that the PUFA-induced increase in Gmax is caused by a stabilization of the selectivity filter in an open-conductive state." However, the claim in the abstract that selectivity filter instability "explains the low open probability" seems too general.

      We thank the reviewer for the comments, and we would like to address the main concern regarding the single-channel recordings. We now state the number of experiments used for the single-channel analysis. We agree that the claim in the abstract seems too general, and we have now made it more specific to our findings.

      Reviewer #2 (Public Review):

      Golluscio et al. address one of the mechanisms of IKs (KCNQ1/KCNE1) channel upregulation by polyunsaturated fatty acids (PUFA). PUFA is known to upregulate KCNQ1 and KCNQ1/KCNE1 channels by two mechanisms: one shifts the voltage dependence to the negative direction, and the other increases the maximum conductance (Gmax). While the first mechanism is known to affect the voltage sensor equilibrium by charge effect, the second mechanism is less known. By applying the single-channel recordings and mutagenesis on the putative binding sites (most of them related to the selectivity filter), they concluded that the selectivity filter is stabilized to a conductive state by PUFA binding.

      Strengths:

      They mainly used single-channel recordings and directly assessed the behavior of the selectivity filter. The method is straightforward and convincing enough to support their claims.

      Weaknesses:

      The structural model they used is the KCNQ1 channel without KCNE1 because KCNQ1/KCNE1 channel complex is not available yet. As the binding site of PUFAs might overlap with KCNE1, it is not very clear how PUFA binds to the KCNQ1 channel in the presence of KCNE1.

      Using other previous PUFA-related KCNQ1 mutants will strengthen their conclusions. For example, the Gmax of the K326E mutant is reduced by PUFA binding. Examining whether K326E shows reduced numbers of non-empty sweeps in the single-channel recordings will be a good addition.

      We thank the reviewer for the public review. We would like to address the main weak points of the comments. As a structure of KCNQ1/KCNE1 in complex is not available yet, we used KCNQ1 alone. We believe that the PUFA and KCNE1 binding sites will not overlap, as we previously presented data in agreement with the idea that KCNE1 rotates the VSD relative to the PD (Wu et al., 2021). This would leave enough space for both PUFA and KCNE1, so that PUFA can bind to the crevice (K326 and D301) without competing with KCNE1. We appreciate the suggestion of adding single-channel recordings of the K326E mutant, and we agree it would make a valuable addition to strengthen our conclusions. However, single-channel recordings for KCNQ1 are very challenging and time-consuming to obtain, so we would like to keep this in consideration for future studies.

      Reviewer #3 (Public Review):

      This manuscript reveals an important mechanism of KCNQ1/IKs channel gating such that the open state of the pore is unstable and undergoes intermittent closed and open conformations. PUFA enhances the maximum open probability of IKs by binding to a crevice adjacent to the pore and stabilizing the open conformation. This mechanism is supported by convincing single-channel recordings that show empty and open channel traces and the ratio of such traces is affected by PUFA. In addition, mutations of the pore residues alter PUFA effects, convincingly supporting that PUFA alters the interactions among these pore residues.

      Strengths:

      The data are of high quality and the description is clear.

      Weaknesses:

      Some comments about the presentation.

      (1) The structural illustrations in this manuscript in general need to be clarified.

      (2) The manuscript heavily relies on the comparison between the S4-down and S4-up structures (Figures 3, 4, and 7) to illustrate the difference between the extracellular side of the pore and to lead to the hypothesis of open-state stability being affected by PUFA. This may mislead the readers to think that the closed conformation of the channel in the up-state is the same as that in the down-state.

      We thank the reviewer for the public review, and we would like to address the comments about the presentation. We agree that the structural illustrations need to be more detailed, and we amended our previous illustrations. We have now included a new Figure 3 with a more detailed legend and a new Figure 4 that includes more information, such as the main chain of the whole selectivity filter and surrounding peptide.

      We have now added some clarification regarding the structures of KCNQ1 with S4-down and S4-up to clarify that the closed conformation of the channel in the up-state is different from that in the down-state. We also emphasize this difference in the Discussion.

      Recommendations for the authors:

      Reviewer #1:

      (1) Explain more thoroughly how the single-channel recordings were done:

      - How was Lin-Glycine applied in these experiments? The patch configuration is unclear. Was Lin-Glycine added to the patch pipette? If not, why is Lin-Glycine expected to reach the proposed binding site in the outer leaflet? Were controls time-matched applications of vehicles with ethanol?

      Data were collected using the cell-attached patch configuration to minimize disruption to the patch and avoid rundown problems due to the loss of PIP2. Lin-Glycine was solubilized in DMSO and the desired concentration was added directly to the bath. We had no a priori reason to know if the PUFA would reach the proposed binding site, but the consistency with which there was an increase in channel activity 5-10 minutes after addition to the bath convinced us that it was indeed reaching the binding site. This time frame fits with our prior experience with mefenamic acid effects on single channels (Wang et al 2020). The mefenamic acid binding site is external to the membrane, so the drug must enter the cell and cross the patch membrane to affect channel activity. In addition, shown below is a previous recording from our lab, where nothing was added to the bath over a 55-minute period while recording consecutive files. This shows the typical behavior of IKs, with activity tending to cluster with a few active sweeps in between many blank sweeps. The behavior in this patch contrasts with that seen in the presence of Lin-Glycine, where the clusters of activity spread over an increasing number of sweeps.

      In addition, we have previously shown that 0.1% DMSO (the concentration used in the present study) does not affect the GV of KCNQ1, but there is a non-significant decrease in tail current amplitudes of about 14% (Eldstrom et al., 2021). As such, we do not think that the increase in activity we see with Lin-Glycine can be explained by vehicle effects alone.

      Author response image 1.

       

      We have added more details to the Materials and Methods section.

      - How well the replicates match the representative data in Figures 1, S1, and 6 is unclear (except for average current and Po in the last second of the traces from Figure 1). Are the results in Fig 6 n=1? 

      We now show in a data supplement that 3 replicates were used to assess the change in channel activity upon addition of Lin-Glycine.

      - Diary plots (as in Werry et al. 2013) and additional descriptions of the timeline of Lin-Glycine application and analyses could add credibility to interpretations. 

      We added a diary plot of the first latency to open to Supplementary Figure S1.

      - Amounts of plasmids and lipofectamine that were used in transfections are missing. 

      We added the information to the Materials and Methods section as follows:

      “Single channel currents were recorded from transiently transfected mouse ltk- fibroblast cells (LM cells) using 1.5 mL Lipofectamine 2000 (Thermo Fisher Scientific). Cells were transfected with 1.5 mg of pcDNA3 containing a linked KCNE1-KCNQ1 construct 20, to ensure fully KCNE1-saturated complexes, in addition to a plasmid containing green fluorescent protein (GFP) to identify transfected cells”

      - Inclusion/exclusion criteria for patches analyzed are missing. 

      We added the information to the Materials and Methods section as follows:

      “Only patches that were largely free of endogenous currents and had few channels, such that there were several blank sweeps to average for use for leak subtraction, were analyzed.”

      - Whether blinding, randomization, or pre-determined n values were employed is not mentioned. 

      No blinding, randomization or pre-determined n values were employed.

      - Analysis methods are sometimes unclear: How was Po calculated? Representative sweeps appear to have been leak and capacitance subtracted. How was that done? 

      Po was estimated from the all-point amplitude histogram as follows: $P_o = \sum_i \frac{i\,N_i}{i_{\text{estimate}}\,N_{\text{total}}}$, where $N_i$ is the number of points for a specific current $i$ in the histogram, $i_{\text{estimate}}$ = 0.4 pA from the peak of the histogram, and $N_{\text{total}}$ = 10,000 is the total number of points in the last second of the trace. Po = 0.75 ± 0.12 (n = 8) and Po = 0.87 ± 0.04 (n = 3) for Control and Lin-Glycine, respectively.

      Leak and capacitance were subtracted with averaged empty sweeps.
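
      As an illustration only, the two analysis steps described above (leak subtraction with averaged empty sweeps, then a Po estimate from the all-point histogram) could be sketched as follows. This is a minimal sketch, not the analysis code used for the manuscript: the function names, the NumPy-based approach, the number of histogram bins, and the synthetic numbers in the usage note are all assumptions.

```python
import numpy as np

def subtract_leak(sweeps, empty_mask):
    """Subtract the average of the empty (blank) sweeps from every sweep.

    sweeps: 2D array-like, shape (n_sweeps, n_samples), current in pA.
    empty_mask: boolean sequence marking sweeps with no channel openings.
    Both names are hypothetical; which sweeps count as empty is decided
    by the experimenter, not by this function.
    """
    sweeps = np.asarray(sweeps, dtype=float)
    empty_mask = np.asarray(empty_mask, dtype=bool)
    leak_template = sweeps[empty_mask].mean(axis=0)  # averaged empty sweeps
    return sweeps - leak_template

def estimate_po(samples_pa, i_estimate_pa=0.4, n_bins=200):
    """Estimate Po as sum_i(i * N_i) / (i_estimate * N_total).

    samples_pa: leak-subtracted current samples (pA) from the analysis
    window, e.g. the last second of a trace.
    i_estimate_pa: single-channel current estimate (0.4 pA in the text).
    n_bins: histogram resolution, an arbitrary choice for this sketch.
    """
    samples = np.asarray(samples_pa, dtype=float)
    counts, edges = np.histogram(samples, bins=n_bins)  # all-point histogram
    centers = 0.5 * (edges[:-1] + edges[1:])            # current i of each bin
    return float(np.sum(centers * counts) / (i_estimate_pa * samples.size))
```

      For example, a synthetic window of 10,000 points in which 7,500 points sit at 0.4 pA and 2,500 at 0 pA gives an estimate close to 0.75, with a small offset that comes from the histogram binning.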

      (2) The change of cells used for whole cell vs single channel (oocytes vs mouse ltk- fibroblast cells) could be discussed. These cells likely have different lipids in their membranes. Is there any other evidence that PUFAs have the same effects on KCNE1-KCNQ1 in these cells? Does the V0.5 shift? 

      A similar effect on Gmax, in both oocytes and mouse ltk- fibroblast cells, is shown in Figures 1 and 2. In Figure 2, the shift in latency suggests a shift in V0.5, which is consistent with the binding of PUFA to Site I.

      (3) The manuscript associates selectivity filter changes with S4 being up or down. It would help to clarify whether there was a change in [K+] in the two KCNQ1 structures used for modeling, as Mandala and MacKinnon (2023) state: "We note that one interesting difference between the two up structures regards the occupancy of K+ ions in the selectivity filter (SI Appendix, Fig. S5 C and D). In the polarized sample, due to the low extravesicular concentration of K+, density is only visible at the first and third positions in the selectivity filter, while density is present at all four positions in the unpolarized sample. Similar differences were observed in our previous study on Eag (20) and are qualitatively consistent with crystal structures of KcsA solved under symmetrical high and low K+ concentrations (45)." 

      Our study states that there are some differences between the two structures with S4 in the up-state and S4 in the down-state, including a reorganization of the pore. As for the change in K+ occupancy in the two structures, we are not sure, as our knowledge comes only from what is stated in Mandala and MacKinnon (2023). Mandala and MacKinnon did not discuss the selectivity filter in the down-state structure in their paper, and there are no K+ ions in any of their PDB files. So, we do not know how many K+ ions there are in the down state.

      (4) The manuscript states " PUFAs increase Gmax by binding to a crevice in the pore domain" and "we elucidated that Lin-Glycine binds to a crevice between K326 and D301", this seems speculative without any actual binding studies or concrete structural evidence. A quantitative structural modeling analysis of whether changes in the crevice change the theoretical binding of Lin-Glycine might provide a stronger basis for speculation. 

      We toned down these statements in Results and Discussion to:

      “Crevice residues affect PUFA ability to increase Gmax"

      And

      Discussion: “We tested the hypothesis that the effect of Lin-Glycine involved conformational changes in the selectivity filter following PUFA binding to two residues K326 and D301 at the pore domain. Those residues delimit a small crevice that seems to change in size in different structures with S4 up or S4 down (Figure 3, D-F).”

      (5) The several figures detailing differences in selectivity filter conformation in the KCNQ1 structures are interesting and relevant in that they identify the movement of residues such as Y315 that, when mutated, ablate Lin-Glycine effect on Gmax. It would help to clarify whether T312 and I313 also move between the two selectivity filter conformations. 

      From the morph of the selectivity filter between the two conformations, it is noticeable that the changes and residue movements involve only residues at the upper part of the selectivity filter (including Y315 and D317). T312 and I313 are in the lower part of the selectivity filter and do not seem to move or rotate from their position between the two conformations of the selectivity filter.

      We now include Supplementary Figures S3 and S4, which show the extent of movement of each residue in the pore region, and a short description of this in the Results section.

      (6) The claim in the abstract that selectivity filter instability "explains the low open probability" seems too general. Lin-Glycine seems to increase the likelihood of conduction by 2.5-fold, but it was not clear whether open probability ceases to be low or whether other mechanisms also keep Po low. 

      We reworded this sentence to: “Our results suggest that the selectivity filter in KCNQ1 is normally unstable, contributing to the low open probability, and that the PUFA-induced increase in Gmax is caused by a stabilization of the selectivity filter in an open-conductive state.”

      Reviewer #2:

      (1) While all the electrophysiological recordings used KCNQ1/KCNE1 channels, all the structural models they used are KCNQ1 channels (without KCNE1). I know it is because the KCNQ1/KCNE1 complex structure is unavailable. However, according to their previous results, KCNQ1 alone is also upregulated by PUFAs. I am curious about what the single-channel recordings of KCNQ1 alone look like in the presence and absence of PUFAs. 

      We would love to include single-channel recordings of KCNQ1, but they are extremely hard to measure due to the small size and flickering nature of the channel.

      (2) As mentioned above, we do not yet have the KCNQ1/KCNE1 structure, but we do have the KCNQ1/KCNE3 structures (Sun and MacKinnon, Cell, 2020). According to the PDBs (6V00 or 6V01), the crevice (K326 and D301) looks covered by KCNE3. Is it true that PUFAs do not upregulate KCNQ1/KCNE3? If true, KCNE1 may not cover the crevice, so the binding mode should differ from the KCNQ1/KCNE3 structures. Please discuss the possible blocking of the crevice by KCNE proteins. 

      We previously presented data consistent with KCNE1 rotating the VSD towards the PD (Wu et al., 2021). This mechanism would leave room for both PUFA and KCNE1, so that PUFA can bind to the crevice (K326 and D301). We therefore think that this rotation will prevent PUFA and KCNE1 from competing for the same space. As for KCNQ1/KCNE3, we currently do not have any evidence about a possible upregulation by PUFA.

      (3) In the cryoEM structure with S4 resting (Figure 3F), the clevis looks too narrow for PUFA to bind. Is there any (either previous or current) evidence supporting that PUFA binding is state-dependent? 

      Because PUFAs integrate first into the bilayer and then diffuse towards their binding site on the channel, it would be hard to test a state-dependence of the binding. In addition, once PUFAs are in the bilayer, the rate of binding/unbinding is quite fast (within the ns range according to our previous MD simulations), whereas the opening/closing rate is very slow (100 ms-s). So, the combination of slow wash-in/washout, fast binding/unbinding, and slow opening/closing would make it very difficult to test the state-dependence of the binding by using a fast perfusion or different voltage protocols.

      (4) In the previous report (Liin et al. Cell Reports, 2018), K326 is the most critical site for PUFA binding. Why are the K326 mutants not included in the current study? I also would like to see the single-channel recordings of the K326E mutant, which showed a smaller Gmax. Does the PUFA application reduce the probability of non-empty traces in this mutant? 

      As Liin et al. reported, mutations of K326 reduce the ability of PUFA to increase the Gmax. In this work, we wanted to gain further biophysical information on the mechanism that leads to an increase in Gmax, considering the knowledge we had from work conducted in our lab previously. We therefore focused here on residues downstream of K326 that we think are important for inducing the conformational changes at the selectivity filter. We agree that single channel experiments on K326E would be very interesting but that has to be for a future study.

      Minor points 

      (1) Liin et al. used S209F (Po of 0.4) and I204F (Po of 0.04) mutants. Their single-channel recordings would be a good addition. 

      We thank the reviewer for the suggestion. However, single-channel analyses of S209F and I204F were previously shown (Eldstrom et al., 2010).

      (2) I would like to see how the Site I mutations (R2Q/Q3R) affect (or do not affect) the single-channel recordings (open probability and latency). 

      Thank you for the excellent suggestion. It would be interesting to assess the behavior of the channel when mutations occur at Site I. However, we think this information would not add further detail to this study, as we focus here on the mechanism for the Gmax increase. Single-channel recordings are extremely hard to obtain; therefore, we chose to include only mutations at Site II for this study.

      (3) I would like the G-V curves for all the mutations at 0 and 20 uM of Lin-Glycine (Figure 3C and Figures 5A and B). 

      We have now added the G-V curves to Supplementary Figure S7.

      (4) I assume all the PUFAs have a similar effect on the selectivity filter, but a few other examples of PUFAs would be nice to see. 

      We anticipate that PUFAs and analogues with similar properties to Lin-Glycine would increase Gmax by a similar mechanism, because other PUFAs have been previously shown to increase Gmax (Bohannon et al., 2020).

      (5) Although the probabilities of non-empty sweeps are written in the manuscript, bar graph presentations would be a nice addition to Figures 2 and 6. 

      We have added bar graphs of non-empty sweeps to Figures 2 and 6.

      (6) Is there no statistical significance for D317E and T309S in Figure 5A? 

      No, there is no statistical significance for D317E and T309S.

      (7) There is no reference to Figure 7 in the manuscript. 

      A reference to Figure 7 has been added to the manuscript in the following paragraph.

      “Taken together, our results suggest that the binding of PUFA to Site II increases Gmax by promoting a series of interactions that stabilize the channel pore in the conductive state. For instance, we speculate that in the conductive state, hydrogen bonds between W304-D317 and W305-Y315, which are likely absent in the non-conductive conformation of KCNQ1, are created and that PUFA binding to Site II favors the transition towards the conductive state of the channel (Figure 7)”

      Reviewer #3:

      (1) Clarify the structural figures. Figures 3 D, E, and F - explain what the colors indicate. 

      A more detailed description of Figure 3 has been added to the legend.

      “D, E and F) Structure of crevice between S5 and S6 in KCNQ1 with S4 up (D and E) and S4 down (F). Residues that surround the crevice from S6 shown in blue (K326, T327, S330, V334) and from S5 in red (D301, A300, L303, F270). Remaining KCNQ1 residues shown in purple…, linoleic acid (LIN: gold color)”

      Fig 4. Only side chains of the residues are shown, making it hard to relate the figure to the familiar K channel selectivity filter. The main chain of the entire selectivity should be shown to orient readers to the familiar view of the K channel selectivity filter. In addition, the structures shown are only part of the selectivity filter, it should be specified which part of the selectivity filter is shown. These will also help the discussion at the bottom of page 10 and subsequent text. 

      We now provide a new Figure 4 with more details such as the main chain of the whole selectivity filter and surrounding peptide.

      (2) Cautions should be stated clearly when the structural comparison between the S4-up and S4-down is made that the structure of the pore when it is closed with S4-up may differ from the structure of the pore with S4-down. 

      We now state in addition “Clearly, there will be other differences in the pore domain between structures with activated and resting VSDs, for example the state of the activation gate.”

    2. eLife assessment

      This study reveals an important mechanism: a polyunsaturated fatty acid increases a K+ channel conductance by helping its K+ selectivity filter form a conductive state. Overall, this mechanism is supported by convincing single-channel recordings, macroscopic current recordings, and mutational analyses, though further clarification of some results seems warranted. These findings are expected to be of interest to researchers studying ion channel gating.

    3. Reviewer #1 (Public Review):

      This study makes an interesting finding: a polyunsaturated fatty acid, Lin-Glycine, increases the conductance of KCNQ1/KCNE1 channels by stabilizing a state of the selectivity filter that allows K+ conduction. The stabilization of a conducting state appears well supported by single-channel analysis, though some technical details are missing and some presentations are confusing. The linkage to PUFA action through the selectivity filter is supported by disruption of PUFA effects by mutation of residues which change conformation in two KCNQ1 structures from the literature. A definitive functional experiment is conducted by single-channel recordings with the selectivity filter domain mutation Y315F, which ablates the Lin-Glycine effect on Gmax. The computational exploration of two selectivity filter structures proposed to interact distinctly with Lin-Glycine is informative; however, the relation of the closed selectivity filter structure to the K+ concentration at which it was obtained, and to inactivation in other channels, is ignored. Overall, the major claim of the abstract is well-supported: "... that the selectivity filter in KCNQ1 is normally unstable ... and that the PUFA-induced increase in Gmax is caused by a stabilization of the selectivity filter in an open-conductive state."

    4. Reviewer #2 (Public Review):

      Summary:

      Golluscio et al. address one of the mechanisms of IKs (KCNQ1/KCNE1) channel upregulation by a polyunsaturated fatty acid (PUFA). PUFAs are known to upregulate KCNQ1 and KCNQ1/KCNE1 channels by two mechanisms: one shifts the voltage dependence to the negative direction, and the other increases the maximum conductance (Gmax). While the first mechanism is known to affect the voltage sensor equilibrium by charge effect, the second mechanism is less known. By applying single-channel recordings and mutagenesis on the putative binding sites (most of them related to the selectivity filter), they concluded that the selectivity filter is stabilized to a conductive state by PUFA binding.

      Strengths:

      The manuscript employed single-channel recordings and directly assessed the behavior of the selectivity filter. The method is straightforward and convincing enough to support the claims.

      Weaknesses:

      Although the analysis using selectivity filter mutants supports the hypothesis that PUFA binding stabilizes the conducting state of the filter, it may be somewhat speculative how PUFAs bind to the KCNQ1 channel in the presence of KCNE1.

    5. Reviewer #3 (Public Review):

      Summary:

      This manuscript reveals an important mechanism of KCNQ1/IKs channel gating such that the open state of the pore is unstable and undergoes intermittent closed and open conformations. PUFA enhances the maximum open probability of IKs by binding to a crevice adjacent to the pore and stabilizing the open conformation. This mechanism is supported by convincing single channel recordings that show empty and open channel traces, and the ratio of such traces is affected by PUFA. In addition, mutations of the pore residues alter PUFA effects, convincingly supporting that PUFA alters the interactions among these pore residues.

      Strengths:

      The data are of high quality and the description is clear.

    1. Reviewer #1 (Public Review):

      Summary:

      The current manuscript provides strong evidence that the molecular function of SLC35G1, an orphan human SLC transporter, is citrate export at the basolateral membrane of intestinal epithelial cells. Multiple lines of evidence, including radioactive transport experiments, immunohistochemical staining, gene expression analysis, and siRNA knockdown are combined to deduce a model of the physiological role of this transporter.

      Strengths:

      The experimental approaches are comprehensive, and together establish a strong model for the role of SLC35G1 in citrate uptake. The observation that chloride inhibits uptake suggests an interesting mechanism that exploits the difference in chloride concentration across the basolateral membrane.

      Weaknesses:

      Some aspects of the results would benefit from a more thorough discussion of the conclusions and/or model.

      For example, the authors find that SLC35G1 prefers the dianionic (singly protonated) form of citrate, and rationalize this finding by comparison with the substrate selectivity of the citrate importer NaDC1. However, this comparison has weaknesses when considering the physiological pH for SLC35G1 and NaDC1. NaDC1 binds citrate at a pH of ~5.4 (the pKa of citrate is 5.4, so there is a lot of dianionic citrate present under physiological circumstances). SLC35G1 binds citrate under pH conditions of ~7.5, where a very small amount of dianionic citrate is present. The data clearly show a pH dependence of transport, and the authors rule out proton coupling, but the discrepancy between the pH dependence and the physiological expectations should be addressed/commented on.
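
      The size of this discrepancy can be estimated from the acid-base equilibria of citrate. The following is a minimal sketch and is not part of the manuscript: it assumes commonly tabulated pKa values for citric acid in dilute aqueous solution at 25 °C (3.13, 4.76, 6.40), which may differ from effective values under physiological conditions, and the function name `citrate_fractions` is hypothetical.

```python
def citrate_fractions(ph, pkas=(3.13, 4.76, 6.40)):
    """Return the mole fractions [H3Cit, H2Cit-, HCit2-, Cit3-] at a given pH.

    The default pKa values are assumed textbook values (25 degrees C, dilute
    solution); they are illustrative and not taken from the manuscript.
    """
    h = 10.0 ** (-ph)                       # proton concentration
    k1, k2, k3 = (10.0 ** (-p) for p in pkas)
    # Unnormalised weight of each protonation state of a triprotic acid.
    terms = [h ** 3, h ** 2 * k1, h * k1 * k2, k1 * k2 * k3]
    total = sum(terms)
    return [t / total for t in terms]


if __name__ == "__main__":
    for ph in (5.4, 7.5):
        fractions = citrate_fractions(ph)
        # Index 2 is the dianionic (singly protonated) species, HCit2-.
        print(f"pH {ph}: dianion fraction = {fractions[2]:.3f}")
```

      With these assumed constants, the dianion is the dominant species near pH 5.4 and a minor one near pH 7.5, which is the qualitative point of the comment above; the exact fractions depend on the pKa values chosen.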

      The rationale for the series of compounds tested in Figure 1F, which includes metabolites with carboxylate groups, a selection of drugs including anion channel inhibitors and statins, and bile acids, is not described. Moreover, the lessons drawn from this experiment are vague and should be expanded upon. It is not clear what, if anything, the compounds that reduce citrate uptake have in common.

      The transporter is described as a facilitative transporter, but this is not established definitively. For example, another possibility could involve coupling citrate transport to another substrate, possibly even chloride ion.

    2. Reviewer #2 (Public Review):

      Summary:

      The primary goal of this study was to identify the transport pathway that is responsible for the release of dietary citrate from enterocytes into blood across the basolateral membrane.

      Strengths:

      The transport pathway responsible for the entry of dietary citrate into enterocytes was already known, but the transporter responsible for the second step remained unidentified. The studies presented in this manuscript identify SLC35G1 as the most likely transporter that mediates the release of absorbed citrate from intestinal cells into the serosal side. This fills an important gap in our current knowledge of the transcellular absorption of dietary citrate. The exclusive localization of the transporter in the basolateral membrane of human intestinal cells and the human intestinal cell line Caco-2 and the inhibition of the transporter function by chloride support this conclusion.

      Weaknesses:

      (i) The substrate specificity experiments have been done with relatively low concentrations of potential competing substrates, considering the relatively low affinity of the transporter for citrate. Given that NaDC1 brings in not only citrate as a divalent anion but also other divalent anions such as succinate, it is possible that SLC35G1 is responsible for the release of not only citrate but also other dicarboxylates. But the substrate specificity studies show that the dicarboxylates tested did not compete with citrate, meaning that SLC35G1 is selective for citrate (2-), but this conclusion might be flawed because of the low concentration of the competing substrates used in the experiment.

      (ii) The authors have used MDCK cells for assessment of the transcellular transfer of citrate via SLC35G1, but it is not clear whether this cell line expresses NaDC1 in the apical membrane as the enterocytes do. Even though the authors expressed SLC35G1 ectopically in MDCK cells and showed that the transporter localizes to the basolateral membrane, the question of how citrate actually crosses the apical membrane, so that SLC35G1 in the basolateral membrane has substrate to export, remains unanswered.

      (iii) There is one other transporter that has already been identified for the efflux of citrate in some cell types in the literature (SLC62A1, PLoS Genetics; 10.1371/journal.pgen.1008884), but no mention of this transporter has been made in the current manuscript.

    3. Reviewer #3 (Public Review):

      Summary:

      Mimura et al. describe the discovery of the orphan transporter SLC35G1 as a citrate transporter in the small intestine. Using a combination of cellular transport assays, they show that SLC35G1 can mediate citrate transport in small intestinal cell lines. Furthermore, they investigate its expression and localization in both human tissue and cell lines. Limited evidence exists to date on both SLC35G1 and citrate uptake in the small intestine; therefore, this study is an important contribution to both fields. However, the main claims by the authors are only partially supported by experimental evidence.

      Strengths:

      The authors convincingly show that SLC35G1 mediates uptake of citrate which is dependent on pH and chloride concentration. Putting their initial findings in a physiological context, they present human tissue expression data of SLC35G1. Their Transwell assay indicates that SLC35G1 is a citrate exporter at the basolateral membrane.

      Weaknesses:

      Further confirmation and clarification are required to claim that the SLC indeed exports citrate at the basolateral membrane as concluded by the authors. Most experiments measure citrate uptake, but the authors state that SLC35G1 is an exporter, mostly based on the lack of uptake under the physiological conditions found at the basolateral side. The Transwell assay in Figure 1L is the only evidence that it indeed is an exporter. However, in this experiment, the applied chloride concentration was not according to the proposed model (120 mM at the basolateral side). The Transwell assay, or a similar assay measuring export instead of import, should be carried out in knockdown cells to prove that the export indeed occurs through SLC35G1 and not through an indirect effect. Regarding the chloride sensitivity, it is unclear how the proposed model works if the transporter is inhibited by chloride yet faces high chloride concentrations under physiological conditions.

    1. eLife assessment

      In this useful study, the authors investigate the regulatory mechanisms related to toxin production and pathogenicity in Aspergillus flavus. Their observations indicate that the SntB protein regulates morphogenesis, aflatoxin biosynthesis, and the oxidative stress response. In general, the data supporting the conclusions are solid but could be strengthened further through additional analyses of ChIP-seq data.

    2. Reviewer #1 (Public Review):

      The study identifies the epigenetic reader SntB as a crucial transcriptional regulator of growth, development, and secondary metabolite synthesis in Aspergillus flavus, although the precise molecular mechanisms remain elusive. Using homologous recombination, researchers constructed sntB gene deletion (ΔsntB), complementary (Com-sntB), and HA tag-fused sntB (sntB-HA) strains. Results indicated that deletion of the sntB gene impaired mycelial growth, conidial production, sclerotia formation, aflatoxin synthesis, and host colonization compared to the wild type (WT). The defects in the ΔsntB strain were reversible in the Com-sntB strain.

      Further experiments involving ChIP-seq and RNA-seq analyses of sntB-HA and WT, as well as ΔsntB and WT strains, highlighted SntB's significant role in the oxidative stress response. Analysis of the catalase-encoding catC gene, which was upregulated in the ΔsntB strain, and a secretory lipase gene, which was downregulated, underpinned the functional disruptions observed. Under oxidative stress induced by menadione sodium bisulfite (MSB), the deletion of sntB reduced catC expression significantly. Additionally, deleting the catC gene curtailed mycelial growth, conidial production, and sclerotia formation, but elevated reactive oxygen species (ROS) levels and aflatoxin production. The ΔcatC strain also showed reduced susceptibility to MSB and decreased aflatoxin production compared to the WT.

      This study outlines a pathway by which SntB regulates fungal morphogenesis, mycotoxin synthesis, and virulence through a sequence of H3K36me3 modification to peroxisomes and lipid hydrolysis, impacting fungal virulence and mycotoxin biosynthesis.

      The authors have achieved the majority of the aims set out at the beginning of the study, identifying target genes, which led to catC-mediated regulation of development, growth, and aflatoxin metabolism. Overall, most parts of the study are solid and clear.

    3. Reviewer #2 (Public Review):

      Summary:

      This work is of great significance in revealing the regulatory mechanisms of pathogenic fungi in toxin production, pathogenicity, and in its prevention and pollution control. Overall, this is generally an excellent manuscript.

      Strengths:

      The data in this manuscript is robust and the experiments conducted are appropriate.

      Weaknesses:

      (1) The authors found that SntB played key roles in the oxidative stress response of A. flavus by ChIP-seq and RNA sequencing. To confirm the role of SntB in oxidative stress, the authors should measure the ROS levels in the ΔsntB and WT strains, in addition to the ΔcatC strain.

      (2) Why did the authors only study the function of catC among the 7 genes related to the oxidative response listed in Table S14?

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The manuscript by Wu et al. explores the role of the histone reader protein SntB in Aspergillus flavus, claiming it to be a key regulator of development and aflatoxin biosynthesis. While the study incorporates various techniques, including gene deletion, ChIP-seq, and RNA-seq, several concerns and omissions in the paper raise questions about the validity and completeness of the presented findings.

      (1) Omissions of Prior Work:

      The authors fail to acknowledge and integrate prior research by Pfannenstiel et al. (2018) on the sntB gene in A. flavus, which covered phenotypic changes, RNA-seq data, and histone modifications. This omission raises concerns about the transparency and completeness of the current study.

      The absence of reference to studies by Karahoda et al. (2022, 2023) revealing SntB's involvement in the KERS complex in A. flavus and A. nidulans is a major oversight. This raises questions about the specificity of SntB's regulatory functions, as it may be part of a larger complex. The authors should clarify why these studies were omitted and how they ensure that SntB alone, and not the entire KERS complex, is responsible for the observed effects.

      We greatly appreciate the reviewer’s professional question. As the reviewer mentioned, Pfannenstiel et al. (2018) reported that the functions of the sntB gene cover secondary metabolism, development, and global histone modifications in A. flavus, and we also cited this paper (please see reference 20). In their study, the functions of the sntB gene were analyzed using both ΔsntB and sntB-overexpression mutants. Deletion of sntB impaired several developmental processes, such as sclerotia formation and heterokaryon compatibility, as well as secondary metabolite synthesis and the ability to colonize host seeds, which is consistent with our results (Figures 1 and 2). Unlike that study, we constructed a complementation strain, which further clarified and confirmed the function of the sntB gene. Moreover, our main purpose is to uncover the downstream regulatory mechanism of SntB, which has been reported to be a transcription factor and not only an important epigenetic reader. Please see lines 452-457 and 486-500.

      Regarding the function of the KERS complex in A. nidulans (Karahoda et al., 2022), we had cited the paper; please see reference 29. The report on the function of the KERS complex in A. flavus (Karahoda et al., 2023) was published recently, and we apologize for omitting it. In our revised manuscript, we have cited this paper and compared it with our work. Please see lines 97-98 and reference 30. Based solely on our experiments, we cannot confirm whether SntB acts alone or in conjunction with other proteins; what we can confirm is that SntB plays a key role in the process. We will conduct related research in the future.

      (2) Transparency and Accessibility of Data:

      The lack of accessibility and visualization tools for ChIP-seq and RNA-seq data poses a challenge for independent verification and in-depth analysis. The authors should address this issue by providing more accessible data or explaining the limitations of data availability. A critical component missing from the paper is a detailed presentation of ChIP-seq data, specifically demonstrating SntB binding patterns on key promoters. This omission weakens the link between SntB and the mentioned regulatory genes. The authors should include these crucial data visualizations to strengthen their claims.

      To review GEO accession GSE247683, go to https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE247683 and enter the token “ipilouscnruprsl” into the box. The data will be released once our paper is published. The SntB binding patterns on key promoters have been added; please see Figures 4D, 4E, 5F, and 5G, and Table S9.

      (3) SntB Binding Sites and Consensus Sequence:

      The study mentions several genes upregulated in the sntB mutant without demonstrating SntB binding sites on their promoters. A detailed analysis of SntB binding maps is necessary to establish a direct link between SntB and these regulatory genes.

      Thanks for your suggestion. We have added the binding maps of SntB; please see Figures 5F and 5G and lines 362-364.

      (4) Mechanistic Insight into Peroxisome Biogenesis:

      If SntB indeed regulates peroxisome biogenesis, the absence of markers for peroxisomes and the localization of peroxisomes in the sntB mutant vs. WT strains is a significant gap. Providing evidence for peroxisome regulation is crucial for understanding the proposed mechanism and validating the study's claims.

      Thanks for your suggestion. Catalase is ubiquitously present in aerobic organisms and plays a crucial role in mitigating oxidative stress through the scavenging of reactive oxygen species (ROS). We therefore measured ROS levels in the sntB mutant and WT strains, as well as in the ∆catC strain (Figure 6H).

      In summary, while the manuscript presents intriguing findings regarding SntB's role in A. flavus, the omissions of prior work, lack of transparency in data accessibility, and insufficient mechanistic insights call for revisions and additional experimental evidence to strengthen the validity and impact of the study. Addressing these concerns will enhance the manuscript's contribution to the field.

      Thanks. We have revised our manuscript according to the valuable comments provided above.

      Additionally, the way the English language is used could be improved.

      Thanks. We have asked a native English-writing assistant to proofread the paper, correct grammatical errors and typos, and improve the readability and quality of the manuscript.

      Reviewer #2 (Public Review):

      Summary:

      This work is of great significance in revealing the regulatory mechanisms of pathogenic fungi in toxin production, pathogenicity, and in its prevention and pollution control. Overall, this is generally an excellent manuscript.

      Strengths:

      The data in this manuscript is robust and the experiments conducted are appropriate.

      Weaknesses:

      (1) The authors found that SntB played key roles in the oxidative stress response of A. flavus by ChIP-seq and RNA sequencing. To confirm the role of SntB in oxidative stress, the authors should also measure the ROS levels in the ΔsntB and WT strains, in addition to the ΔcatC strain.

      Thanks for your suggestion. We have added the relevant experiments, and the results are shown in Figure 6G and lines 185-192 and 395-398.

      (2) Why did the authors only study the function of catC among the 7 genes related to an oxidative response listed in Table S14?

      The functions of some genes in Table S15 (Table S14 in the old version of our manuscript) have been studied, such as cat1 [1]. In this study, we chose only catC for further validation because it was the most up-regulated gene in the ΔsntB strain. The others may also have important roles in SntB-triggered antioxidant pathways that regulate development and aflatoxin biosynthesis in A. flavus. We will focus on this in future work.

      (1) Zhu Z., Yang M., Bai Y., Ge F., Wang S. Antioxidant-related catalase CTA1 regulates development, aflatoxin biosynthesis, and virulence in pathogenic fungus Aspergillus flavus [J]. Environ Microbiol, 2020, 22(7): 2792-2810.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Line 52: Change "shad light" to "shed light"

      Thanks. We have revised it. Please see line 50.

      Line 62: Change "has" to "have" to match the plural noun "aflatoxins."

      Original: "Aflatoxins produced by A. flavus has strong toxicity..."

      Suggested: "Aflatoxins produced by A. flavus have strong toxicity..."

      Thanks. We have revised it. Please see line 62.

      Line 79: Consider rephrasing for clarity.

      Original: "...which may result in the modulation of the expression of genes involved in toxin production [15-17]."

      Thanks. We have revised it. Please see lines 77-80.

      Line 105: Add a comma after "host strain."

      Original: "A. flavus Δku70 ΔpyrG was used as a host strain for genetic manipulations."

      Suggested: "A. flavus Δku70 ΔpyrG was used as a host strain, for genetic manipulations."

      Thanks. We have revised it. Please see line 107.

      Line 113, Table 1: Remove the extra "r" in "from" in the Source column.

      Original: "Kindly presented form Prof. Chang[1]"

      Suggested: "Kindly presented from Prof. Chang[1]"

      Thanks. We have revised it. Please see Table 1.

      Line 140: Typo - Change "reaches" to "reach."

      Original: "when silkworm larva reaches about 1 g in weight."

      Suggested: "when silkworm larvae reach about 1 g in weight."

      Thanks. We have revised it. Please see line 141.

      Line 158: Typo - Change "pervious" to "previous."

      Original: "Data processing was according pervious study [39]."

      Suggested: "Data processing was according to a previous study [39]."

      Thanks. We have revised it. Please see line 150.

      Line 138 The animal invasion assay using silkworms was conducted according to a previous study.

      Change "according" to "conducted according to" for clarity.

      Thanks. We have revised it. Please see line 139.

      Line 148 Was carried out by APPLIED PROTEIN TECHNOLOGY, Shanghai (www. aptbiotech.com).

      Change "TECHNOLOY" to "TECHNOLOGY."

      Thanks. We have revised it. Please see line 149.

      Line 148 Data processing was conducted according to a previous study [39].

      Change "according to" to "conducted according to" for clarity.

      Thanks. We have revised it. Please see line 139.

      Line 429 Schizzosaccharomyces pombe, Correct the spelling to "Schizosaccharomyces pombe [55]."

      Thanks. We have revised it. Please see line 448.

      Reviewer #2 (Recommendations For The Authors):

      (1) The resolution of the words written in Figures 3 and 4 is not clear (or high) enough.

      Thanks. We have revised them. Please see Figures 3 and 4.

      (2) Which kind of protein marker (protein ladder) was used in Figure 4A? You should indicate the size of the relevant protein.

      Thanks. We have revised it. Please see Figure 4A and lines 332-333.

      (3) Latin names do not necessarily need to be written in full when they are not the first time used in the text.

      Thanks. We have revised them throughout the manuscript.

      (4) The complementation strain of sntB was labeled sntB-C in Figure 2B but Com-sntB in the other figures. You should correct all such inconsistencies.

      Thanks. We have revised it. Please see Figure 2B.

      (5) What is the meaning of "1" in Table 1?

      Thanks. The "1" in Table 1 was a citation. We have revised it. Please see Table 1.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The manuscript constitutes an important contribution to antimalarial drug discovery, employing diverse systems biology methodologies; with a focus on an improved M1 metalloprotease inhibitor, the study provides convincing evidence of the utility of chemoproteomics in elucidating the preferential targeting of PfA-M1. Additionally, metabolomic analysis effectively documents specific alterations in the final steps of hemoglobin breakdown. These findings underscore the potential of the developed methodology, not only in understanding PfA-M1 targeting but also in its broader applicability to diverse malarial proteins or pathways. Revisions are needed to further enhance overall clarity and detail the scope of these implications.

      We thank the editor and reviewers for recognising the contribution our work makes to understanding the selective targeting of aminopeptidase inhibitors in malaria parasites and the wider impact this multi-omic strategy can have for anti-parasitic drug discovery efforts. The reviewers have provided constructive feedback and raised important points that we have taken on board to improve our manuscript. In particular, we have revised aspects of the text and figures to enhance clarity, performed additional analysis on the other possible MIPS2673 interacting proteins and more comprehensively analysed the effect of MIPS2673 on parasite morphology. NB: Specific responses to comments in the public reviews are provided within responses to the specific recommendations to authors.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The article "Chemoproteomics validates selective targeting of Plasmodium M1 alanyl aminopeptidase as a cross-species strategy to treat malaria" presents a series of biochemical methods based on proteomics and metabolomics, as a means to:

      (1) validate the specific targeting of biologically active molecules (MIPS2673) towards a defined (unique) protein target within a parasite and (2) to explore whether by quantifying the perturbations generated at the level of the parasite metabolome, it is possible to extrapolate which metabolic pathway has been disrupted by using this biologically active molecule and whether this may further confirm selective targeting in parasites of the expected (or in-vitro targeted) enzyme (here PfA-M1).

      The inhibitor used in this work by the authors (MIPS2673) is to my knowledge a novel one, although belonging to a chemical series previously explored by the authors, which recently enabled them to discover a specific PfA-M17 inhibitor, MIPS2571 (Edgard et al., 2022, ref 11 of this current work). Indeed, inhibitors specifically targeting either PfA-M1 or PfA-M17 (and not both, as was done in the past) are scarce today, and highly needed to functionally characterize these two zinc-aminopeptidases. MIPS2673 blocks the development of erythrocytic stages of Plasmodium falciparum with an EC50 of 324 nM, blocks parasite development at the young trophozoite stage at 5x EC50 (but at ring stages at 10x EC50, Figure 1E), and inhibits the enzymatic activity of PfA-M1 (and its ortholog Pv-M1) but not of the related malarial metallo-aminopeptidases (M17 and M18 families) nor the human metalloenzymes from closely related enzymatic families, supporting its selective targeting of PfA-M1 (and Pv-M1).

      All experiments are carried out in vitro (e.g. biochemical studies such as enzymology, proteomics, metabolomics) and on cultured parasites (erythrocyte stages of Plasmodium falciparum and several gametocyte stages obtained in vitro); there are no in vivo manipulations. The work related to Plasmodium vivax, which justifies the "cross-species" indication in the title of the article, is restricted to using a recombinant form of the M1-family aminopeptidase in enzymatic assays. The rest of the work concerns only Plasmodium falciparum. Overall, I found that this work is original, brings new data and, above all, proposes chemical validation approaches that could be used for other target validations under similar limiting conditions (impossibility of KO of the gene); nevertheless, I have some specific questions to address to the authors.

      Strengths and weaknesses:

      - The chemoproteomic approach, which explores the ability of MIPS2673 to more significantly "protect" the putative target (PfA-M1) against thermal degradation or enzymatic attack (by proteinase K) in order to document its selective targeting of PfA-M1 (the inhibitor, once associated with its target, is expected to stabilize its structure or prevent the action of endoproteases), uses several concentrations of MIPS2673 and provides convincing results. My main criticism is that these tests are carried out with parasite extracts enriched in 30-38-hour-old forms, and restricted to the fraction of soluble proteins isolated from these parasitic forms, which still limits the scope of the analysis. It is clear that this methodological approach is a choice that can be argued both biologically (PfA-M1 is well expressed in these stages of parasite development) and biochemically (it is difficult to do proteomic analyses on insoluble proteins), but I regret that the authors do not discuss these limitations further. Notably, I would have expected (from Figure 1E) some targets to be also present at ring stages.

      - The metabolomic approach, by documenting the ability of MIPS2673 to selectively increase the number of non-hydrolyzed dipeptides in treated versus untreated parasites is another argument in favor of the selective targeting of PfA-M1 by MIPS2673, in particular by its broad-spectrum aminopeptidase action preferentially targeting peptides resulting from the degradation of hemoglobin by the parasite. The relative contribution of peptides derived from host hemoglobin versus other parasite proteins is, however, little discussed.

      The work as a whole remains highly interesting, both for the specific topic of PfA-M1's role in parasite biology and for the method, applicable to other malarial drug contexts.

      Reviewer #2 (Public Review):

      In this manuscript, the authors first developed a new small-molecule inhibitor that could specifically target the M1 metalloproteases of both important malaria parasite species, Plasmodium falciparum and P. vivax. This was done by a chemical modification of a previously developed molecule that targets PfM1 as well as PfM17 and possibly other Plasmodial metalloproteases. After the successful chemical synthesis, the authors showed that the derived inhibitor, named MIPS2673, has a strong antiparasitic activity with IC50 342 nM and is highly specific for M1. With this in mind, the authors carried out two large-scale proteomics analyses to confirm the MIPS2673 interaction with PfM1 in the context of the total P. falciparum protein lysate. This was done first by using thermal shift profiling and subsequently limited proteolysis. While the first demonstrated overall interaction, the latter (limited proteolysis) could map more specifically the site of MIPS2673-PfM1 interaction, presumably the active site. Subsequent metabolomics analysis showed that the cytotoxic inhibitory effect of MIPS2673 leads to the accumulation of short peptides, many of which originate from hemoglobin. Based on that, the authors argue that the MIPS2673 mode of action (MOA) involves inhibition of hemoglobin digestion that in turn inhibits parasite growth and development.

      Reviewer #3 (Public Review):

      This is a manuscript that attempts to validate Plasmodium M1 alanyl aminopeptidase as a target for antimalarial drug development. The authors provide evidence that MIPS2673 inhibits recombinant enzymes from both Pf and Pv and is selective over other proteases. There is in vitro antimalarial activity. Chemoproteomic experiments demonstrate selective targeting of the PfA-M1 protease.

      This is a continuation of previous work focused on designing inhibitors for aminopeptidases by a subset of these authors. Medicinal chemistry explorations resulted in the synthesis of MIPS2673, which has improved properties including potent inhibition of PfA-M1 and PvA-M1 with selectivity over a closely related peptidase. The compound also demonstrated selectivity over several human aminopeptidases and was not toxic to HEK293 cells at 40 µM. The activity against P. falciparum blood-stage parasites was about 300 nM.

      Thermal stability studies confirmed that PfA-M1 was a binding target, however, there were other proteins consistently identified in the thermal stability studies. This raises the question as to their potential role as additional targets of this inhibitor. The authors dismiss these because they are not metalloproteases, but further analysis is warranted. This is particularly important as the authors were not able to generate mutants using in vitro evolution of resistance strategies. This often indicates that the inhibitor has more than one target.

      The next set of experiments focused on a limited proteolysis approach. Again several proteins were identified as interacting with MIPS2673 including metalloproteases. The authors go on to analyze the LiP-MS data to identify the peptide from PfA-M1 which putatively interacts with MIPS2673. The authors are clearly focused on PfA-M1 as the target, but a further analysis of the other proteins identified by this method would be warranted and would provide evidence to either support or refute the authors' conclusions.

      The final set of experiments was an untargeted metabolomics analysis. They identified 97 peptides as significantly dysregulated after MIPS2673 treatment of infected cells and most of these peptides were derived from one of the hemoglobin chains. The accumulation of peptides was consistent with a block in hemoglobin digestion. This experiment does reveal a potential functional confirmation, but questions remain as to specificity.

      Overall, this is an interesting series of experiments that have identified a putative inhibitor of PfA-M1 and PvA-M1. The work would be significantly strengthened by structure-aided analysis. It is unclear why putative binding sites cannot be analyzed via specific mutagenesis of the recombinant enzyme.

      In the thermal stability and LiP-MS analysis, other proteins were consistently identified in addition to PfA-M1 and yet no additional analysis was undertaken to explore these as potential targets.

      The metabolomics experiments were potentially interesting, but without significant additional work including different lengths of treatment and different stages of the parasite, the conclusions drawn are overstated. Many treatments disrupt hemoglobin digestion - either directly or indirectly and from the data presented here it is premature to conclude that treatment with MIPS2673 directly inhibits hemoglobin digestion.

      Finally, the potency of this compound on parasites grown in vitro is 300 nM - this would need improvements in potency and demonstration of in vivo efficacy in the SCID mouse model to consider this a candidate for a drug.

      Summary:

      Overall, this is an interesting series of experiments that have identified a putative inhibitor of the Plasmodium M1 alanyl aminopeptidases, PfA-M1 and PvA-M1.

      Strengths:

      The main strengths include the synthesis of MIPS2673 which is selectively active against the enzymes and in whole-cell assay.

      Weaknesses:

      The weaknesses include the lack of additional analysis of additional targets identified in the chemoproteomic approaches.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Question 1. Line 737 (and elsewhere). Why are Plasmodium vivax orthologs of PfA-M1 and PfA-M17 called Pv-M1 and Pv-M17 and not PvA-M1 and PvA-M17, where A stands for Aminopeptidase? I would recommend changing the names if possible, although the mention of Pv-M1 and Pv-M17 is now current in the literature (which is kind of regrettable). See also Supplemental Table S1 where PfA-M1 is named Pf-M1.

      Supplemental Table S1 was updated to PfA-M1. Nomenclature for the Plasmodium vivax aminopeptidase orthologs was amended to PvA-M1 and PvA-M17 as suggested by the reviewer.

      Question 2. Figure 1. Observation of parasite culture slide smears in Figure 1E strongly suggests that an important target of MIPS2673 appears to be expressed at the ring stage or very young trophozoites, whereas the authors, in their proteomic and metabolomic analyses, performed studies focused on late trophozoites stages (30-38h post-invasion). This difference in the targeting of Plasmodium stages puzzles me and deserves some explanations from the authors, and is related to my question 3.

      As the reviewer indicates, ring-stage parasite growth appears to be affected at high concentrations (5x and 10x EC50) of MIPS2673. Under these conditions, parasite growth appears to stall during late rings/early trophs at ~16-22 h post invasion when haemoglobin digestion is increasing and when one presumes PfA-M1 (the primary target of MIPS2673) is increasing in both expression and activity (see references 26 and 28 of this manuscript). Thus, whilst it is unsurprising that MIPS2673 has some activity against ring-stage parasites, we focused on the trophozoite stage for our proteomics studies as we showed this to be the stage most susceptible to MIPS2673 (Fig. 1D) and reasoned that we would most likely identify the primary MIPS2673 target, and other interacting proteins, from a complex biological mixture at this stage. The same reasoning underpinned our decision to perform metabolomics on drug-treated trophozoites, as we reasoned we would see a greater functional effect on this stage. Furthermore, performing these experiments on trophozoites rather than rings minimises the interference from the host red blood cell. While we cannot rule out additional targets in rings, repeating all experiments during this parasite stage is beyond the scope of this study.

      Question 3. Figure 2. Although Figure 2 is insightful and somehow self-explanatory, I think it misses two specific pieces of information. First, it is indicated in line 618 (M&M) that parasite material for thermal stability and limited proteolysis studies corresponds to synchronized parasites (30-38h post-invasion) but this information is not given in Figure 2. In addition, if I fully understand the experimental protocol of obtaining parasite extracts, they strictly correspond to the soluble protein fraction of the erythrocytic stages of Plasmodium at the late trophozoite stage, and not to all parasitic proteins as the scheme of Figure 2 might suggest. I would appreciate it very much if these two points (parasite stages and soluble proteins) were clearly indicated in the scheme as indeed, not the whole parasite blood stage proteome is investigated in the study but just a part of it (~47%, as the authors indeed indicate in line 406). Please also edit the legend of the figure accordingly.

      This is correct, the soluble protein fraction from synchronised trophozoites was used in our proteomics studies. These details have been included in an updated Figure 2 and in the corresponding figure legend.

      Question 4. Thermal stabilization. Figure 3B. Could the authors explain how they calculated or measured "absolute" protein abundances, and how this refers to a number of parasites in initial assays as this is not clear to me. Notably, abundance for PfA-M1 is much higher than for PF3D7_0604300, which are interesting "absolute" values.

      Protein abundance was calculated using the mean peptide quantity of the stripped peptide sequence, with only precursors passing the Q-value threshold (0.01) considered for relative quantification. Within independent experiments, normalisation was based on total protein amount (determined by the BCA assay) rather than the initial number of parasites.

      PfA-M1 is known to be a highly abundant protein and PF3D7_0604300 (as well as the other protein hits identified by thermal stability proteomics) are likely less abundant. It is noted that abundance is also dependent on ionisation efficiency and trypsin digestion efficiency. Therefore, we avoid comparing absolute abundances across proteins and use relative differences across conditions instead.

      NB: the word “absolute” in the text (“absolute fold-change”) refers to the absolute value of the fold-change (i.e. positive or negative), and not to absolute quantification of proteins. The preceding text in each case clarifies that these are based on “relative peptide abundance”.
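
      The quantification described in this response can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the record keys (`protein`, `stripped_sequence`, `quantity`, `q_value`) and both function names are hypothetical, the 0.01 threshold is applied as `<=`, precursor quantities are averaged per stripped sequence and then per protein (one possible reading of "mean peptide quantity"), and the absolute fold-change is taken as the magnitude of the log2 ratio.

```python
import math
from collections import defaultdict
from statistics import mean

Q_VALUE_THRESHOLD = 0.01  # threshold stated in the response


def protein_abundance(precursors, q_threshold=Q_VALUE_THRESHOLD):
    """Return {protein: abundance} from precursor-level records.

    Each record is a dict with the hypothetical keys 'protein',
    'stripped_sequence', 'quantity' and 'q_value'.
    """
    # Keep only precursors passing the Q-value filter, grouped by
    # (protein, stripped peptide sequence).
    per_peptide = defaultdict(list)
    for p in precursors:
        if p["q_value"] <= q_threshold:  # assumption: inclusive threshold
            key = (p["protein"], p["stripped_sequence"])
            per_peptide[key].append(p["quantity"])

    # Average precursors within each stripped sequence, then average
    # the peptide-level values within each protein (assumed order).
    per_protein = defaultdict(list)
    for (protein, _sequence), quantities in per_peptide.items():
        per_protein[protein].append(mean(quantities))
    return {protein: mean(values) for protein, values in per_protein.items()}


def abs_log2_fold_change(treated, control):
    """Magnitude of the log2 fold-change, ignoring its direction."""
    return abs(math.log2(treated / control))
```

      Because ionisation and digestion efficiency differ between proteins, values from `protein_abundance` are only meaningful when the same protein is compared across conditions, for example with `abs_log2_fold_change(treated, control)`.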

      Question 5. Figure 5A. How do the authors explain peptides whose abundances are decreasing instead of increasing? Figure 5C. Could the authors provide digital cues (aa numbers or positions) on the ribbon representation of the PfA-M1 sequence? It is difficult to correlate the position of the 3D domains with respect to the primary structure of the protein. Also, the "yellow" supposed to show the "drug ligand" is really not very visible.

      LiP-MS is based on the principle that ligand binding alters the local proteolytic susceptibility of a protein to a non-specific protease (in this case proteinase K, PK). In this sense, in LiP-MS we are not looking at variations in the stability of whole proteins (as is the case with thermal stability proteomics, where proteins detected with significantly higher abundance in treated relative to control samples reflects thermal stabilisation of the target due to ligand binding), but differences in peptide patterns between treated and control samples that reflect a change in the ability of PK to cleave the target. Thus, in the bound state, the ligand prevents proteolysis with PK. This results in decreased abundance of peptides with non-tryptic ends (as PK cannot access the region around where the ligand is bound) and increased abundance of the corresponding fully tryptic peptide, when compared to the free target. This concept is demonstrated in Fig. 4A and is explained in the text (lines 279-282) and Fig. 4 figure legend.
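
      The distinction between fully tryptic peptides and peptides with non-tryptic ends can be made concrete with a short sketch. It assumes trypsin cleaves after every Lys (K) or Arg (R), ignores the usual exception before proline, treats protein termini as valid ends, and uses only the first occurrence of the peptide in the protein; the function names and the toy sequence are hypothetical.

```python
TRYPSIN_SITES = {"K", "R"}  # simplified rule: cleavage after K or R


def count_tryptic_ends(protein: str, peptide: str) -> int:
    """Count how many of the peptide's two ends are consistent with trypsin."""
    start = protein.find(peptide)  # first occurrence only (assumption)
    if start == -1:
        raise ValueError("peptide not found in protein sequence")
    end = start + len(peptide)
    # N-terminal end: protein start, or the preceding residue is K/R.
    n_ok = start == 0 or protein[start - 1] in TRYPSIN_SITES
    # C-terminal end: protein end, or the peptide itself ends in K/R.
    c_ok = end == len(protein) or peptide[-1] in TRYPSIN_SITES
    return int(n_ok) + int(c_ok)


def classify_peptide(protein: str, peptide: str) -> str:
    """Label a peptide as fully tryptic, semi-tryptic or non-tryptic."""
    labels = {2: "fully tryptic", 1: "semi-tryptic", 0: "non-tryptic"}
    return labels[count_tryptic_ends(protein, peptide)]
```

      Under this scheme, ligand protection from proteinase K would be expected to lower the abundance of semi-tryptic peptides around the binding site and raise the abundance of the corresponding fully tryptic peptide.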

      To aid visualisation, we have not added amino acid positions on the PfA-M1 sequence in Fig. 5, but have provided amino acid positions for all peptides in Supplementary File 3. We have also changed the colour of the ligand in Fig. 5C to blue and increased transparency of the binding and centre of mass neighbourhoods.

      Question 6. Gametocyte assays. Line 824 states that several compounds were used as positive controls for anti-gametocyte activity (chloroquine, artesunate, pyronaridine, pyrimethamine, dihydroartemisinin, and methylene blue) and line 821 states that the biological effects are measured against puromycin. This is not very clear to me, could the authors comment on this?

      This wording has been clarified in the methods to reflect that 5 µM puromycin was used as the positive control to calculate percent viability, whereas the other antimalarials were run in parallel as reference compounds with known anti-gametocyte activity (line 862).

      Question 7. Metabolomics. Metabolomic assays were done on parasites at 28h pi, incubated for 1h with 3x EC50 of MIPS2673. You mention applying the drug on 2x10E8 infected red blood cells (line 838) but you do not explain how you isolate these infected red blood cells from non-infected red blood cells. Could you please specify this?

      Metabolomics studies were performed such that cultures at 2% haematocrit and 6% trophozoite-stage parasitaemia (representing 2 × 10⁸ cells in total, rather than 2 × 10⁸ infected cells) were treated with compound or vehicle and after 1 h metabolites were extracted. This methodological detail has been clarified in the methods (line 875).
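
      The difference between total and infected cells can be worked through with simple arithmetic. This sketch assumes parasitaemia is the percentage of red blood cells that are infected; the variable names are illustrative only.

```python
# Values taken from the response: 200 million cells in total per sample
# at 6% trophozoite-stage parasitaemia.
total_cells = 200_000_000
parasitaemia_percent = 6

# Integer arithmetic avoids floating-point rounding.
infected_cells = total_cells * parasitaemia_percent // 100
uninfected_cells = total_cells - infected_cells

print(infected_cells)    # 12000000
print(uninfected_cells)  # 188000000
```

      On this reading, each sample contains roughly 1.2 × 10⁷ infected cells alongside 1.88 × 10⁸ uninfected cells.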

      Question 8. Figure 3B. Does this diagram come from the experimental 3D structure created by the authors (8SLO) or from molecular modeling? Please specify in the legend (line 1305).

      The diagram showing the binding mode of MIPS2673 bound to PfA-M1 comes from the experimentally determined 3D structure (PDB ID: 8SLO). This has now been stated in the figure legend. Note that the structural diagram refers to Fig. 1B (not Fig. 3B as indicated by the reviewer). The experimentally determined PfA-M1 structure with MIPS2673 bound (PDB ID: 8SLO) was also used to map LiP peptides and estimate the MIPS2673 binding site in Fig. 5, which is also now reflected in the appropriate section of the text (line 308) and Fig. 5 legend.

      Question 9. Line 745. Why not indicate the µM concentration for this H-Leu-NHMec substrate while it is indicated for the other substrates mentioned in the rest of the paragraph (H-Ala-NHMec, 20 μM, etc.)? Also in this section (Enzyme assays) the pH at which the various enzymatic assays were done is missing.

      All enzyme assays were performed at pH 8.0. The concentration of H-Leu-NHMec varied depending on the enzyme assayed, as follows: 20 µM for PfA-M1, 40 µM for PvA-M1 and 100 µM for ERAP1 and ERAP2. This information is now clearly stated in the methods section (lines 782 and 787) and as a footnote for Supplemental Table S1.

      Question 10. Line 830, please define FBS.

      Fetal bovine serum (FBS) has been added where appropriate (line 867).

      Question 11. The authors mention in the title the targeting of several Plasmodium species, but the only experimental study on the Plasmodium vivax species concerns the use of the recombinant enzyme Pv-M1. The authors also mention "multi-stage targets", but ultimately only look at erythrocyte stages and three different gametocyte stages.

      We have now removed the words “cross-species” and “multi-stage” from the manuscript title and abstract so as not to overstate these findings. We have also added the word “potential” in the manuscript text to clarify that selective M1 inhibition could offer a potential multistage and cross species strategy for malaria.

      Question 12. Supplemental Table S1. I would suggest replacing "Percent inhibition by MIPS2673 of PfA-M1 and Pv-M1 aminopeptidases compared to selected human M1 homologues" with "Percent inhibition by MIPS2673 of PfA-M1 and Pv-M1 aminopeptidase activities compared to selected human M1 homologues".

      Done.

      Question 13. Supplemental Table S3. Here you indicate IC50 while in text and Figure 1 you quote EC50. Why this difference?

      This has now been changed to EC50 in Supplemental Table S3.

      Reviewer #2 (Recommendations For The Authors):

      Amendments that I would recommend in order to improve the presentation include all four parts of the study:

      (1) In vitro antiparasitic activity of MIPS2673.

      The authors showed that MIPS2673 inhibits parasite growth with an IC50 of 324 nM measured by a standard drug sensitivity assay, Fig 1C. This is all well and good, but it would be helpful to include at least one if not more other compounds such as antimalarial drugs and/or their earlier inhibitors (e.g. inhibitor 1) for comparisons. This is typically done to show that the assay in this manuscript is fully compatible with previous studies. It will also give a better view of how the selective inhibition of PfM1 kills the parasite, specifically.

      Alongside MIPS2673, we also analysed the potency of the known antimalarial artesunate, which was found to have an EC50 of 4 nM. This value agrees with the expected potency of artesunate and indicates our MIPS2673 value of 324 nM is indeed compatible with previous studies. We have now reported the artesunate EC50 value for reference (lines 197-198 and Fig. S1).

      Next, the authors proceeded to investigate the stage-specific effect of MIPS2673 but this time doing a survival assay instead of proper IC50 estimations (Figure 1). I wonder why? Drug survival assays have typically very limited information content and measuring proper IC50 in stage-specific wash-off assays would be much more informative.

      We performed single concentration stage specificity assays to determine the parasite asexual stage at which MIPS2673 is most active. This involved washing off the compound after a 24 h exposure in rings or trophozoites and determining parasite viability in the next asexual lifecycle. While a full dose response curve would allow generation of an EC50 value against the respective parasite stages, this information is unlikely to change the interpretation that MIPS2673 is more active against trophozoite stages than against rings.

      Finally, in Figure 1E, the authors present the fact that MIPS2673 arrests parasite development. This is done by presenting a single (presumably representative) cell per time point. This is in my view highly insufficient. I recommend this figure be supplemented by parasite stage counts or other more comprehensive data representation. Also, the authors mention that while there is a growth arrest, hemoglobin is still being made. From the cell images, I cannot see anything that supports this statement.

      We thank the reviewer for this constructive comment and they are correct in their assessment that these are representative parasite images at the respective time points. To address the reviewer's concerns we have now provided cell counts from each treatment condition (Fig. 1E) at selected time points, which show parasite stalling at the ring to trophozoite transition under drug treatment. On reflection, we agree that it is difficult to determine the presence of haemozoin from our images and have removed this statement.

      (2) Protein thermal shift profiling. In the next step, the authors proceed to carry out cellular thermal shift profiling to show that PfM1 indeed interacts with MIPS2673, this time in the context of the total protein lysates from P. falciparum. This section of the study is in my view quite solid and indeed it is nice to see that the inhibitor causes a thermal shift of PfM1 which further supports what was already expected: interaction.

      I have no problem with this study in terms of the technical outcome but I would urge the authors to tone down the interpretation of these results in two ways.

      Four other proteins were found to be shifted by the inhibitor which also indicates interactions. Calling it simply "off-target" interactions might not represent the truth. The authors should explore and in some way comment that interactions with these proteins could contribute to the MIPS2673 MOA. I do not suggest conducting any more studies but simply acknowledge this situation. Identifying more than one target is indeed very common in CETSA studies and it would be helpful to acknowledge this here as well.

      We agree that identifying binding proteins in addition to the “expected” target is commonplace, and is indeed one of the benefits of this unbiased and proteome-wide approach. In the results and discussion, we have now amended our language to refer to these additional hits as MIPS2673-interacting proteins. In our original manuscript we dedicate a paragraph in the discussion to these additional interacting proteins and the likelihood of them being targets that contribute to antimalarial activity. Of these four additional interacting proteins, only the putative AP2 domain transcription factor (PF3D7_1239200) is predicted to be essential for blood stage growth and is therefore the only protein from this additional four that would likely contribute to antimalarial activity. These points are explicitly stated in the discussion (lines 530-550). Notably, all of the other interacting proteins identified in our thermal stability dataset were detected in our LiP-MS experiment but were not identified as interacting proteins by this method. The remaining three proteins were two non-essential P. falciparum proteins with unknown functions (PF3D7_1026000 and PF3D7_0604300) that are poorly described in the literature and a human protein (RAB39A). Further analysis of these other thermal stability proteomics hits in our LiP-MS dataset (see responses to Reviewer #3) identified none or only 1 significant LiP peptide from these proteins across our LiP-MS datasets, indicating they are likely to be false positive hits. Caveats around identifying protein targets by different deconvolution methods are also now addressed (lines 545-550).

      At some point, the author argues that causing shifts of only four/five proteins including PfM1 shows that MIPS2673 does not interact with other (off) targets. Here one must be careful to present the lack of shifts in the CETSA as proof of no interaction. There are many reasons why thermal shifts are not observed including the physical properties of the individual proteins, detection limit etc. Again I suggest adjusting these statements accordingly.

      We thank the reviewer for raising this important point and have now included additional discussion around this comment (lines 545-550).

      Finally, I am not convinced that Figure 2 presents anything more than the overall experimental scheme, with not much new information. Many such schemes were published previously in the original publication of thermal profiling. I would suggest omitting it from the main text and shifting it into supplementary methods etc.

      We agree that similar schemes have been published previously, especially for thermal proteome profiling, and acknowledge the reviewer’s suggestion of moving this figure to the supplemental material. However, we have kept Fig. 2 in the main text as this scheme also incorporates a LiP-MS workflow for malaria drug target deconvolution (the first to do so) and also to satisfy the additional details requested for this figure by Reviewer #1 (question 3).

      (3) Identification of MIPS2673 target proteins using LiP-MS. In the next step, the authors carried out the limited proteolysis analysis with the rationale that protein peptides that are near the inhibitor binding site will exhibit higher resilience to proteolysis. The authors did a very good job of showing this for the PfM1-MIPS2673 interaction. This part is very impressive from a technological perspective, and I congratulate the authors on such an achievement. I imagine these types of studies require very precise optimizations and performance.

      Here, however, I struggle with the meaning of this experiment for the overall flow of the manuscript. It seems that the binding pocket of MIPS2673 is less known since the inhibitor was designed for it. In fact, the authors mentioned that the crystal structure of PfM1 is available. From this perspective, the LiP-MS study represents more of a technical proof of concept for future drug target analysis but has limited contribution to the already quite well-established PfM1-MIPS2673 interaction. Perhaps this could be presented in this way in the text.

      We thank the reviewer for this comment and they are correct that we solved the crystal structure of PfA-M1 bound to MIPS2673. We wish to highlight that the primary reason for performing the LiP-MS study was as an independent and complementary target deconvolution method to narrow down the shortlist of targets identified with thermal stability proteomics, and validate with high confidence that PfA-M1 is indeed the primary target of MIPS2673 in parasites. The use of a complementary approach based on a different biophysical principle (proteolytic susceptibility vs thermal stability) would also allow us to identify MIPS2673 interacting proteins that may not be detectable by thermal stability proteomics, for example targets that do not alter their thermal stability upon ligand binding. The text in the results and discussion has been amended to clarify these points (lines 266-268 and 545-550).

      Furthermore, we agree that correctly predicting the MIPS2673 binding site on PfA-M1 using our LiP-MS peptide data is a technical proof of concept. Indeed, we wished to highlight the potential utility of LiP-MS for identifying both the protein targets of drugs and predicting their binding site, which is not possible with many other target deconvolution approaches. This point has been updated in the text (lines 303-304, 459-461).

      (4) Metabolomic profiling of MIPS2673 inhibition showed a massive accumulation of short peptides which clearly indicates that this inhibitor blocks some proteolytic activity of short peptides, presumably products of upstream proteolytic activities. Here the authors argue, that because many of these detected short (di-/tri-) peptides could be mapped on the hemoglobin protein sequence, this must be their origin. Although this might be the case the author could not exclude the fact that at least some of these come from other sources (e.g. Plasmodium proteins). It would be quite helpful to comment on such a possibility as well. In particular, it was mentioned that the main subcellular localization of PfM1 is in the cytoplasm while most if not all hemoglobin digestion occurs in the digestive vacuole...?

      Indeed, we agree that PfA-M1 is likely processing both Hb and non-Hb peptides and do not definitively conclude that all dysregulated peptides must be derived from haemoglobin. A subset of dysregulated peptides cannot be mapped to haemoglobin and must have an alternative source such as other host proteins or turnover of parasite proteins. We have amended the discussion to better reflect these possible alternate peptide sources (lines 480-482). Although the peptides detected in the metabolomics study (2-5 amino acids) are too short to be definitively assigned to any specific parasite or RBC protein, it is important to note that our analysis strongly indicates that the majority, but not all, of dysregulated peptides are more likely to originate from haemoglobin than other human or parasite proteins. This is based on sequence mapping, which was aided by acquiring MS/MS data for a subset of dysregulated peptides from which we derive accurate sequences (as opposed to residue composition inferred from total peptide mass) to more directly link dysregulated peptides to haemoglobin. We further quantified the sequence similarity of dysregulated peptides to all detectable proteins in the P. falciparum infected erythrocyte proteome (~4700 proteins), showing that these peptides are statistically more similar to haemoglobin than other host or parasite proteins.
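
      Why peptides of 2-5 residues cannot be assigned to a single source protein can be illustrated with a simple exact-match search. This is a sketch only and not the similarity statistic used in the study: the function name is hypothetical and the sequences are invented toy strings, not real haemoglobin or parasite sequences.

```python
def matching_proteins(peptide: str, proteome: dict) -> list:
    """Return the sorted names of all proteins containing the peptide."""
    return sorted(name for name, seq in proteome.items() if peptide in seq)


# Invented toy sequences for illustration only.
toy_proteome = {
    "protein_A": "MKVLSAADKT",
    "protein_B": "GGVLSPEEKA",
    "protein_C": "TTQQLLNNRR",
}

print(matching_proteins("VL", toy_proteome))      # ['protein_A', 'protein_B']
print(matching_proteins("VLSAAD", toy_proteome))  # ['protein_A']
```

      A dipeptide matches more than one toy protein, whereas a longer sequence narrows the candidates to one, which is why MS/MS-derived sequences help link peptides to a likely source.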

      The apparent disconnect between PfA-M1 localisation (cytosol) and the predominant site of haemoglobin digestion (digestive vacuole, DV) is explained by the fact that peptides originating from digestion of haemoglobin in the DV are required to be transported into the cytoplasm for further cleavage by peptidases, including PfA-M1. This point has now been clarified in the discussion (lines 473-474).

      Reviewer #3 (Recommendations For The Authors):

      (1) Thermal stability studies confirmed that PfA-M1 was a binding target, however, there were other proteins consistently identified in the thermal stability studies. This raises the question as to their potential role as additional targets of this inhibitor. The authors dismiss these because they are not metalloproteases, but further analysis is warranted. This is particularly important as the authors were not able to generate mutants using in vitro evolution of resistance strategies. This often indicates that the inhibitor has more than one target.

      We thank the reviewer for this comment. The possibility of other targets contributing to MIPS2673 activity was also raised by Reviewer #2 (question 2) and is addressed above. Further to our response to Reviewer #2, we agree that the inability to generate resistant parasites in vitro could indicate that inhibition of multiple essential parasite proteins (including PfA-M1) contributes to MIPS2673 activity, and we do not rule out this possibility. It may also indicate the target has a very high barrier for resistance and is unable to tolerate resistance-causing mutations as they are deleterious to function. Indeed, previous attempts to mutate PfA-M1 (references 12 and 50), and our own attempts to generate MIPS2673 resistant parasites in vitro (unpublished), were unsuccessful. It is important to note that of the hits reproducibly identified using thermal stability proteomics, only PfA-M1 and a putative AP2 domain transcription factor (PF3D7_1239200) are predicted to be essential for blood stage growth. We have explicitly stated that PF3D7_1239200 could also contribute to activity (lines 533 and 537).

      As we identified multiple hits with thermal stability proteomics we employed the complementary LiP-MS method to further investigate the target landscape of MIPS2673. PfA-M1 was the only protein reproducibly identified as the target through this approach. Importantly, the five proteins identified as hits by thermal stability proteomics were also detected in our LiP-MS datasets, but only PfA-M1 was identified as a target by both target deconvolution methods, strongly indicating it is the primary target of MIPS2673 in parasites. An important caveat is that we profiled the soluble proteome (we did not include detergents necessary for extracting membrane proteins as they may interfere with these stability assays) and other factors (e.g. the biophysical properties of the protein) will impact on whether ligand induced stabilisation events are detected. We have added additional text in the discussion around the above points (lines 545-550).

      While we do not definitively rule out other MIPS2673 interacting proteins existing in parasites (that possibly also contribute to activity), our metabolomics studies indicated no functional impact by MIPS2673 outside of elevated levels of short peptides. This is indicative of aminopeptidase inhibition and the profile of peptide accumulation was distinct from a known PfA-M17 inhibitor, and other antimalarials, further pointing to selective inhibition of the PfA-M1 enzyme by MIPS2673 being responsible for antimalarial activity.

      (2) The next set of experiments focused on a limited proteolysis approach. Again several proteins were identified as interacting with MIPS2673 including metalloproteases. The authors go on to analyze the LiP-MS data to identify the peptide from PfA-M1 which putatively interacts with MIPS2673. The authors are clearly focused on PfA-M1 as the target, but a further analysis of the other proteins identified by this method would be warranted and would provide evidence to either support or refute the authors' conclusions.

      As PfA-M1 was the only protein reproducibly identified as an interacting protein across both LiP-MS experiments (and by thermal stability proteomics) we focused our analysis on this protein. However, we agree that further analysis of the other putative interacting proteins would be valuable. Additional analysis was performed (see new Fig. S4) on the other interacting proteins identified by thermal stability proteomics and the other interacting proteins identified in LiP-MS experiment one, as no other proteins (apart from PfA-M1) were identified as hits in the second LiP-MS experiment (lines 314-318, 495-505, 740-762 and Fig. S4). Using the common peptides detected across both LiP-MS experiments we mapped significant LiP peptides to the structures of the other putative MIPS2673-interacting proteins, where a structure was available and significant LiP-MS peptides were detected, and measured the minimum distance to expected binding sites. It is noted that when using the same criteria for a significant LiP peptide that we used for our PfA-M1 analysis, only one significant LiP peptide is identified from these other putative interacting proteins (YSPSFMSFK from PfADA). Therefore, we used less stringent criteria for defining significant LiP peptides for these other proteins (see methods and Fig. S4 legend) in order to identify significant LiP peptides to map to structures. This analysis showed that, with the exception of PfA-M17, significant LiP-MS peptides for these other proteins are not significantly closer to binding sites than all other detected peptides, supporting our assertion that these other proteins are likely to be false positives or not functionally relevant MIPS2673 interacting proteins.
Although significant peptides from PfA-M17 were closer to the binding site, our thermal stability and metabolomics data, combined with our previous work on the PfA-M17 enzyme, argue against this being a functionally relevant target (see lines 362-374 and 486-529 for a more detailed discussion). Another possible explanation for this result is that peptide substrates accumulating due to primary inhibition of PfA-M1 interact with PfA-M17, leading to structural changes around the enzyme active site that are detected by LiP-MS.
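
      The "minimum distance to expected binding sites" measurement mentioned in this response can be sketched as the smallest pairwise Euclidean distance between peptide atoms and binding-site atoms. This is an assumed formulation, not the authors' exact procedure: the function name is hypothetical, coordinates are taken to be (x, y, z) tuples in Å, and how the binding site is defined is left outside the sketch.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float, float]


def min_distance(peptide_atoms: Iterable[Point], site_atoms: Iterable[Point]) -> float:
    """Smallest Euclidean distance between any peptide atom and any site atom."""
    peptide = list(peptide_atoms)
    site = list(site_atoms)
    if not peptide or not site:
        raise ValueError("both coordinate sets must be non-empty")
    # Brute-force comparison of all atom pairs; adequate for single peptides.
    return min(math.dist(a, b) for a in peptide for b in site)
```

      Comparing these distances for significant LiP peptides against those for all other detected peptides of the same protein gives the kind of contrast described above.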

      (3) The final set of experiments was an untargeted metabolomics analysis. They identified 97 peptides as significantly dysregulated after MIPS2673 treatment of infected cells and most of these peptides were derived from one of the hemoglobin chains. The accumulation of peptides was consistent with a block in hemoglobin digestion. This experiment does reveal a potential functional confirmation, but questions remain as to specificity.

      As indicated, the accumulation of short peptides identified by metabolomics suggests MIPS2673 perturbs aminopeptidase function. Many of these peptides (but not all) likely map to haemoglobin and are more haemoglobin-like than other proteins in the infected red blood cell proteome. An effect on a subset of non-haemoglobin peptides is also apparent and we have added this to our discussion (also refer to our response to question 4 from Reviewer #2). A direct comparison to our previous metabolomics analysis of a specific PfA-M17 inhibitor (MIPS2571, reference 11) revealed MIPS2673 induces a unique metabolomic profile. The extent of peptide accumulation differed and a subset of short basic peptides (containing Lys or Arg) were elevated only by MIPS2673, consistent with the broad substrate preference of PfA-M1. Importantly, the metabolomics profile induced by MIPS2673 is the opposite of many other antimalarials, which cause depletion of haemoglobin peptides. Taken together, the profile of short peptide accumulation induced by MIPS2673 is consistent with specific inhibition of PfA-M1.

      (4) Overall, this is an interesting series of experiments that have identified a putative inhibitor of PfA-M1 and PvA-M1. The work would be significantly strengthened by structure-aided analysis. It is unclear why putative binding sites cannot be analyzed via specific mutagenesis of the recombinant enzyme.

      Contrary to this comment, we solved the crystal structure of PfA-M1 bound to MIPS2673, determining its binding mechanism to the enzyme. This was further supported through proteomics-based structural analysis by LiP-MS. Undertaking site-specific mutagenesis would be interesting to further probe the binding dynamics of MIPS2673 to the M1 protein. However, we believe it is beyond the scope of this study and would not change our conclusion that MIPS2673 binds to PfA-M1, which we have shown using multiple unbiased proteomics-based methods, enzyme assays and X-ray crystallography.

      (5) In the thermal stability and LiP-MS analysis, other proteins were consistently identified in addition to PfA-M1 and yet no additional analysis was undertaken to explore these as potential targets.

      As addressed in our previous responses, across independent thermal stability proteomics experiments we consistently identified 5 interacting proteins, including the expected target PfA-M1. In contrast, only PfA-M1 was reproducible across independent LiP-MS experiments. While several plausible putative targets (including aminopeptidases and metalloproteins) were identified in one of our LiP-MS experiments, they appear to be false discoveries and not responsible for the antiparasitic activity of MIPS2673, as peptide-level stabilisation was not consistent across independent LiP-MS experiments, and an interaction is refuted by our thermal stability, metabolomics and recombinant enzyme inhibition data. We have now performed further analysis of these other putative interacting proteins, which also argues against them being likely interacting proteins (see also response to question 2). We have also added to our existing discussion on possible MIPS2673 targets and the likelihood of these proteins contributing to antimalarial activity (lines 486-550).

      (6) The metabolomics experiments were potentially interesting, but without significant additional work including different lengths of treatment and different stages of the parasite, the conclusions drawn are overstated. Many treatments disrupt hemoglobin digestion - either directly or indirectly and from the data presented here it is premature to conclude that treatment with MIPS2673 directly inhibits hemoglobin digestion.

      Our metabolomics studies were performed using typical experimental conditions for investigating the antimalarial mechanisms of compounds by metabolomics (see references 11, 39, 40 and 55-57). We used a short 1 h incubation at 3x EC50 allowing us to profile the primary parasite pathways affected by MIPS2673 and avoid a nonspecific death phenotype associated with longer incubations. As addressed in our response to Reviewer #1 (question 2) we focused on trophozoite infected red blood cells as this is the stage most susceptible to MIPS2673 and when one presumes the greatest functional impact would be seen. It is possible that an expanded kinetic metabolomics analysis may reveal secondary mechanisms involved in MIPS2673 activity and we have now acknowledged this in the manuscript (lines 515-516). However, even though secondary mechanisms may become apparent at longer incubations it also becomes difficult to uncouple drug specific responses from nonspecific death effects. We believe any additional information provided by an expanded metabolomics analysis is unlikely to outweigh the significant extra financial cost associated with this type of experiment.

      It is correct that many antimalarial compounds appear to disrupt haemoglobin digestion when analysed by metabolomics. However, as indicated in our manuscript (lines 369-373) and previous responses, the profile of elevated haemoglobin peptides induced by MIPS2673 is substantially different to the profile caused by other antimalarials. For example, artemisinins and mefloquine cause haemoglobin peptide depletion (references 55-57) and chloroquine results in increased levels of a different subset of non-haemoglobin peptides (see Creek et al. 2016). While there is some overlap in profile with a selective M17 inhibitor (our previous work, reference 11), the level of enrichment of these peptides is different and MIPS2673 also induces accumulation of a distinct set of basic peptides consistent with the substrate preference of the PfA-M1 enzyme. As we show that MIPS2673 does not inhibit other parasite aminopeptidases, a likely explanation for the profile overlap is that the build-up of substrates that cannot be processed by PfA-M1 leads to secondary dysregulation of other aminopeptidases. Our analyses (sequence mapping, MS/MS analysis and sequence similarities to all infected red blood cell proteins) strongly indicate that the majority of elevated peptides (but not all) originate from haemoglobin. Combined with our proteomics and recombinant enzyme data indicating direct engagement of PfA-M1, and with previous literature indicating the enzyme functions to cleave amino acids from haemoglobin-derived peptides, our data indicates MIPS2673 likely directly perturbs the haemoglobin digestion pathway through PfA-M1 inhibition.

      (7) Finally, the potency of this compound on parasites grown in vitro is 300 nM - this would need improvements in potency and demonstration of in vivo efficacy in the SCID mouse model to consider this a candidate for a drug.

      We do not propose MIPS2673 as an antimalarial candidate. The experiments presented here were centred on target validation rather than identification of an antimalarial lead, which may be the focus of future studies. To avoid this confusion, we have amended the manuscript title and language throughout to clarify this point.

    2. eLife assessment

      The manuscript makes an important contribution to antimalarial drug discovery, utilizing diverse systems biology methodologies. It focuses on an improved M1 metalloprotease inhibitor and provides compelling evidence for the utility of chemoproteomics in pinpointing PfA-M1 targeting. Additionally, metabolomic analysis reveals specific alterations in the final steps of hemoglobin breakdown. These findings highlight the potential of the developed methodology not only for PfA-M1 targeting but also for other inhibitors targeting various malarial proteins or pathways.

    3. Reviewer #1 (Public Review):

      By using a series of biochemical methods based on proteomic and metabolomic approaches, this study aims at: (1) validating the specific targeting of a biologically active molecule (MIPS2673) towards a defined (and unique?) protein target within a parasite, and (2) exploring whether it is possible to extrapolate which metabolic pathway has been disrupted.

      Strength/Weaknesses

      -The chemoproteomic approach convincingly shows that MIPS2673 more significantly "protects" the putative target (PfA-M1) against thermal degradation or against enzymatic attack (by proteinase K). Proteomic studies are carried out using parasite extracts enriched in late trophozoites (30-38 h pi), and are restricted to the soluble protein fraction.
      -The metabolomic approach documents the ability of MIPS2673 to selectively increase the number of non-hydrolyzed dipeptides in treated versus untreated parasites, further arguing for selective targeting of PfA-M1 and impairment of hemoglobin breakdown by the parasite.
      -The revised version now also considers and further studies the additional putative targets identified by one proteomic approach (but not the other one), which is both more critical of the results obtained and more realistic.

      The work as a whole is highly interesting, both for the specific topic of PfA-M1's role in parasite biology and for the method, applicable to other malarial drug contexts.

    4. Reviewer #2 (Public Review):

      In this manuscript, the authors first developed a new small-molecule inhibitor that could specifically target the M1 metalloproteases of both important malaria parasite species, Plasmodium falciparum and P. vivax. This was done by a chemical modification of a previously developed molecule that targets PfM1 as well as PfM17 and possibly other Plasmodial metalloproteases. After the successful chemical synthesis, the authors showed that the derived inhibitor, named MIPS2673, has a strong antiparasitic activity with IC50 342 nM and it is highly specific for M1. With this in mind, the authors first carried out two large-scale proteomics analyses to confirm the MIPS2673 interaction with PfM1 in the context of the total P. falciparum protein lysate. This was done first by using thermal shift profiling and subsequently limited proteolysis. While the first demonstrated overall interaction, the latter (limited proteolysis) could map more specifically the site of MIPS2673-PfM1 interaction, presumably the active site. Subsequent metabolomics analysis showed that MIPS2673 cytotoxic inhibitory effect leads to the accumulation of short peptides many of which originate from hemoglobin. Based on that the authors argue that the MIPS2673 mode of action (MOA) involves inhibition of hemoglobin digestion that in turn inhibits the parasite growth and development.

      Comments on the revised version:

      The authors addressed all my comments from the previous round of reviews.

    5. Reviewer #3 (Public Review):

      Summary:

      Overall, this is an interesting series of experiments which have identified a putative inhibitor of the Plasmodium M1 alanyl aminopeptidases, PfA-M1 and PvA-M1. The weaknesses include the lack of additional analysis of additional targets identified in the chemoproteomic approaches.

      Strengths:

      The main strengths include the synthesis of MIPS2673 which is selectively active against the enzyme and in whole cell assay.

      Weaknesses:

      The authors have addressed the previously identified weaknesses and have now provided additional data and explanations. They have modified their conclusions to indicate the limitations of their work.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Khan et al. investigated the functional redundancy of the non-canonical L-cysteine synthases of M. tuberculosis, CysM and CysK2, focussing on their role in mitigating the effects of host-derived stress. They found that while deletion mutants of the two synthases (Rv∆cysM, Rv∆cysK2) have similar transcriptomes under standard conditions, their transcriptional response to oxidative stress is distinct. The impact of deleting the synthases also differentially affected the pools of L-cysteine-derived metabolites. They show that the mutants (Rv∆cysM, Rv∆cysK2) have impaired survival in peritoneal macrophages and in a mouse model of infection. Importantly, they show that the survival of the mutants increases when the host is defective in producing reactive oxygen and nitrogen species, linking the phenotype to a defect in combating host-derived stress. Finally, they show that compounds inhibiting L-cysteine synthases reduce the intracellular survival of M. tuberculosis.

      Strengths:

      (1) The distinct transcriptome of the Rv∆cysM and Rv∆cysK2 mutants in the presence of oxidative stress provides solid evidence that these mutants are distinct in their response to oxidative stress, and suggests that they are not functionally redundant.

      (2) The use of macrophages from phox-/- and IFNγ-/- mice and an iNOS inhibitor for the intracellular survival assays provides solid evidence that the survival defect seen for the Rv∆cysM and Rv∆cysK2 mutants is related to their reduced ability to combat host-derived oxidative and nitrosative stress. This is further supported by the infection studies in phox-/- and IFNγ-/- mice.

      Weaknesses:

      (1) There are several previous studies looking at the transcriptional response of M. tuberculosis to host-derived stress, however, the authors do not discuss initial RNA-seq data in the context of these studies. Furthermore, while several of the genes in sulfur assimilation and L-cysteine biosynthetic pathway genes are upregulated by more than one stress condition, the data does not support the statement that it is the "most commonly upregulated pathway in Mtb exposed to multiple host-like stresses".

      We have made changes in the manuscript in line with the reviewer’s suggestion.

      “Thus RNA-Seq data suggest that genes involved in sulfur assimilation and L-cysteine biosynthetic pathway are upregulated during various host-like stresses in Mtb (Figure S2). Given the importance of sulphur metabolism genes in in vivo survival of Mtb [1, 2], it is not surprising that these genes are dynamically regulated by diverse environmental cues. Microarray studies have shown upregulation of genes encoding sulphate transporter upon exposure to hydrogen peroxide and nutrient starvation [3-7]. Similarly, ATP sulfurylase and APS kinase are induced during macrophage infection and by nutrient depletion. Induction of these genes, which coordinate the first few steps of the sulphur assimilation pathway, indicates a probable increase in biosynthesis of sulphate-containing metabolites that may be crucial against host-inflicted stresses. Furthermore, genes involved in synthesis of reduced sulphur moieties (cysH, sirA and cysM) are also induced by hydrogen peroxide and nutrient starvation. Sulfur metabolism has been postulated to be important in transition to latency. This hypothesis is based on transcriptional upregulation of cysD, cysNC, cysK2, and cysM upon exposure to hypoxia. Multiple transcriptional profiling studies have reported upregulation of moeZ, mec, cysO and cysM genes when cells were subjected to oxidative and hypoxic stress [1, 6-11], further suggesting an increase in the biosynthesis of reduced metabolites such as cysteine and methionine and sulfur-containing cell wall glycolipids upon exposure to oxidative stress [12].” We have modified the sentence to “significantly upregulated pathway in Mtb exposed to multiple host-like stresses”.

      (2) For the quantification of the metabolites, it isn't clear how the abundance was calculated (e.g., were standards for each metabolite used? How was abundance normalised between samples?), and this information should be included to strengthen the data.

      Thanks for picking this up. We have extended our description of metabolomics methods. It now reads: “Due to the tendency of M. tuberculosis to form clumps, which significantly skews any cell number estimation, we normalized samples to protein/peptide concentration using the BCA assay kit (Thermo). Therefore, our LC-MS data is expressed as ion counts/mg protein or ratios of that for the same metabolite. This is a standard way to express ion abundance data, as was done previously [13, 14].”
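
      As a minimal sketch of this normalisation (illustrative only; the function name and all numbers below are hypothetical and are not taken from our data), the ion counts for each metabolite are divided by the protein content of the same sample:

```python
def normalize_ion_counts(ion_counts, protein_mg):
    """Express raw ion counts as ion counts per mg protein for one sample.

    ion_counts: dict mapping metabolite name -> raw ion count (hypothetical values)
    protein_mg: protein content of the same sample in mg (e.g. from a BCA assay)
    """
    if protein_mg <= 0:
        raise ValueError("protein amount must be positive")
    return {name: count / protein_mg for name, count in ion_counts.items()}


# Hypothetical sample: values chosen only to make the arithmetic easy to follow.
sample = {"mycothiol": 2.0e6, "ergothioneine": 5.0e5}
normalized = normalize_ion_counts(sample, protein_mg=0.5)
```

      Values normalised this way are comparable between samples for the same metabolite, but not between different metabolites, since ionisation efficiency differs from compound to compound.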

      Furthermore, labelling with L-methionine was performed to determine the rate of synthesis of the L-cysteine-derived metabolites. L-cysteine is produced from L-methionine via the transsulfuration pathway, which is independent of CysM and CysK2. It is therefore difficult to interpret this experiment, as the impact of deleting CysM and CysK2 on the transsulfuration pathway is likely indirect.

      The reviewer may have misunderstood the experiment and the results presented. Labelling was not performed with L-methionine. We used ³⁴S derived from SO₄²⁻ to monitor reductive assimilation of sulfur and its transit from S²⁻ to L-methionine, passing through cysteine. We specified in the materials and methods that we used sodium sulfate-³⁴S (Merck 718882) as our labelled source of sulfur. This method was first employed in M. tuberculosis by the Bertozzi group to identify sulfolipids in mycobacteria. Therefore, we are not measuring transsulfuration, but instead direct synthesis of L-methionine via cysteine, and consequently we are indeed assessing the importance of cysK2 and cysM in this process. We have now added to the results section (page 9) that we employed Na₂³⁴SO₄ for labelling, to make sure other readers will not think we are measuring transsulfuration.

      (3) The ability of L-cysteine to rescue the survival defect of the Rv∆cysM and Rv∆cysK2 mutants in macrophages is interpreted as exogenous L-cysteine being able to compensate for reduced intracellular levels. However, there is no evidence that L-cysteine is being taken up by the mutants and an alternate explanation is that L-cysteine functions as an antioxidant within cells i.e., it reduces intracellular ROS.

      The concentration of L-cysteine used for the peritoneal macrophage survival rescue experiments was titrated so that it conferred no survival advantage on wild-type Rv. Thus, at the given concentration, we believe that a reduction of intracellular ROS by cysteine does not play a major role, since there is no significant difference in the survival of the wild-type Rv strain. Had cysteine reduced intracellular ROS, we would expect increased survival of Rv due to diminished oxidative stress.

      Furthermore, L-cysteine addition also mitigates CHP induced survival defect in vitro [15] and nullifies observed effect of Cysteine inhibitors in vitro [16] suggesting that cysteine or cystine can be transported into Mtb. This has also been previously shown in case of AosR mutant strain [15], CysH [2] and over 70% uptake of exogenously added [35S] cysteine to a growing culture of Mtb [17].

      The authors sought to investigate the functional redundancy of the non-canonical L-cysteine synthases CysM and CysK2. While their distinct transcriptional response to oxidative stress suggests distinct physiological roles, the study did not explore these differences and therefore provides only preliminary insight into the underlying reasons for this observation. In the context of drug development, this work suggests that while L-cysteine synthase inhibitors do not have high potency for killing intracellular M. tuberculosis, they have the potential to decrease the pathogen's survival in the presence of host-derived stress.

      Reviewer #2 (Public Review):

      Summary:

      The paper examines the role L-cysteine metabolism plays in the biology of Mycobacterium tuberculosis. The authors have preliminary data showing that Mycobacterium tuberculosis has two unique pathways to synthesize cysteine. The data showing new compounds that act synergistically with INH is very interesting.

      Strengths:

      RNAseq data is interesting and important.

      Weaknesses:

      The paper would be strengthened if the authors were to add further detail to their genetic manipulations.

      The authors provide evidence that they have successfully made a cysK2 mutant by recombineering. This data looks promising, but I do not see evidence for the cysM deletion. It is also important to state what sort of complementation was done (multicopy plasmid, integration proficient vector, or repair of the deletion). Since these mutants are the basis for most of the additional studies, these details are essential. It is important to include complementation in mouse studies as unexpected loss of PDIM could have occurred.

      The details of CysM knockout generation have been previously published ([15]; Appendix Figure S4), and complementation strain details are provided in the methods section.  

      Reviewer #3 (Public Review):

      In this work, the authors conduct transcriptional profiling experiments with Mtb under various different stress conditions (oxidative, nitrosative, low pH, starvation, and SDS). The Mtb transcriptional responses to these stress conditions are not particularly new, having been reported extensively in the literature over the past ~20 years in various forms. A common theme from the current work is that L-cysteine synthesis genes are seemingly up-regulated by many stresses. Thus, the authors focused on deleting two of the three L-cysteine synthesis genes (cysM and cysK2) in Mtb to better understand the roles of these genes in Mtb physiology.

      The cysM and cysK2 mutants display fitness defects in various media (Sautons media, starvation, oxidative and nitrosative stress) noted by CFU reductions. Transcriptional profiling studies with the cysM and cysK2 mutants revealed that divergent gene signatures are generated in each of these strains under oxidative stress, suggesting that cysM and cysK2 have non-redundant roles in Mtb's oxidative stress response which likely reflects the different substrates used by these enzymes, CysO-L-cysteine and O-phospho-L-serine, respectively. Note that these studies lack genetic complementation and are thus not rigorously controlled for the engineered deletion mutations.

      The authors quantify the levels of sulfur-containing metabolites (methionine, ergothioneine, mycothiol, mycothionine) produced by the mutants following exposure to oxidative stress. Both the cysM or cysK2 mutants produce more methionine, ergothioneine, and mycothionine relative to WT under oxidative stress. Both mutants produce less mycothiol relative to WT under the same condition. These studies lack genetic complementation and thus, do not rigorously control for the engineered mutations.

      Next, the mutants were evaluated in infection models to reveal fitness defects associated with oxidative and nitrosative stress in the cysM or cysK2 mutants. In LPS/IFNg activated peritoneal macrophages, the cysM or cysK2 mutants display marked fitness defects which can be rescued with exogenous cysteine added to the cell culture media. Peritoneal macrophages lacking the NADPH oxidase (Phox) or IFNg fail to produce fitness phenotypes in the cysM or cysK2 mutants suggesting that oxidative stress is responsible for the phenotypes. Similarly, chemical inhibition of iNOS partly abrogated the fitness defect of the cysM or cysK2 mutants. Similar studies were conducted in mice lacking IFNg and Phox establishing that cysM or cysK2 mutants have fitness defects in vivo that are dependent on oxidative and nitrosative stress.

      Lastly, the authors use small molecule compounds to inhibit cysteine synthases. It is demonstrated that the compounds display inhibition of Mtb growth in 7H9 ADC media. No evidence is provided to demonstrate that these compounds are specifically inhibiting the cysteine synthases via "on-target inhibition" in the whole Mtb cells. Additionally, it is wrongly stated in the discussion that "combinations of L-cys synthase inhibitors with front-line TB drugs like INH, significantly reduced the bacterial load inside the host". This statement suggests that the INH + cysteine synthase inhibitor combinations reduce Mtb loads within a host in an infection assay. No data is presented to support this statement.

      We agree with the reviewer that the experiments do not conclusively prove that these compounds specifically inhibit the cysteine synthases via "on-target inhibition" in the whole Mtb cells. However, the inhibitors used in this study have been previously profiled in vitro (https://www.sciencedirect.com/science/article/abs/pii/S0960894X17308405?via%3Dihub). We have modified the sentence to “a combination of L-cysteine synthase inhibitors with front-line TB drugs like INH, significantly reduced the bacterial survival in vitro”.

      References

      (1) Hatzios, S.K. and C.R. Bertozzi, The regulation of sulfur metabolism in Mycobacterium tuberculosis. PLoS Pathog, 2011. 7(7): p. e1002036.

      (2) Senaratne, R.H., et al., 5'-Adenosinephosphosulphate reductase (CysH) protects Mycobacterium tuberculosis against free radicals during chronic infection phase in mice. Mol Microbiol, 2006. 59(6): p. 1744-53.

      (3) Betts, J.C., et al., Evaluation of a nutrient starvation model of Mycobacterium tuberculosis persistence by gene and protein expression profiling. Mol Microbiol, 2002. 43(3): p. 717-31.

      (4) Hampshire, T., et al., Stationary phase gene expression of Mycobacterium tuberculosis following a progressive nutrient depletion: a model for persistent organisms? Tuberculosis (Edinb), 2004. 84(3-4): p. 228-38.

      (5) Schnappinger, D., et al., Transcriptional Adaptation of Mycobacterium tuberculosis within Macrophages: Insights into the Phagosomal Environment. J Exp Med, 2003. 198(5): p. 693-704.

      (6) Voskuil, M.I., et al., The response of mycobacterium tuberculosis to reactive oxygen and nitrogen species. Front Microbiol, 2011. 2: p. 105.

      (7) Voskuil, M.I., K.C. Visconti, and G.K. Schoolnik, Mycobacterium tuberculosis gene expression during adaptation to stationary phase and low-oxygen dormancy. Tuberculosis (Edinb), 2004. 84(3-4): p. 218-27.

      (8) Brunner, K., et al., Profiling of in vitro activities of urea-based inhibitors against cysteine synthases from Mycobacterium tuberculosis. Bioorg Med Chem Lett, 2017. 27(19): p. 4582-4587.

      (9) Manganelli, R., et al., Role of the extracytoplasmic-function sigma factor sigma(H) in Mycobacterium tuberculosis global gene expression. Mol Microbiol, 2002. 45(2): p. 365-74.

      (10) Burns, K.E., et al., Reconstitution of a new cysteine biosynthetic pathway in Mycobacterium tuberculosis. J Am Chem Soc, 2005. 127(33): p. 11602-3.

      (11) Manganelli, R., et al., The Mycobacterium tuberculosis ECF sigma factor sigmaE: role in global gene expression and survival in macrophages. Mol Microbiol, 2001. 41(2): p. 423-37.

      (12) Tyagi, P., et al., Mycobacterium tuberculosis has diminished capacity to counteract redox stress induced by elevated levels of endogenous superoxide. Free Radic Biol Med, 2015. 84: p. 344-354.

      (13) de Carvalho, L.P., et al., Metabolomics of Mycobacterium tuberculosis reveals compartmentalized co-catabolism of carbon substrates. Chem Biol, 2010. 17(10): p. 1122-31.

      (14) Agapova, A., et al., Flexible nitrogen utilisation by the metabolic generalist pathogen Mycobacterium tuberculosis. Elife, 2019. 8.

      (15) Khan, M.Z., et al., Redox homeostasis in Mycobacterium tuberculosis is modulated by a novel actinomycete-specific transcription factor. EMBO J, 2021. 40(14): p. e106111.

      (16) Brunner, K., et al., Inhibitors of the Cysteine Synthase CysM with Antibacterial Potency against Dormant Mycobacterium tuberculosis. J Med Chem, 2016. 59(14): p. 6848-59.

      (17) Wheeler, P.R., et al., Functional demonstration of reverse transsulfuration in the Mycobacterium tuberculosis complex reveals that methionine is the preferred sulfur source for pathogenic Mycobacteria. J Biol Chem, 2005. 280(9): p. 8069-78.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) In Figure S1 it would be useful to include the reverse transsulfuration pathway given that it contributes to the L-cysteine pool, and that L-methionine was used for metabolite labelling experiments.

      We agree with the reviewer’s suggestion and have included reverse transsulfuration in Fig S1. Please note that labelling was not performed with L-methionine. We used ³⁴S derived from SO₄²⁻ to monitor the reductive assimilation of sulfur and its transit from S²⁻ to L-methionine, passing through cysteine. We specified in the materials and methods that we used sodium sulfate-³⁴S (Merck 718882) as our labelled source of sulfur. This method was first employed in M. tuberculosis by the Bertozzi group to identify sulfolipids in mycobacteria. Therefore, we are not measuring transsulfuration but instead a direct synthesis of L-methionine via cysteine, and consequently, we are indeed assessing the importance of cysK2 and cysM in this process. We have now added to the results section (page 9) that we employed Na₂³⁴SO₄ for labelling, to make sure other readers will not think we are measuring transsulfuration.

      Author response image 1.

      (2) In Figure S2 it is unclear why the control is included in this figure given that the stress conditions were compared to the control. What is the control being compared to here?

      The heat maps of controls have been included to demonstrate relative gene expression in each independent replicate. The normalized counts for the differentially expressed genes are plotted. To better understand the RNA-seq results, we plotted the fold change of differentially expressed genes due to different stress conditions (new figure and table: Figure S3 & Table S2). This allowed us to understand the expression profile of genes in all the stress conditions simultaneously, regardless of whether they were identified as differentially expressed. The data revealed that specific clusters of genes are up- and downregulated in oxidative, SDS, and starvation conditions. In comparison, the differences observed in the pH 5.5 and nitrosative conditions were limited (Figure S3 & Table S2).

      (3) In Figure S3 it would be more informative to show fold-enrichment than gene counts in (b) to (f).

      In our opinion, gene counts are more informative when plotting GO enrichments, as the number of genes in each GO category can vary drastically. The significance values are already calculated based on the fold enrichment of a category compared to the background, and hence, p-adj values plotted on the x-axis can be sort of a proxy for fold enrichment. Hence, instead of plotting two related variables, plotting the total gene counts that belonged to a category is usually helpful for the reader in understanding the “scale” in which a category is affected.
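
      To illustrate the relationship with made-up numbers (a sketch only; the function name and every value below are hypothetical and do not come from our dataset), fold enrichment is the observed fraction of hits in a category divided by the fraction expected from the background. A small category can show a high fold enrichment with only a handful of genes, whereas a large category can involve many genes at a modest fold enrichment:

```python
def fold_enrichment(hits_in_category, total_hits, category_size, background_size):
    """Fold enrichment of a category among the hits relative to the background.

    Equivalent to (hits_in_category / total_hits) / (category_size / background_size),
    written as a single division to avoid intermediate rounding.
    """
    if min(total_hits, category_size, background_size) <= 0:
        raise ValueError("total_hits, category_size and background_size must be positive")
    return (hits_in_category * background_size) / (total_hits * category_size)


# Hypothetical example: 200 differentially expressed genes out of 4000 annotated genes.
small_category = fold_enrichment(4, 200, 8, 4000)      # 4 of an 8-gene category
large_category = fold_enrichment(40, 200, 400, 4000)   # 40 of a 400-gene category
```

      In this hypothetical case the small category has the higher fold enrichment even though ten times fewer genes are affected, which is the "scale" information that gene counts convey.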

      (4) Figure 1c standard Sautons is a defined media, and is not nutrient-limiting - the authors should clarify the composition of the media that they used here.

      The composition of the Sautons media used in the study is 0.5 g/L MgSO4.7H2O, 2 g/L citric acid, 1 g/L L-asparagine, 0.3 g/L KCl.H2O, 0.2% glycerol, 0.64 g/L FeCl3, 100 μM NH4Cl and 0.7 g/L K2HPO4.3H2O. We have modified the sentence in line with the reviewer’s suggestion.

      (5) The authors claim that the distinct transcriptomes for the two mutants indicate that "CysM and CysK2 distinctly modulate 324 and 1104 genes". The effect is likely due to distinct downstream consequences of the deletions, rather than direct regulation by the synthases. This section should be reworded for clarity.

      We have modified the sentence in line with reviewer’s suggestion.

      (6) In Figure 3 it would be useful to express mycothione levels as a percentage of the total mycothiol pool to give an indication of the extent to which the thiol is being oxidised.

      While we appreciate the reviewer’s suggestion, we cannot make ratios of ion counts for two different compounds, as they ionize differently. 100 ion counts of one does NOT equal 100 ion counts of the other.

      (7) Figure 6 is difficult to interpret as the concentrations used in the INH + inhibitor wells are not clear. It would be useful to indicate the concentrations of each compound added next to the wells in the figure.

      We have modified the figure and legends in line with the reviewer’s suggestion.

      Reviewer #2 (Recommendations For The Authors):

      (1) Document the cysM deletion.

      The details of CysM knockout generation have been previously published ([15]; Appendix Figure S4), and complementation strain details are provided in the methods section. 

      (2) The oxidative stress CHP is not defined in the figure legend.

      We have modified the legend in line with the reviewer’s suggestion.

      (3) Can we see the structures of the compounds?

      Kindly refer to Fig 6a for the structures of compounds 

      (4) Fix the genetics and the paper is very interesting.

      I might be missing something. The authors do provide promising complementation data for several of the stresses. Provide evidence for the cysM deletion and complementation and the data will be very compelling. The focus of the paper is important for our understanding of the biology of Mycobacterium tuberculosis.

      Thank you for appreciating our study. The details of CysM knockout and complementation strain generation have been previously published ([15]; Appendix Figure S4 & Methods). CysK2 mutant and complementation strain details are included in the present manuscript (Figure 1b & Methods).

      Reviewer #3 (Recommendations For The Authors):

      The transcriptional profiling studies do not rigorously control for the engineered mutations using genetic complementation.

      The complementation strains used in all in vitro, ex vivo, and in vivo experiments showcase that the phenotypes associated with knockouts are gene-specific. We chose not to include complementation strains in the RNA sequencing experiments due to the large number of samples to be handled and the associated costs.

      Figure 3. These data are not rigorously controlled without genetic complementation, explain why some data in Figure 3 was generated at 24 hr and other data was generated at 48 hr, remove subbars in 3g. Please provide more clarification on Fig 3e-g because the normalization in these panels makes it appear as if there is little- or no-difference in the levels of 34S incorporation into the thiol metabolites.

      The complementation strains used in all in vitro, ex vivo, and in vivo experiments showcase that the phenotypes associated with knockouts are gene-specific. We chose not to include complementation strains in the Figure 3 experiments due to the large number of samples to be handled and the associated costs.

      The time points in the given experiment were chosen based on an initial pilot experiment. It is apparent that a longer duration is required to see the phenotypes associated with labelling compared to pool size. The differences observed are statistically significant. 

      Surfactant and SDS stress are used interchangeably in the text, legends, and figures. Please be consistent here.

      We have modified the text in line with the reviewer’s suggestion.

      Consider re-wording the 1st paragraph on page 5 to better clarify how Trp, Lys, and His interact with the host immune cells.

      We have modified the text in line with the reviewer’s suggestion.

      Cite the literature associated with the sulfur import system in Mtb on page 3 in the 2nd paragraph.

      We have modified the text in line with the reviewer’s suggestion.

      The manuscript nicely describes the construction of a cysK2 mutant. It is unclear how the cysM mutant was generated. Please clarify, cite, or add the cysM mutant construction to this manuscript.

      The details of CysM knockout and complementation strain generation have been previously published ([15]; Appendix Figure S4 & Methods). We have included the citation in the methods section of the current manuscript.

      Provide evidence that the small molecules used in Fig 6 are on target and inhibit the cysteine biosynthetic enzymes in whole bacteria. It is unclear how a MIC can be determined with these compounds in 7H9 ADC when deletion mutants grow just fine in this media. Is this because the compounds inhibit multiple cysteine synthesis enzymes and/or enzymatic targets in other pathways? To me, the data suggests that the compounds are hitting multiple enzymes in whole Mtb cells. Does cysteine supplementation reverse the inhibitory profiles with the compounds in Figure 6?

      As mentioned in the text, all the compounds were ineffective in killing Mtb, likely because L-cysteine synthases are not essential during regular growth conditions. Hence, the MICs for the cysteine synthase inhibitors were very high, C1 (0.6 mg/ml), C2 (0.6 mg/ml), and C3 (0.15 mg/ml), as opposed to the standard drug, isoniazid, with an MIC of 0.06 µg/ml. We agree with the reviewer that the experiments do not conclusively prove that these compounds specifically inhibit the cysteine synthases via "on-target inhibition" in Mtb cells. The inhibitors used in this study have been previously profiled in vitro [8]. However, one cannot rule out the hypothesis that these compounds might also have some off-target effects.
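
      Because the inhibitor MICs are quoted in mg/ml and the isoniazid MIC in µg/ml, the size of the gap is easy to misread. The following is a minimal sketch of the unit conversion, using only the values quoted above; the variable and function names are illustrative.

```python
# MIC values as quoted in the response above.
MIC_MG_PER_ML = {"C1": 0.6, "C2": 0.6, "C3": 0.15}  # inhibitors, mg/ml
INH_MIC_UG_PER_ML = 0.06                            # isoniazid, ug/ml


def fold_over_isoniazid(mic_mg_per_ml, inh_ug_per_ml=INH_MIC_UG_PER_ML):
    """Return how many times higher an inhibitor MIC is than the
    isoniazid MIC, after converting mg/ml to ug/ml."""
    mic_ug_per_ml = mic_mg_per_ml * 1000.0  # 1 mg = 1000 ug
    return mic_ug_per_ml / inh_ug_per_ml


# Round to whole numbers to avoid floating-point noise in the report.
folds = {name: round(fold_over_isoniazid(mic))
         for name, mic in MIC_MG_PER_ML.items()}

for name, fold in folds.items():
    print(f"{name}: about {fold}-fold higher MIC than isoniazid")
```

      On these numbers, the inhibitor MICs are three to four orders of magnitude above that of isoniazid, consistent with the statement that the compounds are ineffective under regular growth conditions.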

    2. eLife assessment

      Sulphur atoms derived from cysteine are thought to play significant roles in maintaining redox homeostasis in Mycobacterium tuberculosis, which encounters stresses associated with immune cell interactions. In this valuable manuscript, the authors provide solid evidence that the genes encoding cysteine biosynthetic enzymes (cysM and cysK2) are required to maintain full viability of M. tuberculosis under in vitro stress conditions, macrophage infections, and within the lung tissues of mice. The manuscript presents transcriptomic and metabolomic evidence to support the hypothesis that CysM and CysK2 play distinct roles in maintaining cysteine-derived metabolite pools under stress conditions. The work will be of interest to microbiologists in general.

    3. Reviewer #1 (Public Review):

      Summary:

      Khan et al. investigated the functional redundancy of the non-canonical L-cysteine synthases of M. tuberculosis, CysM and CysK2, focussing on their role in mitigating the effects of host-derived stress. They found that while deletion mutants of the two synthases (Rv∆cysM, Rv∆cysK2) have similar transcriptomes under standard conditions, their transcriptional response to oxidative stress is distinct. The impact of deleting the synthases also differentially affected the pools of L-cysteine-derived metabolites. They show that the mutants (Rv∆cysM, Rv∆cysK2) have impaired survival in peritoneal macrophages and in a mouse model of infection. Importantly, they show that the survival of the mutants increases when the host is defective in producing reactive oxygen and nitrogen species, linking the phenotype to a defect in combating host-derived stress. Finally, they show that compounds inhibiting L-cysteine synthases reduce intracellular survival of M. tuberculosis.

      Strengths:

      (1) The distinct transcriptome of the Rv∆cysM and Rv∆cysK2 mutants in the presence of oxidative stress provides solid evidence that these mutants are distinct in their response to oxidative stress, and suggests that they are not functionally redundant.

      (2) The use of macrophages from phox-/- and INF-/- mice and an iNOS inhibitor for the intracellular survival assays provides solid evidence that the survival defect seen for the Rv∆cysM and Rv∆cysK2 mutants is related to their reduced ability to combat host-derived oxidative and nitrosative stress. This is further supported by the infection studies in phox-/- and INF-/- mice.

      Weaknesses:

      Inclusion of the complemented strains in the metabolite study would strengthen the data. Furthermore, using an alternate method to quantify the MSH:MSSM ratio would provide insight into the redox homoeostasis in mutants in the presence and absence of CHP to support the statement that "deletion or inhibition of CysM or CysK2 perturbs redox homeostasis of Mtb".

      The authors sought to investigate the functional redundancy of the non-canonical L-cysteine synthases CysM and CysK2. While their distinct transcriptional response to oxidative stress suggests distinct physiological roles, the study did not explore these differences, and therefore provides only preliminary insight into the underlying reasons for this observation. In the context of drug development, this work suggests that while L-cysteine synthase inhibitors do not have high potency for killing intracellular M. tuberculosis, they have potential for decreasing the pathogen's survival in the presence of host-derived stress.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This study advances our understanding of the allosteric regulation of anaerobic ribonucleotide reductases (RNRs) by nucleotides, providing valuable new structural insight into class III RNRs containing ATP cones. The cryo-EM structural characterization of the system is solid, but some open questions remain about the interpretation of activity/binding assays and the newly incorporated HDX-MS results. The work will be of interest to biochemists and structural biologists working on ribonucleotide reductases and other allosterically regulated enzymes.

      Public Reviews:

      Reviewer #1 (Public Review):

      The goal of this study is to understand the allosteric mechanism of overall activity regulation in an anaerobic ribonucleotide reductase (RNR) that contains an ATP-cone domain. Through cryo-EM structural analysis of various nucleotide-bound states of the RNR, the mechanism of dATP inhibition is found to involve order-disorder transitions in the active site. These effects appear to prevent binding of substrate and a radical transfer needed to initiate the reaction.

      Strengths of the manuscript include the comprehensive nature of the work - including both numerous structures of different forms of the RNR and detailed characterization of enzyme activity to establish the parameters of dATP inhibition. The manuscript has been improved in a revision by performing additional experiments to help corroborate certain aspects of the study. But these new experiments do not address all of the open questions about the structural basis for mechanism. Additionally, some questions about the strength of biochemical data and fit of binding or kinetic curves to data that were raised by other referees still remain. Some experimental observations are not consistent with the proposed model. For example, why does dATP enhance Gly radical formation when the proposed mechanism of dATP inhibition involves disorder in the Gly radical domain?

      The work is impactful because it reports initial observations about a potentially new mode of allosteric inhibition in this enzyme class. It also sets the stage for future work to understand the molecular basis for this phenomenon in more detail.

      We express our gratitude to the reviewer for dedicating time to review our work and for the overall favorable assessment. We agree that the question of exactly how much the glycyl radical domain becomes more mobile without losing the glycyl radical entirely is an unresolved one but we also think that our work sets a solid basis for future experiments by us and others.

      Reviewer #3 (Public Review):

      The manuscript by Bimai et al. describes a structural and functional characterization of an anaerobic ribonucleotide reductase (RNR) enzyme from the human microbe, P. copri. More specifically, the authors aimed to characterize the mechanism by which (d)ATP modulates nucleotide reduction in this anaerobic RNR, using a combination of enzyme kinetics, binding thermodynamics, and cryo-EM structural determination, complemented by hydrogen-deuterium exchange (HDX). One of the principal findings of this paper is the ordering of a NxN 'flap' in the presence of ATP that promotes RNR catalysis and the disordering (or increased protein dynamics) of both this flap and the glycyl radical domain (GRD) when the inhibitory effector, dATP, binds. The latter is correlated with a loss of substrate binding, which is the likely mechanism for dATP inhibition. It is important to note that the GRD is remote (>30 Ang) from the binding site of the dATP molecule, suggesting long-range communication of the structural (dis)ordering. The authors also present evidence for a shift in oligomerization in the presence of dATP. The work does provide evidence for new insights/views into the subtle differences of nucleotide modulation (allostery) of RNR, in a class III system, through long-range interactions.

      The strengths of the work are the impressive, in-depth structural analysis of the various regulated forms of PcRNR by (d)ATP using cryo-EM. The authors present seven different models in total, with striking differences in oligomerization and (dis)ordering of select structural features, including the GRD that is integral to catalysis. The authors present several, complementary biochemical experiments (ITC, MST, EPR, kinetics) aimed at resolving the binding and regulatory mechanism of the enzyme by various nucleotides. The authors present a good breadth of the literature in which the focus of allosteric regulation of RNRs has been on the aerobic orthologues.

      The addition of hydrogen-deuterium exchange mass spectrometry (HDX-MS) complements the results originating from cryo-EM data. Most notable is the observation of the enhanced exchange (albeit quite subtle) of the GRD domain in the presence of dATP that matches the loss of structural information in this region in the cryo-EM data. The most pronounced and compelling HDX results are seen in the form of dATP-induced protection of peptides immediately adjacent to the β-hairpin at the s-site, where dATP is expected to bind based on cryo-EM. It is clear that the presence of dATP increases the rigidity of this region.

      We are happy that both reviewers find the HDX-MS experiments to be a valuable addition to the existing data.

      Weaknesses:

      The discussion of the change in peptide mobility in the N-terminal region is complicated by the presence of bimodal mass spectral features, and this may prevent detailed interpretation of the data, especially for the select peptide region that shows opposite trends upon nucleotide association.

      Further, the HDX data in the NxN flap is unchanged upon nucleotide binding (ATP, dATP, or CTP), despite changes observed in the cryo-EM data.

      We are grateful to the reviewer for the comprehensive feedback on the HDX-MS part and for identifying areas for improvement. The HDX analysis was of course undertaken with the intention of identifying differences in disorder of the NxN flap and GRD region. From an HDX perspective, both regions were found to be highly susceptible to HDX regardless of state/ligand, due to surface accessibility and/or very fast dynamics. However, this does not mean that there is no difference in the degree of order of these regions upon ligand addition, simply that with HDX-MS, in the limited time span of 30-3000 seconds, we could not conclusively support an increased disorder. We have rephrased the discussion text to reflect this fact.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      On page 5 (and throughout the manuscript) there are some inconsistencies in how dissociation constants for effectors and inhibitors are described - for example, D in KD is sometimes subscripted and sometimes not.

      Thank you for noticing these remaining errors. We hope that we have fixed all of them now.

      Reviewer #3 (Recommendations For The Authors):

      The authors addressed many of the initial concerns raised. The addition of the HDX-MS data in this revision is a welcomed contribution to the work and complements the cryo-EM data. In select cases, the data may be over-interpreted. This reviewer suggests that the authors revise the text in this section so that it is more consistent with the presented data.

      Specific points:

      (1) The bimodal mass spectral features in the N-terminal domain complicate the data interpretation. Specifically for peptides in 81-99 region, the fast exchanging feature shows protection in the presence of (d)ATP/CTP, but the opposite trend is observed for the slow exchanging species. It is therefore advisable to not make absolutes about the HDX results in this region, as the data are complicated.

      As stated by the reviewer, it is not possible from the presented HDX data to deduce if this is a result of 50% loaded dimer or the oligomerization state of the protein. We have remedied this by removing mentions of a difference between the dATP and ATP in bimodality. Also, we have addressed this in the text by stating that the main reason is most likely the different oligomerization states present in solution. Nevertheless, it is clear from the HDX data that the N-terminal region and 81-99 are very interesting, and it was somewhat disappointing that due to the dynamics of the oligomerization it was not possible to SEC-purify pure dimer or tetramer samples for HDX-MS, in order to deconvolute the cause.

      (2) Related to #1, the authors assign the bimodal HDX behavior to EX1 mechanism, but this is not necessarily (and unlikely) true based on the limited time points. The authors also state that it originates from the heterogeneity of the sample: "a mixture of states" which could reflect the mixture of oligomerization states. The authors should be careful assigning EX1 mechanism unless there are compelling results to support it.

      We apologize for the unfortunate phrasing. It was not our intention to imply that the bimodality is due to true EX1 kinetics. See the above answer. The mention of EX1 has been removed from the discussion text.

      (3) The deuterium uptake for peptide 118-126 is very small (~1Da) compared to the length of the peptide. The change in deuterium uptake (<0.25Da) from dATP is very small; the authors should proceed with caution when presenting interpretations of such small differences.

      We agree with the reviewer that extra caution should be taken when dealing with such a small difference. However, the 118-126 peptide has been tested for significance in both HDExaminer and Deuteros 2.0, and we also observed this difference in more than one run. The difference in uptake is small but increases to significance at the longer labelling times. The proximity to the NxN flap makes it interesting in the context of an allosteric conformational change; that is, the dynamics of the NxN flap might be too fast, so we can only see some secondary effects. We would like to keep the data in Figure 10 for reasons of transparency. In essence, this is similar to the observed bimodality mentioned above: we cannot fully explain the observation but present the data as they were observed.

      (4) On p. 22, the authors should consider revising the following statement: "confirming dATP binding to the s-site." Even though the HDX data are most compelling for the protection of peptides 178-204 and 330-348 that are adjacent to the beta-hairpin at the s-site, these data cannot "confirm" a binding site for a small molecule, such as dATP.

      We appreciate that the reviewer has pointed out that the statement can be misleading, and we agree that the binding site of small molecules can’t be confirmed based solely on HDX data. The sentence has been reformulated to clarify that the binding site was confirmed based on the combined evidence of HDX data and the previously presented biochemical and structural data on the s-site.

    2. eLife assessment

      This study advances our understanding of the allosteric regulation of anaerobic ribonucleotide reductases (RNRs) by nucleotides, providing valuable new structural insight into class III RNRs containing ATP cones. The cryo-EM structural characterization of the system is solid, but some open questions remain about the interpretation of activity/binding assays and the HDX-MS results that have been newly incorporated compared to a previous version. The work will be of interest to biochemists and structural biologists working on ribonucleotide reductases and other allosterically regulated enzymes.

    3. Reviewer #3 (Public Review):

      The manuscript by Bimai et al. describes a structural and functional characterization of an anaerobic ribonucleotide reductase (RNR) enzyme from the human microbe, P. copri. More specifically, the authors aimed to characterize the mechanism by which (d)ATP modulates nucleotide reduction in this anaerobic RNR, using a combination of enzyme kinetics, binding thermodynamics, and cryo-EM structural determination, complemented by hydrogen-deuterium exchange (HDX). One of the principal findings of this paper is the ordering of a NxN 'flap' in the presence of ATP that promotes RNR catalysis and the disordering (or increased protein dynamics) of both this flap and the glycyl radical domain (GRD) when the inhibitory effector, dATP, binds. The latter is correlated with a loss of substrate binding, which is the likely mechanism for dATP inhibition. It is important to note that the GRD is remote (>30 Ang) from the binding site of the dATP molecule, suggesting long-range communication of the structural (dis)ordering. The authors also present evidence for a shift in oligomerization in the presence of dATP. The work does provide evidence for new insights/views into the subtle differences of nucleotide modulation (allostery) of RNR, in a class III system, through long-range interactions.

      The strengths of the work are the impressive, in-depth structural analysis of the various regulated forms of PcRNR by (d)ATP using cryo-EM. The authors present seven different models in total, with striking differences in oligomerization and (dis)ordering of select structural features, including the GRD that is integral to catalysis. The authors present several, complementary biochemical experiments (ITC, MST, EPR, kinetics) aimed at resolving the binding and regulatory mechanism of the enzyme by various nucleotides. The authors present a good breadth of the literature in which the focus of allosteric regulation of RNRs has been on the aerobic orthologues.

      The addition of hydrogen-deuterium exchange mass spectrometry (HDX-MS) complements the results originating from cryo-EM data. Most notable is the observation of the enhanced exchange (albeit quite subtle) of the GRD domain in the presence of dATP that matches the loss of structural information in this region in the cryo-EM data. The most pronounced and compelling HDX results are seen in the form of dATP-induced protection of peptides immediately adjacent to the β-hairpin at the s-site, where dATP is expected to bind based on cryo-EM. It is clear that the presence of dATP increases the rigidity of this region.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Valk and Engert et al. examined the potential relations between three different mental training modules, hippocampal structure and functional connectivity, and cortisol levels (stress) over a 9-month period. They found that among the three types of mental training: Presence (attention and introspective awareness), Affect (socio-emotional - compassion and prosocial motivation), and Perspective (socio-cognitive - metacognition and perspective taking) modules; Affect training most robustly related to changes in hippocampal structure and function - specifically, CA1-3 subfields of the hippocampus. Moreover, change in intrinsic functional connectivity related to changes in diurnal cortisol release and long-term cortisol exposure. These changes are proposed to result from a combination of factors, which is supported by multivariate analyses showing changes across subfields and training content relate to cortisol changes.

      The authors demonstrate that mindfulness training programs are a potential avenue for stress interventions that impact hippocampal structure and cortisol, providing a promising approach to improve health. The data contribute to the literature on plasticity of hippocampal subfields during adulthood, the impact of mental training interventions on the brain, and the link between CA1-3 and both short- and long-term stress changes.

      The authors thoughtfully approached the study of hippocampal subfields, utilizing a method designed for T1w images that outperformed Freesurfer 5.3 and that produced comparable results to an earlier version of ASHS. The authors note the limitations of their approaches and provide detailed information on the data used and analyses conducted. The results provide a strong basis from which future studies can expand using computational approaches or more fine-grained investigations of the impact of mindfulness training on cortisol levels and the hippocampus.

      We thank the Reviewer for the positive re-evaluation and summary of our findings and work. We made additional changes as suggested and hope this clarifies any open points.

      I have a few additional suggestions. Clarifying the language around the multivariate results and the impact across subfields and training modules would be helpful. 

      We are happy to provide further clarifications with respect to the multivariate results and the impact of training on subfields.

      The multivariate analyses served as a final step to explore any potential connections between training modules and hippocampal subfields, beyond just the link between CA1-3 and the Affect Module. These additional analyses were suggested by the Reviewers, and we, as authors, agreed that taking a broader view of how different parts of the hippocampus interact with overall changes can provide valuable insights into the relationship between mental training, cortisol fluctuations, and changes in CA1-3 subfields.

      We employed a multivariate partial least squares method, which aims to identify the directions in the predictor space that account for the most variance in changes observed, by creating latent variables. Initially, we investigated whether there was a general connection between CA1-3 subfields and cortisol changes, regardless of which training module produced these effects. Our findings confirmed a consistent relationship across all three training modules, indicating a strong association between cortisol changes, particularly markers such as AUC and slope change, and alterations in CA1-3 structure and functional connectivity. We explored a model incorporating changes across all hippocampal subfields and stress markers across different modules. In the right hemisphere, changes in the volume of the CA1-3 subfield were more strongly associated with stress markers, compared to other subfields. However, this association was less pronounced in the left hemisphere.

      Our multivariate approach captured fluctuations across subfields and modules beyond group-level associations, leading to a more nuanced interpretation. While the univariate analysis of module-specific changes in volume and associations within the Affect Module may offer a straightforward interpretation, as they coincide with increases in CA1-3 volume, the multivariate analysis also accounts for individual-level changes not observed at the group level using a data-driven approach. Overall these findings are in line with the group-level observations, yet provide nuance on specificity.
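
      To make the idea of a latent variable concrete, the following is a minimal sketch of the first component of a single-response partial least squares (PLS1) model: the weight vector is the direction in predictor space with maximal covariance with the response. The data are synthetic, the predictor columns are hypothetical stand-ins for subfield changes, and the function names are illustrative; this is not the analysis code used in the study, whose exact PLS variant is not specified here.

```python
import math


def center(values):
    """Subtract the mean from a list of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def first_pls_component(X, y):
    """Return (weights, scores) of the first PLS1 latent variable.

    X is a list of rows (samples x predictors), y a list of responses.
    The weight vector is proportional to X^T y on centered data, i.e. the
    predictor direction with maximal covariance with the response.
    """
    n, p = len(X), len(X[0])
    cols = [center([row[j] for row in X]) for j in range(p)]
    yc = center(y)
    w = [sum(col[i] * yc[i] for i in range(n)) for col in cols]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0:
        raise ValueError("predictors have zero covariance with the response")
    w = [v / norm for v in w]
    # Latent scores: projection of each centered sample onto the weights.
    scores = [sum(cols[j][i] * w[j] for j in range(p)) for i in range(n)]
    return w, scores


# Synthetic example (NOT study data): three hypothetical predictors,
# e.g. volume change in three subfields, and one hypothetical stress marker.
X = [
    [1.0, 0.2, 0.5],
    [2.0, 0.1, 0.4],
    [3.0, 0.3, 0.6],
    [4.0, 0.2, 0.5],
    [5.0, 0.1, 0.4],
]
y = [-1.0, -2.1, -2.9, -4.2, -5.0]

weights, scores = first_pls_component(X, y)
print("weights:", [round(w, 3) for w in weights])
```

      In this toy example the first predictor dominates the weight vector because it covaries most strongly with the response, which mirrors how a latent variable can highlight one subfield while still drawing on the others.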

      We clarified these considerations further in the manuscript:

      Abstract:

      “Notably, using a multivariate approach, we found that other subfields that did not show group-level changes also contributed to changes in cortisol levels.”

      Results:

      “We employed a multivariate partial least squares method, which aims to identify the directions in the predictor space that account for the most variance in changes observed, by creating latent variables. Initially, we investigated whether there was a general connection between CA1-3 subfields and cortisol changes, regardless of which training module produced these effects.”

      Discussion:

      “Finally, through conducting multivariate analysis, we once more noticed associations between changes in CA1-3 volume and functional adaptability and alterations in stress levels, particularly prominent within the Affect Module. Integrating all subfields into a unified model highlighted a distinct significance of CA1-3, although for the left hemisphere, we observed a more diverse range of contributions across subfields. In summary, we establish a connection between a socio-emotional behavioral intervention, shifts in hippocampal subfield structure and function, and decreases in cortisol levels among healthy adults.

      Although the univariate examination of changes specific to modules in volume and connections within the Affect Module presents how changes in cortisol align with group-level rises in CA1-3 volume, the multivariate analysis extended this observation through considering individual-level alterations not discernible at the group level through a data-driven method. These results generally corresponded with observations at the group level but offer additional insights into specificity, and hint at system-level alterations.”

    2. eLife assessment

      This important work examines the potential utility of socio-emotional and socio-cognitive mental training on hippocampal subfield structure and function, and cortisol levels. The authors provide convincing evidence that CA1-3 volume is sensitive to socio-emotional training, with changes related to function plasticity and cortisol levels. Further, the authors provide evidence of change across all subfields and training modules related to stress.

    3. Reviewer #1 (Public Review):

      Valk and Engert et al. examined the potential relations between three different mental training modules, hippocampal structure and functional connectivity, and cortisol levels (stress) over a 9-month period. They found that among the three types of mental training: Presence (attention and introspective awareness), Affect (socio-emotional - compassion and prosocial motivation), and Perspective (socio-cognitive - metacognition and perspective taking) modules; Affect training most robustly related to changes in hippocampal structure and function - specifically, CA1-3 subfields of the hippocampus. Moreover, change in intrinsic functional connectivity related to changes in diurnal cortisol release and long-term cortisol exposure. These changes are proposed to result from a combination of factors, which is supported by multivariate analyses showing changes across subfields and training content relate to cortisol changes.

      The authors demonstrate that mindfulness training programs are a potential avenue for stress interventions that impact hippocampal structure and cortisol, providing a promising approach to improve health. The data contribute to the literature on plasticity of hippocampal subfields during adulthood, the impact of mental training interventions on the brain, and the link between CA1-3 and both short- and long-term stress changes.

      The authors thoughtfully approached the study of hippocampal subfields, utilizing a method designed for T1w images that outperformed Freesurfer 5.3 and that produced comparable results to an earlier version of ASHS. The authors note the limitations of their approaches and provide detailed information on the data used and analyses conducted. The results provide a strong basis from which future studies can expand using computational approaches or more fine-grained investigations of the impact of mindfulness training on cortisol levels and the hippocampus.

    1. Reviewer #3 (Public Review):

      Summary:

      This study profiled the single-cell transcriptome of human spermatogenesis and provided many potential molecular markers for developing testicular puncture-specific marker kits for NOA patients.

      Strengths:

      The authors performed single-cell RNA sequencing (scRNA-seq) and single-cell assay for transposase-accessible chromatin sequencing (scATAC-seq) on testicular tissues from two OA patients and three NOA patients.

      Weaknesses:

      Most results are analytical and lack specific experiments to support these analytical results and hypotheses.

    1. eLife assessment

      This manuscript provides an important advance in our understanding of the molecular events that promote osteoclast fusion. Compelling data support the conclusion that an oxidized form of the ubiquitous protein La promotes osteoclast fusion following enrichment at the cell surface of osteoclast progenitors. These data improve our understanding of the processes that regulate bone resorption and will be of broad interest to researchers in the fields of cell biology and musculoskeletal physiology.

    2. Reviewer #1 (Public Review):

      In this manuscript, Leikina et al. investigate the role of redox changes in the ubiquitous protein La in the promotion of osteoclast fusion. In a recently published manuscript, the investigators found that osteoclast multinucleation and resorptive activity are regulated by a de-phosphorylated and proteolytically cleaved form of the La protein that is present on the cell surface of differentiating osteoclasts. In the present work, the authors build upon these findings to determine the physiologic signals that regulate La trafficking to the cell membrane and ultimately, the ability of this protein to promote fusion. Building upon other published studies that show (1) that intracellular redox signaling can elicit changes in the conformation and localization of La, and (2) that osteoclast formation is dependent on ROS signaling, the authors hypothesize that oxidation of La in response to intracellular ROS underlies the re-localization of La to the cell membrane and that this is necessary for its pro-fusion activity. The authors test this hypothesis in a rigorous manner using antioxidant treatments, recombinant La protein, and modification of cysteine residues predicted to be key sites of oxidation. Osteoclast fusion is then monitored in each condition using fluorescence microscopy. These data strongly support the conclusion that oxidized La is de-phosphorylated, increases in abundance at the cell surface of differentiating osteoclasts, and promotes cell-cell fusion. A strength of this manuscript is the use of multiple complementary approaches to test the hypothesis, especially the use of Cys mutant forms of La to directly tie the observed phenotypes to changes in residues that are key targets for oxidation. The manuscript is also well-written and describes a clearly articulated hypothesis based on a precise summation of the existing literature.
      The findings of this manuscript will be of interest to researchers in the field of bone biology, but also more generally to cell biologists. The data in this manuscript may also lead to future studies that target La for bone diseases in which there is increased osteoclast activity. The weaknesses of the manuscript are minor and predominantly relate to data presentation choices. These weaknesses do not detract from the overall conclusions of the study.

    3. Reviewer #2 (Public Review):

      Summary:

      Bone resorption by osteoclasts plays an important role in bone modeling and homeostasis. Multinucleated mature osteoclasts have a higher bone-resorbing capacity than their mononuclear precursors. Previous work by the authors identified that an increased cell-surface level of La protein promotes the fusion of mononuclear osteoclast precursor cells to form fully active multinucleated osteoclasts. In the present study, the authors provide convincing data from cellular and biochemical experiments to demonstrate that La protein, which is localized to the nucleus where it regulates RNA metabolism, is oxidized by redox signaling during osteoclast differentiation, and that the modified La protein is translocated to the osteoclast surface, where it associates with other proteins and phospholipids to trigger the cell-cell fusion process. The work provides novel mechanistic insights into osteoclast biology and provides a potential therapeutic target to suppress excessive bone resorption in metabolic bone diseases such as osteoporosis and arthritis.

      Strengths:

      Increased intracellular ROS induced by the osteoclast differentiation cytokine RANKL has been widely studied for its role in enhancing RANKL signaling during osteoclast differentiation. The work provides novel evidence that redox signaling can post-translationally modify proteins to alter the translocation and functions of critical regulators in the late stage of osteoclastogenesis. The results and conclusions are mostly supported by convincing cellular and biochemical assays.

      Weaknesses:

      The lack of in vivo studies in animal models of bone diseases such as postmenopausal osteoporosis, inflammatory arthritis, and osteoarthritis reduces the translational potential of this work.

    1. eLife assessment

      This important study employs an innovative genetic selection-based approach to identify short peptide sequences that target bacterial proteins for degradation. Using random mutagenesis, the authors identified 5-amino-acid-long "degrons" that target the toxin VapC for degradation, permitting survival. They provide compelling data that degrons ending in Ala-Ala are selectively recognized by the ClpXP protease and identify the sequence FKLVA as a particularly significant target. As a whole, there is enthusiasm about the authors' findings, although there are also some improvements that could be made to increase the clarity and impact, mostly in the form of revisions to the text.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript by Beardslee and Schmitz, the authors undertook a screen for potential degrons - short peptide sequences at the C-terminus that would target the toxin VapC for degradation. The authors randomly mutagenized 5 amino acids appended to the C-terminus of VapC and transformed this library into E. coli to look for surviving cells when the VapC gene was expressed. The authors found an enrichment for tags ending in Ala-Ala, and found that this enrichment was dependent on the presence of the ClpXP protease, since this sequence was not similarly enriched in a mutant lacking this protease. Moreover, the authors identify the sequence FKLVA as the tag with the highest fold enrichment in the screen and confirm that GFP tagged with this sequence is degraded by ClpXP with similar kinetics to GFP tagged with the ssrA-derived tag.

      Strengths:

      This study has two major implications for understanding the nature of degrons in E. coli. First, peptides ending in Ala-Ala, and especially degrons resembling the ssrA degron, are likely the most degradation-promoting sequences in E. coli. Second, these findings suggest that ClpXP is the most central protease, at least for this particular protein with a randomized C-terminus under the particular conditions of this screen. It is also notable that the ribosome quality control protein RqcH tags truncated proteins with an alanine tag in a template-free manner when the large ribosomal subunit is obstructed. Although E. coli doesn't encode RqcH, the utility of alanine-tagging for protein degradation likely extends to other organisms.

      Weaknesses:

      The authors remark and show that mutations that inactivate the VapC protein are enriched potentially more than the proteolysis tags. This is a limitation of the study and the authors have done well to describe this as it will inform future screens. Perhaps using a protein with more intermediate toxicity in future screens would help to prioritize C-terminal mutations instead of toxin-inactivating mutations.

      For clarity, the authors should explain why the NNK structure of the random codons was used. Why is it important that the codon end with a G or T?
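
As background to this question, the usual rationale for NNK randomization can be checked by enumeration. The following is a minimal sketch, assuming the standard genetic code and the IUPAC convention that N is any base and K is G or T; the function name `summarize` is illustrative and not taken from the manuscript. It compares NNK with fully random NNN codons by codon count, stop codons, and amino acids covered.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (NCBI translation table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def summarize(third_position_bases):
    """Return (codon count, stop codons, number of amino acids) for a
    degenerate codon with N at positions 1 and 2 and the given bases at
    position 3."""
    codons = ["".join(c) for c in product(BASES, BASES, third_position_bases)]
    stops = sorted(c for c in codons if CODON_TABLE[c] == "*")
    amino_acids = {CODON_TABLE[c] for c in codons} - {"*"}
    return len(codons), stops, len(amino_acids)


nnn = summarize("TCAG")  # N at the third position: any base
nnk = summarize("GT")    # K at the third position: G or T

print("NNN:", nnn)
print("NNK:", nnk)
```

Under these assumptions, NNK halves the number of codons while still covering all 20 amino acids, and leaves TAG as the only possible stop codon, which lowers the fraction of library members truncated within the randomized tag.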

      Authors state on page 7 that by determining enrichment of individual tags they can rank the relative Km for proteolysis of the individual tags. This statement is not accurate since the tag could variously impact its association with any of the proteases in the cell. Since Km is specific to each particular protease, these can't be ranked in vivo when all proteases are present.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors studied the sequence determinants of C-terminal tags that govern protein degradation in bacteria. They introduce a new strategy to determine degron sequences: Detox (Degron Enrichment by toxin). This unbiased approach links degron efficiency to cell growth as degrons are C-terminally fused to the toxin VapC, which inhibits protein translation. Selecting for bacterial growth and thus toxin degradation enabled the identification of potent degrons derived from a randomized library of pentapeptides. Remarkably, most degrons show sequence similarity to the SsrA-tag, which is fused to incomplete polypeptides at stalled ribosomes by the tmRNA-tagging system. These findings underline the extraordinary efficiency of the SsrA-tag and the ClpXP protease in removing incomplete polypeptides and demonstrate that most proteins are spared from degradation by harboring different C-termini. The introduced method will be highly useful to determine degron sequences in other positions and other bacterial species.

      Strengths:

      The work introduces an innovative and powerful strategy to identify degron sequences in bacteria. The study is well-controlled and results have been thoroughly analyzed. It will now become important to broaden the technology, making it also accessible for more complex degrons.

      Weaknesses:

      The approach is efficient in identifying strong degron sequences that are predominantly recognized by the ClpXP protease. The sequence specificity of other proteolytic systems, however, is not efficiently addressed, pointing to a potential limitation of this technology. The GS-rich linker sequence connecting the degron and the toxin might also impact proteolysis and thus the outcome.

    4. Reviewer #3 (Public Review):

      Summary:

      This manuscript by Beardslee and Schmitz reports discoveries made from a genetic screen to identify C-terminal degrons that cause the efficient depletion of a potent toxin, which allowed for a deep assessment of amino acid patterns that promote protein turnover.

      Strengths:

      The key findings are that SsrA-like C termini are a dominant class of efficient degrons and that ClpP (X/A) mediates the turnover in most cases. Moreover, the data provides insight into the importance of residues situated farther into the degron and reveals aspects of the ClpX engagement and processivity process. The manuscript is clearly written and there is ample supporting data for the conclusions drawn. The figures are also informative.

      Weaknesses:

      There are only a few minor suggestions on data interpretation.

      (1) Page 6: It is stated that "We plated cells on media containing 0 - 1% arabinose inducer, and observed that stronger induction of untagged VapC indeed correlates with smaller colony size; ... We conclude that VapC levels have a titratable effect on growth rate."

      In E. coli with intact arabinose import/response systems, sub-saturating levels of arabinose do not generally lower the induction level of the PBAD promoter in each cell; rather, a sub-population of cells becomes induced [PMID: 9223333]. The bulk observation is a reduced expression level, and, in this case, slowed growth, but it seems more likely that the slow growth observed is from the induced cohort dying off as the cultures and colonies develop.

      (2) Page 8: "At 6-hours post-induction,..."

      Because these experiments were enrichments from initial pools of clones, the number of cell divisions is more informative than the hours of outgrowth or culture densities at harvest. It would be helpful if the authors could indicate, or at least estimate, the number of cell divisions. This could then be included in the results or methods section.
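
One rough way to make such an estimate is from the fold change in culture density, since each division doubles the population. The sketch below assumes that optical density is proportional to cell number, that cell death is negligible, and that the culture was not diluted during outgrowth; the function name and the density values are hypothetical and are not taken from the manuscript.

```python
import math


def estimate_generations(initial_density, final_density):
    """Estimate population doublings as log2 of the fold change in density.

    Assumes density is proportional to viable cell number and that no
    dilution or substantial cell death occurred during outgrowth.
    """
    if initial_density <= 0 or final_density <= 0:
        raise ValueError("densities must be positive")
    return math.log2(final_density / initial_density)


# Hypothetical values: a culture growing from OD600 0.01 to OD600 0.64.
print(round(estimate_generations(0.01, 0.64), 2))
```

Because toxin-expressing cells may die or arrest during the selection, an estimate of this kind is best read as an average over the surviving population rather than a per-clone division count.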

      (3) Page 12: "It is possible that these sequences compromise VapC folding and solubility, or mimic inhibitory interactions made by hydrophobic segments of the VapB antitoxin that block VapC activity (43, 59)."

      Later in the manuscript, Lon is presented as a minor player in the overall story, but Lon prefers hydrophobic degrons. Could that hydrophobic class be Lon substrates? (Possibly presented as an additional mechanism here or in the discussion of this class of tags.)

      (4) Page 13: "Arg in the 2nd position was also associated with proteolysis, yet Arg is virtually absent from proteobacterial ssrA sequences."

      The nucleic acid changes required for evolutionary drift from the predominant amino acid codons at this position in proteobacteria to Arg may require moving through several codons that notably impair the performance of the degron. Such a constraint may also be responsible, in part, for the observed conservation.